Geolocating Onion Services and Detecting Them on a Given LAN

#TOR #De-anonymization #Cloud #Sniffing

Abstract

This report explores practical methods for de-anonymizing Tor onion services: analyzing traffic within a local area network (LAN) and measuring latency to the onion server from various global locations. Onion services, accessible via .onion addresses, rely on Tor's network of encrypted relays to preserve anonymity. By employing traffic analysis and latency probing, we aim to discover onion services running on a LAN and to infer their physical location on a worldwide scale.

Map

Introduction

The Tor network is a decentralized system designed to provide anonymity for users and services by routing traffic through a series of encrypted relays. It is widely used to protect user privacy, and it enables hosting and accessing hidden services known as onion services. These services use .onion addresses, which mask the IP addresses of both users and service providers, preserving their anonymity.

This report aims to explore how to pinpoint on a map where an onion service is physically hosted, and how to detect whether a local network is hosting an onion service.

What is Tor?

Tor (The Onion Router) is an anonymity network that routes internet traffic through multiple encrypted relays (nodes) worldwide. This multi-hop structure ensures that no single node knows both the origin and destination of the traffic, making it difficult to trace. Tor achieves anonymity by:

  • routing each connection through a circuit of (by default) three relays: an entry guard, a middle relay, and an exit relay;
  • encrypting the traffic in layers, so that each relay can remove only its own layer;
  • ensuring each relay learns only its immediate predecessor and successor, never the full path.
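The layered routing described above can be illustrated with a toy sketch. Here XOR with a hash-derived keystream stands in for the real per-hop cryptography (Tor actually negotiates AES keys with each relay); the relay key names are illustrative:

```python
import hashlib
from itertools import cycle

def xor_layer(data: bytes, key: bytes) -> bytes:
    """Toy symmetric 'layer' (XOR with a hash-derived keystream).
    A stand-in for Tor's real per-hop encryption."""
    stream = hashlib.sha256(key).digest()
    return bytes(b ^ k for b, k in zip(data, cycle(stream)))

relay_keys = [b"guard", b"middle", b"exit"]  # one key negotiated per relay
cell = b"GET /index.html"

# The client wraps the payload once per relay, innermost layer last...
for key in reversed(relay_keys):
    cell = xor_layer(cell, key)

# ...and each relay peels off exactly one layer as the cell travels the circuit.
for key in relay_keys:
    cell = xor_layer(cell, key)

print(cell)  # the exit relay recovers b'GET /index.html'
```

Because each relay holds only its own key, an individual relay can never see both the plaintext and the full path at once.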

What are Onion Services?

Onion services (formerly called hidden services) are web services that operate entirely within the Tor network. They are accessed using .onion addresses, which obscure the server's IP address. Key characteristics of onion services include:

  • traffic that never leaves the Tor network, so no exit relay is involved and the connection is encrypted end to end;
  • self-authenticating .onion addresses derived from the service's public key;
  • reachability without exposing the server's IP address, even behind NAT or a firewall.

How Do Onion Services Work?

The lifecycle of an onion service can be described through the following steps:

  1. Service Creation and Publication:

    • The onion service establishes a Tor circuit and selects several Tor relays to act as introduction points.
    • It generates a .onion address (derived from its public key) and publishes this address along with the introduction points in the hidden service directory within Tor.
  2. Client Request:

    • A Tor client retrieves the .onion address and its introduction points from the hidden service directory.
  3. Rendezvous Point Establishment:

    • The client creates a Tor circuit and chooses a random relay to serve as a rendezvous point. It then sends a connection request through one of the service’s introduction points, including information about the rendezvous point.
  4. Rendezvous Connection:

    • The onion service, upon receiving the request, creates a circuit to the rendezvous point.
    • Both client and service communicate through this rendezvous point without revealing their IP addresses.
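The address derivation mentioned in step 1 can be sketched in code. The following is a minimal sketch of the v3 scheme from Tor's rendezvous specification (address = base32(pubkey || checksum || version)); the all-zero key is a dummy stand-in for a real ed25519 public key:

```python
import base64
import hashlib

def onion_v3_address(pubkey: bytes) -> str:
    """Derive a v3 .onion address from a 32-byte ed25519 public key:
    base32(pubkey || checksum[:2] || version), lowercased."""
    version = b"\x03"
    checksum = hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]
    return base64.b32encode(pubkey + checksum + version).decode().lower() + ".onion"

# Dummy all-zero key for illustration; a real service uses its own keypair.
print(onion_v3_address(bytes(32)))  # 56 base32 characters followed by ".onion"
```

The checksum lets a Tor client reject mistyped addresses, and because the address is the public key, no certificate authority is needed.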

Basic Tor Architecture

Detecting a Tor Onion Service within the Same LAN

Detecting whether a local network is hosting Onion services involves analyzing specific network patterns, identifying Tor-related characteristics, and recognizing known traffic behaviors.
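One such pattern check can be sketched as follows: flag LAN flows whose destination is a known Tor relay. In practice the relay list would be downloaded from the public Tor consensus; the relay addresses and flows below are hypothetical stand-ins:

```python
# Toy relay set -- in practice, populated from the Tor consensus.
known_relays = {"185.220.101.4", "199.58.81.140"}

def looks_like_tor(dst_ip: str, dst_port: int) -> bool:
    """A LAN host speaking TLS (port 443) to a known relay is a Tor candidate."""
    return dst_port == 443 and dst_ip in known_relays

# (source IP, destination IP, destination port) flows observed on the LAN
flows = [
    ("192.168.1.50", "185.220.101.4", 443),   # matches a known relay
    ("192.168.1.50", "142.250.184.14", 443),  # ordinary HTTPS traffic
]
suspects = [f for f in flows if looks_like_tor(f[1], f[2])]
print(suspects)
```

This heuristic cannot distinguish an onion service from an ordinary Tor client, but it narrows down which hosts warrant closer traffic analysis.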

Configuration for Onion Service Detection

The laboratory setup runs on a Raspberry Pi 4B. Ideally, we would detect the activity of the Tor Onion Service from another machine within the same network; however, since the Raspberry Pi is directly connected to the router via an RJ45 Ethernet cable, this is not feasible. As a result, the TShark instance also runs on the Raspberry Pi, albeit in a different Docker container. Both containers use host networking, so they appear on the LAN as the Raspberry Pi itself rather than behind a separate IP or NAT configuration. This arrangement allows the TShark container to monitor the traffic generated by the Onion Service container.

Once the laboratory environment is fully operational, we can initiate our testing procedures. We conducted two distinct experiments to manually analyze the generated traffic. In both tests, we began by launching the TShark instance to capture all traffic on ports 80 and 443, and then executed the action under test. The first test involved starting the Onion Service to monitor its connections to the Introduction Points. The second test consisted of accessing the Onion Service using Tor Browser. The setup of the environment, including files and source code, can be found in Annex 1.
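Once a capture exists, TShark can export it as text for scripted analysis (e.g. `tshark -r capture.pcap -T fields -e ip.dst -e tcp.dstport -E separator=,`). The sketch below counts packets per destination endpoint to spot the dominant peer, which for an onion service is its Entry Guard; the sample export lines are hypothetical:

```python
import collections
import csv
import io

# Hypothetical excerpt of a TShark field export (ip.dst,tcp.dstport).
sample_export = """185.220.101.4,443
185.220.101.4,443
142.250.184.14,443
185.220.101.4,443
"""

# Count packets per destination endpoint; the most contacted 443 endpoint
# is the best candidate for the service's Entry Guard.
counts = collections.Counter(
    tuple(row) for row in csv.reader(io.StringIO(sample_export))
)
for (ip, port), n in counts.most_common():
    print(f"{ip}:{port} -> {n} packets")
```

Replacing `sample_export` with the real export file makes the same counting logic work on the captures from both experiments.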

Lab setup

After capturing and manually analyzing the network traces from both tests, we reached the following conclusions:

  • All observable traffic was exchanged with the Entry Guard's IP address on port 443; the Introduction Points and the Rendezvous Point could not be identified from the captures.
  • When the Onion Service starts, it sends only a small number of packets, followed by the exchanges needed to establish its circuits.

Pinpointing the Physical Location of a Tor Onion Service on a Global Map

The second practical part of this report focuses on determining whether it is possible to approximate the physical location of a Tor Onion Service. For this experiment, we reuse the same laboratory setup as in the previous section, which is located in Zaragoza, Aragón, Spain.

To determine the server's location, we will measure the loading time of a reasonably sized website (approximately 8 MB) from various locations around the globe. In theory, the greater the distance between the server and the requesting host, the longer it will take for all data to reach the querying host. By collecting measurements from a sufficient number of locations, we can estimate the server's location more accurately.
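This estimation idea can be sketched as a weighted centroid: each probe pulls the estimate toward itself with a weight inversely proportional to its load time. The probe coordinates and times below are illustrative stand-ins, and the heuristic deliberately ignores Tor's circuit detours, so it is only a coarse approximation:

```python
def estimate_location(probes):
    """Rough location estimate from (lat, lon, load_time_s) probe tuples:
    a centroid weighted by inverse load time, so faster probes
    (presumed closer to the service) pull the estimate toward them."""
    wsum = lat = lon = 0.0
    for p_lat, p_lon, t in probes:
        w = 1.0 / t          # faster response => larger weight
        wsum += w
        lat += w * p_lat
        lon += w * p_lon
    return lat / wsum, lon / wsum

# Hypothetical numbers loosely modeled on the measurements in this report.
probes = [
    (41.6, -0.9, 16.0),    # Spain Central
    (-23.5, -46.6, 28.0),  # Brazil South
    (-33.9, 151.2, 55.0),  # Australia East
    (37.6, 127.0, 175.0),  # Korea Central
]
lat, lon = estimate_location(probes)
print(f"estimated location: {lat:.1f}, {lon:.1f}")
```

With these inputs the estimate lands far closer to the fastest probe (Spain) than to the slow ones, which is exactly the intuition the experiment relies on.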

Factors Affecting Time Measurements

Measuring the time required to establish a connection and transfer data over the Tor network yields significant insights into the nature of the traffic. Several factors influence these time measurements:

  • Circuit length: a connection to an onion service traverses six relays (three chosen by the client and three by the service), each adding latency.
  • Relay location and bandwidth: the geographical spread and congestion of the chosen relays dominate the round-trip time.
  • Rendezvous point placement: chosen at random by the client, it can force traffic through a distant detour.
  • Network conditions: jitter, packet loss, and the overall load on the Tor network at measurement time.

Measuring the loading times

To perform the test, we booted several Windows 11 VMs around the world using the Microsoft Azure cloud; inside each VM we accessed the web server through Tor Browser. Details regarding the setup of the virtual machines can be found in Annex 2. The following table summarizes the time measurements observed when connecting from various Azure regions:

Az Region        Loading Time    Circuit Node 1   Circuit Node 2   Circuit Node 3
Spain Central    16 sec          Switzerland      Germany          Poland
Brazil South     28 sec          UK               Luxembourg       Germany
Australia East   55 sec          Finland          Germany          UK
US West 2        46 sec          Switzerland      France           UK
Africa South     37 sec          Netherlands      Italy            Germany
Korea Central    2 min 55 sec    Spain            US               France
Israel Central   1 min 18 sec    Netherlands      Germany          Finland
Poland Central   22 sec          Moldova          US               Netherlands
UK South         40 sec          Sweden           Germany          US
US East          46 sec          Luxembourg       Finland          Russia

Analyzing the Time Measurements

The loading times recorded from various Azure regions provide critical insights into the geographical proximity of the Tor circuit nodes and the potential locations of onion services. Additional routing information can be found in Annex 3.

The following key observations emerge from the analysis:

  1. Shortest Loading Times:

    • The Spain Central region demonstrates the shortest loading time, 16 seconds. This indicates that the associated circuit nodes, located in Switzerland, Germany, and Poland, are relatively close to both endpoints. Such proximity suggests that the onion service is likely hosted nearby, probably within Europe. Given the low latency, it is reasonable to infer that the onion service may be hosted within Spain itself.
  2. Longest Loading Times:

    • The Korea Central region registers the longest loading time of 2 minutes and 55 seconds. The distant nodes involved—Spain, the US, and France—highlight significant latency, suggesting that the onion services accessed from this region are likely situated far away, potentially in Europe or North America.
  3. Moderate Loading Times:

    • Brazil South exhibits a loading time of 28 seconds, utilizing nodes in the UK, Luxembourg, and Germany. This pattern indicates that the onion services accessed from Brazil may also be located in Europe, considering the routing paths involved.
    • Both Africa South and Australia East show loading times of 37 seconds and 55 seconds, respectively. These values further support the likelihood that the onion services accessed from these regions are hosted in Europe, though with increased latency due to greater geographical distance.
  4. Regional Node Analysis:

    • Israel Central records a loading time of 1 minute and 18 seconds, while Poland Central has a loading time of 22 seconds. These observations indicate that onion services accessed from these regions likely utilize nodes located in both Europe and adjacent areas, with Poland suggesting closer proximity to central European services.
  5. Consistent Trends:

    • Observations from US West 2 and US East, both reporting loading times of 46 seconds, reveal that similar routing efficiencies are present when connecting through nodes in Europe (Switzerland, France, and Luxembourg). This implies that regional connections are optimized in a comparable manner, regardless of the differing geographical starting points.
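A quick sanity check of these observations is to correlate each region's load time with its great-circle distance from the lab in Zaragoza; a positive correlation supports the proximity hypothesis. The load times come from the table above, while the region coordinates are approximate stand-ins:

```python
import math

# Load times (seconds) from the table above, with rough (lat, lon)
# coordinates for each Azure region (approximate stand-ins).
measurements = {
    "Spain Central":   (40.4,   -3.7,  16),
    "Brazil South":    (-23.5, -46.6,  28),
    "Australia East":  (-33.9, 151.2,  55),
    "US West 2":       (47.2, -119.9,  46),
    "Africa South":    (-26.2,  28.0,  37),
    "Korea Central":   (37.6,  127.0, 175),
    "Israel Central":  (31.8,   34.8,  78),
    "Poland Central":  (52.2,   21.0,  22),
    "UK South":        (51.5,   -0.1,  40),
    "US East":         (37.4,  -79.8,  46),
}
ZARAGOZA = (41.65, -0.88)  # location of the laboratory

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

dists = [haversine_km(ZARAGOZA, (lat, lon)) for lat, lon, _ in measurements.values()]
times = [t for _, _, t in measurements.values()]
print(f"correlation(distance, load time) = {pearson(dists, times):.2f}")
```

The correlation is positive but far from perfect, which matches the observations above: circuit detours (e.g. Poland Central routed through Moldova and the US) add noise on top of raw geographical distance.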

Conclusions

This study demonstrated the effectiveness of analyzing network traffic to detect Tor Onion Services on a LAN and to approximate their physical location. By capturing and examining the traffic, we identified key characteristics, including the use of TLS on TCP port 443. Only the Entry Guard's IP address was observable, preventing identification of the Introduction Points or the Rendezvous Point. We also found that the Onion Service initiates communication with a small number of packets, followed by the exchanges needed to establish the circuit.

Additionally, we measured the loading time of an 8 MB website accessed through the Tor network from various global locations. The results indicate that the Onion Service is likely hosted within Spain, as evidenced by the short loading time of 16 seconds from the Spain Central region. The latency patterns highlight the influence of the geographical proximity of the different relays that compose the circuit.

Annex 1: Onion Service Hosting on Raspberry Pi Setup

This section presents the directory and file structure used to configure Tor on a Raspberry Pi and to use TShark to capture all network traffic.

+------------------------+
| /                      |
| |-- docker-compose.yml |
| |-- hiddenservice      |
| |   `-- Dockerfile     |
| |-- trafficcaps        |
| |-- tshark             |
| |   `-- Dockerfile     |
| `-- webserver          |
|     |-- randomtext.txt |
|     `-- webserver.py   |
+------------------------+

The project structure consists of the following components:

docker-compose.yml

The contents of the docker-compose.yml file, which is essential for defining and managing the services, are as follows:

version: '3.9'

services:
  hiddenservice:
    build:
      context: ./hiddenservice
    container_name: hiddenservice
    network_mode: host
    restart: always
    volumes:
      - ./webserver:/app

  tshark:
    build:
      context: ./tshark
    container_name: tshark
    network_mode: host
    volumes:
      - ./trafficcaps:/trafficcaps
    cap_add:
      - NET_ADMIN       # Grants network administration capabilities
      - NET_RAW         # Grants raw socket access
    privileged: true     # Gives the container elevated privileges (optional)
    restart: always

This configuration defines two main services:

  • hiddenservice: builds from the ./hiddenservice directory and runs the Tor hidden service.
  • tshark: builds from the ./tshark directory and runs TShark for capturing network traffic, with the permissions it needs to operate.

hiddenservice/Dockerfile

The hiddenservice/Dockerfile is crucial for setting up the TOR hidden service. The contents of this Dockerfile are as follows:

# Use the Ubuntu 20.04 base image
FROM ubuntu:20.04

# Set environment variable to prevent interactive prompts during installation
ENV DEBIAN_FRONTEND=noninteractive

# Update package lists and install required packages (curl, git, vim, tor, python3)
RUN apt update && apt install -y curl git vim tor python3 python3-pip

# Append the required lines to the torrc configuration file
RUN echo "HiddenServiceDir /var/lib/tor/hidden_service/" >> /etc/tor/torrc && \
    echo "HiddenServicePort 80 127.0.0.1:50505" >> /etc/tor/torrc

RUN pip3 install flask

# Expose the port that the Python HTTP server will run on (50505)
EXPOSE 50505

# Run both the Python HTTP server and the tor service in the background
CMD cd /app && python3 webserver.py & \
    tor & \
    tail -f /dev/null

tshark/Dockerfile

The tshark/Dockerfile configures TShark within a container. Its contents are as follows:

# Use a lightweight alpine image
FROM alpine:latest

# Install tshark
RUN apk update && \
    apk add --no-cache tshark

# Keep the container idle by running a sleep loop (or tail -f /dev/null)
CMD ["tail", "-f", "/dev/null"]

This Dockerfile uses a lightweight Alpine image, installs TShark, and keeps the container running with an idle command, allowing TShark to operate and capture traffic effectively.

webserver/webserver.py

The following code block presents the webserver/webserver.py script, which sets up a simple web server using Flask:

from flask import Flask

app = Flask(__name__)
with open('randomtext.txt', 'r') as file:
    file_contents = file.read()

@app.route('/')
def home():
    return '''
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>My Simple Web Page</title>
    </head>
    <body>
        <h1>Welcome to My Web Page!</h1>
        <p>It works!</p>
        <p>''' + file_contents + '''
        </p>
    </body>
    </html>
    '''

if __name__ == '__main__':
    app.run(host='localhost', port=50505)

This script utilizes Flask to create a simple web server that serves an HTML page. It reads the contents of randomtext.txt and dynamically displays them on the web page, providing an easy way to test the setup.
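The randomtext.txt payload is not included in the listing; a sketch for generating it is shown below. The ~8 MB size is an assumption based on the figure quoted earlier in the report:

```python
import random
import string

# Generate the ~8 MB filler payload served by webserver.py (randomtext.txt).
# The size matches the figure quoted in the report; adjust as needed.
SIZE = 8 * 1024 * 1024
with open("randomtext.txt", "w") as f:
    f.write("".join(random.choices(string.ascii_letters + " ", k=SIZE)))
```

Since every character is ASCII, the resulting file is exactly SIZE bytes, which keeps the downloaded volume identical across all measurement runs.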

Annex 2: Configuration of the Microsoft Azure Laboratory

This section describes how the laboratory was set up on Microsoft Azure Cloud, starting with the general configuration and followed by images depicting the setup of the various virtual machines (VMs): the types of machines used, their specifications, and their interconnections.

Microsoft Azure is a leading cloud computing platform that provides a wide range of services, including the ability to create and manage virtual machines (VMs). These VMs can be quickly deployed to meet various workloads and requirements.

Before creating any resources we must create a subscription (i.e., where the bills go and how they are paid); at this step it is crucial to set a budget limit (we do not want to go broke overnight because of a mistake in the cloud). Once we have our brand new subscription with a budget limit, we need to create a resource group to hold all the resources of the laboratory (this can be done directly from the creation wizard of the first VM). Finally, we can start creating VMs; to do so, users typically follow these steps:

  1. Select an Image: Choose an operating system image from the Azure Marketplace or use a custom image.
  2. Configure Settings: Specify the VM's size, networking options, storage requirements, and other settings.
  3. Create the VM: Review the configuration and create the VM, which will then be deployed in the Azure environment.

The images below showcase the results of the VMs created, displaying their configurations and how they are set up to interact with each other.

Initial Configuration of Virtual Machines in Microsoft Azure

Overview of Virtual Machine Setup in Microsoft Azure

Interconnection of Virtual Machines within Azure

Accessing the Website from Deployed Virtual Machines

The previous images show the layout of multiple virtual machines deployed in Microsoft Azure. Each virtual machine is configured with specific resources, such as CPU, RAM, and storage, to meet the needs of different applications. The configuration also includes networking setups, allowing the virtual machines to communicate with each other and with other Azure services. This flexibility enables efficient resource management and scalability for various workloads.

Final Configuration of Virtual Machines in Microsoft Azure

Network Diagram of the Deployed Virtual Machines

The total cost of all the tests conducted was €1.80.

Annex 3: Images of Established Tor Circuits

This section presents representative images of the Tor circuits established during the tests. Each image illustrates the configuration of the nodes and the traffic routes within the Tor system. The map shows the locations of the virtual machines in green and the .onion server in pink.

Map indicating virtual machines in green and the .onion server in pink.

Spain Circuit: Brazil Circuit:

UK South Circuit: East US Circuit:

Australia Circuit: US West Circuit:

Africa South Circuit: Korea Central Circuit:

Israel Central Circuit: Poland Central Circuit: