Hello there! This is the third in a series of blogposts detailing the work I will be doing for the Tor Project as part of Google Summer of Code 2023.
Here I will detail the work I did since the last entry, the challenges I faced, and the outcomes of that work. This will be a slightly longer update, since I was busier than usual over the past two weeks and wasn't able to post a new entry here.
The project I was selected for is titled “Arti API exploration to build example tools”.
Confused? Let’s break it down.
Arti is the Rust rewrite of the Tor software; it lets you make TCP connections through the Tor network, run Tor relays and bridges, and more. You can read more about it in its repository.
Arti was built with the goal that other developers should be able to use it to build Tor-powered software, and that thinking has been incorporated into its design.
So, it exports some crates that other developers can use in their Rust projects, and has some documentation on them, including some basic demonstration code snippets that you can follow along with at home.
However, Arti is a fairly bleeding-edge project. It hit version 1.0.0 not too long ago, and due to the breakneck speed of development, its APIs are not set in stone. There is a lot of potential breakage that another developer could run into.
In this project, I will be creating sample programs in Rust using Arti's APIs.
My goal will be to build my sample programs and document any difficulties that come up.
In this way, the project can get valuable feedback from an outsider who doesn’t have much knowledge of the codebase and the way the Arti proxy does things.
After the work done last time, there was little work left on the DNS client. I added basic CLI argument parsing and tested fetching DNS records for:

- google.com: typically returns only one IP address, and is also a very simple domain name
- news.ycombinator.com: a subdomain that also returns one IP address
- cloudflare.com: a simple domain name, but one that returns multiple IP addresses
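The CLI argument parsing itself is nothing fancy. Here's a minimal sketch of the general shape, assuming clap's derive API; the argument names are illustrative rather than copied from the actual crate:

```rust
use clap::Parser;

/// Illustrative CLI for a DNS-over-Tor lookup tool.
#[derive(Parser, Debug)]
struct Args {
    /// Domain name whose A records we want to resolve, e.g. "cloudflare.com".
    domain: String,
}

fn main() {
    let args = Args::parse();
    println!("resolving {} over Tor...", args.domain);
    // ... hand args.domain off to the DNS client code ...
}
```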
While the previous code would work for the first two situations, I found that for cloudflare.com I would only get one IP address printed to the display.
I held off on getting the code reviewed until I could fix this issue, which once again involved making direct DNS-over-TCP connections so I could inspect the traffic in Wireshark, and re-reading RFC 1035 to interpret any mysterious bytes I could not decipher.
The main issue turned out to be the compression scheme that DNS uses for hostnames. In DNS, each IP address that corresponds to the queried domain name is transmitted as its own resource record (RR). Each RR includes the domain name, the record type (A record), the class (Internet), the TTL, the record data length (4 bytes for an IPv4 address), and finally the record data (i.e., the IP address).
Now, the flaw here is that the hostname would have to be packed into the DNS response again and again for multiple IP addresses. To avoid this, subsequent DNS resource records "compress" the hostname into 2 bytes: a pointer, flagged by its top two bits, whose remaining 14 bits are the offset at which the hostname starts. The offset is measured from the start of the DNS response.
For example, for cloudflare.com, there are typically 2 RRs in one response. The first one will have the cloudflare.com domain encoded as described in RFC 1035, but the second one will simply have 2 bytes pointing to where cloudflare.com is located in the first RR, relative to the start of the DNS response.
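To make this concrete, here is a minimal sketch (not the exact code from my DNS client) of how a name field in a response can be recognized as a compression pointer and followed back to the original hostname, per RFC 1035 section 4.1.4:

```rust
/// Given the full DNS message and the position of a name field, return the
/// offset at which the name's labels actually live. A length byte whose top
/// two bits are `11` marks a two-byte compression pointer; the remaining
/// 14 bits are an offset from the start of the message.
fn resolve_name_offset(message: &[u8], pos: usize) -> usize {
    let first = message[pos];
    if first & 0b1100_0000 == 0b1100_0000 {
        // Compression pointer: drop the two flag bits and combine with the
        // next byte to form the 14-bit offset into the message.
        (((first & 0b0011_1111) as usize) << 8) | message[pos + 1] as usize
    } else {
        // Not compressed: the labels start right here.
        pos
    }
}
```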
It took a bit of experimentation before I ended up with working code, and with that the DNS client was ready to be showcased for a review.
The review was conducted by nickm, and a variety of issues were spotted which are going to be worked on in the short term. Some of these issues are also present in other crates I've created for this project, so I'll be sure to fix them there as well.
In the other projects, i.e., the connection-checker and the download-manager, I had less success. It seems Snowflake is just a noisier medium for connecting to the Tor network, and my applications were not able to deal with the increased timeouts. Still, this didn't seem to be as big of an issue when running the Arti proxy itself over a Snowflake connection, so I think there is some issue somewhere that I am missing.
It wasn't all doom and gloom, however. I did get started on arguably the most impactful project in this repo: the obfs4 connection checker (a horrible name, but naming is one of the hardest problems in computer science, and I can't solve them all, unfortunately).
This tool simply takes in a list of all obfs4 bridges (for a quick recap, a bridge is a Tor entry node that is not listed publicly; it uses the obfs4 protocol, which tries its best to obfuscate the TCP connection so censors can't tell whether it's Tor or an innocent TCP connection), tries to create a Channel, which is just a direct connection to a Tor node, and observes whether each bridge is online or not.
Such a tool, intended to be run by the Tor Project, will help them monitor the health of this extremely important subset of Tor nodes. Thanks to Arti's inherent tendency to leverage multiple threads (thanks, fearless concurrency!) and the project's eventual goal of replacing C Tor, it is preferable to build such a tool on top of Arti.
The basic design started out with me copying some code from the connection checker to set up an obfs4 connection, and along the way I learned how to use ChanMgr as well.
ChanMgr is an object that helps manage channels, i.e., connections to Tor relays. It's used internally by TorClient, but for the purposes of this connection checker, I needed direct access to it in order to set up a channel to each bridge I was testing. (Creating a full Tor circuit and testing whether we can connect to a particular host was my first idea, but after some discussion with nickm, it was clear that to simply check whether a bridge is online, it is better to establish a channel to that bridge than to use it in a circuit, both for speed and for network-health reasons.)
Such an API did not exist, but it was trivial for me to add one in (see arti!1275).
I then changed the checker's Cargo.toml to use the Arti crates from GitLab directly, since I didn't want to wait around for the crates.io release to be updated with this new MR. This later snowballed into shifting all of the projects' Arti dependencies from crates.io to GitLab for simplicity.
Now that I could access ChanMgr, allow me to explain how this program initially worked:
1. I had a list of 11 obfs4 bridges I obtained from the Tor Browser Bundle repo. These bridge lines served as test relays to get everything up and running.
2. I would configure TorClient via TorClientBuilder to use the obfs4 transport and pass in the specific bridge to use, iterating through the bridge lines one by one.
3. I would obtain ChanMgr from TorClient (using the API I added) and call ChanMgr::get_or_launch(), which tries to create a Channel to the given bridge. If that works, a Channel object is returned; otherwise, we get an error.
Using this simple property, we can evaluate the status of each bridge.
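To make the control flow concrete, here is a heavily simplified sketch of a first cut of the checker, doing the checks one bridge at a time. check_bridge is a hypothetical stand-in for steps 2 and 3 above, the bridge lines are placeholders, and the actual arti-client calls are deliberately elided:

```rust
/// Hypothetical helper standing in for steps 2 and 3: configure a TorClient
/// for a single obfs4 bridge line and try to open a Channel to it via
/// ChanMgr::get_or_launch(). The actual arti-client / tor-chanmgr calls are
/// elided here; only the success/failure result matters.
async fn check_bridge(bridge_line: &str) -> anyhow::Result<()> {
    // ... build a TorClient with the obfs4 transport and this bridge,
    //     obtain ChanMgr, and call get_or_launch() on the bridge ...
    todo!("real implementation lives in the connection checker")
}

#[tokio::main]
async fn main() {
    let bridge_lines = ["obfs4 192.0.2.1:443 ...", "obfs4 192.0.2.2:443 ..."];

    // Bridges are checked one after another, so a slow or unreachable
    // bridge holds up every check behind it.
    for line in &bridge_lines {
        match check_bridge(line).await {
            Ok(()) => println!("{line}: online"),
            Err(e) => println!("{line}: offline ({e})"),
        }
    }
}
```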
Now, initially this code was all sequential, but in order to speed it up, I had to leverage Tokio tasks, which free up the main thread from blocking on I/O. At first I attempted to turn the entire process from step 2 onwards into a task; however, it turned out that TorClientBuilder did not implement the Send trait, the trait that allows data to be sent between threads safely. The absence of Send meant I was getting "future cannot be sent between threads safely" errors, and thus my code wouldn't compile.
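Switching to Tokio tasks means every check becomes its own task, and the future handed to tokio::spawn must be Send, which is exactly where the TorClientBuilder issue bit me. Here's a rough sketch of the task-per-bridge pattern, reusing the hypothetical check_bridge helper from the earlier sketch:

```rust
// One Tokio task per bridge. tokio::spawn requires the spawned future to be
// Send, because the runtime may move it between worker threads; holding a
// non-Send value (like the old TorClientBuilder) across an .await makes the
// whole future non-Send, and the code refuses to compile.
async fn check_all(bridge_lines: Vec<String>) {
    let mut handles = Vec::new();
    for line in bridge_lines {
        handles.push(tokio::spawn(async move {
            let online = check_bridge(&line).await.is_ok();
            (line, online)
        }));
    }
    // Await each handle in turn; the checks themselves run concurrently.
    for handle in handles {
        let (line, online) = handle.await.expect("task panicked");
        println!("{line}: {}", if online { "online" } else { "offline" });
    }
}
```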
I reported this bug and a fix was also created which includes the addition of a test to prevent this from happening again.
Discussion on ChanMgr also led nickm to request a method in ChanMgr that would let you immediately expire a channel (I opened a ticket on his behalf). The current behavior of ChanMgr around when a Channel is retired was not documented, and when I got my answers, I added this info, which was then elaborated upon.
After all this was done, I had the ability to spawn a task for each test, from creating a TorClient all the way to trying to make the Channel. There are still some antipatterns in the code that I have to fix, however. For now, a new TorClient is created every time, but the docs really want you to avoid this, since it causes problems with the shared state directories Arti uses to store different information related to the Tor network (e.g. directory info, which records each Tor relay so that circuits can be built). I'll be sure to fix this as well.
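For reference, the shape the arti-client docs nudge you toward is building one TorClient up front and handing out clones of it, since clones are cheap and share the same underlying state. A rough sketch of that pattern, assuming arti-client's default Tokio runtime (illustrative only; the per-bridge configuration in the checker makes this less straightforward in practice):

```rust
use arti_client::{TorClient, TorClientConfig};

async fn run_checks() -> anyhow::Result<()> {
    // Build a single client (and thus a single set of state directories)...
    let config = TorClientConfig::default();
    let client = TorClient::create_bootstrapped(config).await?;

    // ...and give each task a cheap clone instead of constructing a
    // brand-new TorClient, with its own state, for every check.
    for _ in 0..3 {
        let client = client.clone();
        tokio::spawn(async move {
            // ... use `client` for one unit of work ...
            drop(client);
        });
    }
    Ok(())
}
```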
The repeated issues stemming from this discouraged usage of the APIs prompted the opening of an issue to discuss how best to solve this.
Overall, I'd say my project is coming along nicely. I've ended up contributing (or causing contributions) to underdocumented parts of the Arti codebase while the devs have been hard at work getting onion services and key management working, and I've managed to break ground on a new example project that (hopefully) can also be used in production one day!