Hello there! This is the third in a series of blogposts detailing the work I will be doing for the Tor Project as part of Google Summer of Code 2023.
Here I will detail the work I did since the last entry, the challenges I faced, and the outcomes of that work. This will be a slightly longer update, since I was busier than usual over the past two weeks and wasn't able to post a new entry here.
The project I was selected for is titled “Arti API exploration to build example tools”.
Confused? Let’s break it down.
Arti is the Rust rewrite of the Tor software; it lets you make TCP connections through the Tor network, run Tor relays and bridges, and more. You can read more about it in its repository.
Arti was built with the goal that other developers should be able to use it to build Tor-powered software, and that thinking has been incorporated into its design.
So, it exports some crates that other developers can use in their Rust projects, and has some documentation on them, including some basic demonstration code snippets that you can follow along with at home.
However, Arti is a fairly bleeding-edge project. It hit version 1.0.0 not too long ago, and due to the breakneck speed of development, its APIs are not set in stone. There is a lot of potential breakage that another developer could run into.
In this project, I will be creating sample programs in Rust using Arti's APIs.
My goal will be to build my sample programs and document any difficulties that come up.
In this way, the project can get valuable feedback from an outsider who doesn’t have much knowledge of the codebase and the way the Arti proxy does things.
After the work done last time, there was little work left on the DNS client. I added basic CLI argument parsing and tested fetching DNS records for:

- google.com: typically returns only one IP address, and is also a very simple domain name
- news.ycombinator.com: a subdomain that also returns one IP address
- cloudflare.com: a simple domain name, but one that returns multiple IP addresses
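The CLI argument parsing itself is nothing fancy. Here's a minimal sketch of the general shape, assuming clap's derive API; the argument names are illustrative rather than copied from the actual crate:

```rust
use clap::Parser;

/// Illustrative CLI for a DNS-over-Tor lookup tool.
#[derive(Parser, Debug)]
struct Args {
    /// Domain name whose A records we want to resolve, e.g. "cloudflare.com".
    domain: String,
}

fn main() {
    let args = Args::parse();
    println!("resolving {} over Tor...", args.domain);
    // ... hand args.domain off to the DNS client code ...
}
```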
While the previous code would work for the first two situations, I found that for cloudflare.com I would only get one IP address printed to the display.
I held off on getting the code reviewed until I could fix this issue, which once again involved making direct DNS-over-TCP connections so I could inspect the traffic in Wireshark, and re-reading RFC 1035 to interpret any mysterious bytes I could not decipher.
The main issue turned out to be the compression scheme that DNS uses for hostnames. In DNS, each IP address that corresponds to the queried domain name is transmitted as its own resource record (RR). Each RR includes the domain name, the record type (A record), the class (Internet), the TTL, the record data length (4 bytes for an IPv4 address), and finally the record data (i.e., the IP address).
Now, the flaw here is that the hostname would have to be packed into the DNS response again and again for multiple IP addresses. To avoid this, subsequent DNS resource records "compress" the hostname into 2 bytes: a pointer, flagged by its top two bits, whose remaining 14 bits are the offset at which the hostname starts. The offset is measured from the start of the DNS response.
For example, for cloudflare.com, there are typically 2 RRs in one response. The first one will have the cloudflare.com domain encoded as described in RFC 1035, but the second one will simply have 2 bytes pointing to where cloudflare.com is located in the first RR, relative to the start of the DNS response.
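To make this concrete, here is a minimal sketch (not the exact code from my DNS client) of how a name field in a response can be recognized as a compression pointer and followed back to the original hostname, per RFC 1035 section 4.1.4:

```rust
/// Given the full DNS message and the position of a name field, return the
/// offset at which the name's labels actually live. A length byte whose top
/// two bits are `11` marks a two-byte compression pointer; the remaining
/// 14 bits are an offset from the start of the message.
fn resolve_name_offset(message: &[u8], pos: usize) -> usize {
    let first = message[pos];
    if first & 0b1100_0000 == 0b1100_0000 {
        // Compression pointer: drop the two flag bits and combine with the
        // next byte to form the 14-bit offset into the message.
        (((first & 0b0011_1111) as usize) << 8) | message[pos + 1] as usize
    } else {
        // Not compressed: the labels start right here.
        pos
    }
}
```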
It took a bit of experimentation before I ended up with working code, and with that the DNS client was ready to be showcased for a review.
The review was conducted by nickm, and a variety of issues were spotted which are going to be worked on in the short term. Some of these issues are also present in other crates I've created for this project, so I'll be sure to fix them there as well.
In the other projects, i.e., the connection-checker and the download-manager, I had less success. It seems Snowflake is just a noisier medium for connecting to the Tor network, and my applications were not able to deal with the increased timeouts. Still, this didn't seem to be as big of an issue when running the Arti proxy itself over a Snowflake connection, so I think there is some issue somewhere that I am missing.
It wasn't all doom and gloom, however. I did get started on arguably the most impactful project in this repo: the obfs4 connection checker (a horrible name, but naming is one of the hardest problems in computer science, and I can't solve them all, unfortunately).
This tool simply takes in a list of all obfs4 bridges (for a quick recap, a bridge is a Tor entry node that is not listed publicly; it uses the obfs4 protocol, which tries its best to obfuscate the TCP connection so censors can't tell whether it's Tor or an innocent TCP connection), tries to create a Channel, which is just a direct connection to a Tor node, and observes whether each bridge is online or not.
Such a tool, intended to be run by the Tor Project, will help them monitor the health of this extremely important subset of Tor nodes. Thanks to Arti's inherent tendency to leverage multiple threads (thanks, fearless concurrency!) and the project's eventual goal of replacing C Tor, it is preferable to build such a tool on top of Arti.
The basic design started out with me copying some code from the connection checker to set up an obfs4 connection, and along the way I learned how to use ChanMgr as well.
ChanMgr is an object that helps manage channels, i.e., connections to Tor relays. It's used internally by TorClient, but for the purposes of this connection checker, I needed direct access to it in order to set up a channel to each bridge I was testing. (Creating a full Tor circuit and testing whether we can connect to a particular host was my first idea, but after some discussion with nickm, it was clear that to simply check whether a bridge is online, it is better to establish a channel to that bridge than to use it in a circuit, both for speed and for network-health reasons.)
Such an API did not exist, but it was trivial for me to add one in (see arti!1275).
I then changed the checker's Cargo.toml to use the Arti crates from GitLab directly, since I didn't want to wait around for the crates.io release to be updated with this new MR. This later snowballed into shifting all of the projects' Arti dependencies from crates.io to GitLab for simplicity.
Now that I could access ChanMgr, allow me to explain how this program initially worked:
1. I had a list of 11 obfs4 bridges I obtained from the Tor Browser Bundle repo. These bridge lines served as test relays to get everything up and running.
2. I would configure TorClient via TorClientBuilder to use the obfs4 transport and pass in the specific bridge to use, iterating through the bridge lines one by one.
3. I would obtain ChanMgr from TorClient (using the API I added) and call ChanMgr::get_or_launch(), which tries to create a Channel to the given bridge. If that works, a Channel object is returned; otherwise, we get an error.
Using this simple property, we can evaluate the status of each bridge.
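To make the control flow concrete, here is a heavily simplified sketch of a first cut of the checker, doing the checks one bridge at a time. check_bridge is a hypothetical stand-in for steps 2 and 3 above, the bridge lines are placeholders, and the actual arti-client calls are deliberately elided:

```rust
/// Hypothetical helper standing in for steps 2 and 3: configure a TorClient
/// for a single obfs4 bridge line and try to open a Channel to it via
/// ChanMgr::get_or_launch(). The actual arti-client / tor-chanmgr calls are
/// elided here; only the success/failure result matters.
async fn check_bridge(bridge_line: &str) -> anyhow::Result<()> {
    // ... build a TorClient with the obfs4 transport and this bridge,
    //     obtain ChanMgr, and call get_or_launch() on the bridge ...
    todo!("real implementation lives in the connection checker")
}

#[tokio::main]
async fn main() {
    let bridge_lines = ["obfs4 192.0.2.1:443 ...", "obfs4 192.0.2.2:443 ..."];

    // Bridges are checked one after another, so a slow or unreachable
    // bridge holds up every check behind it.
    for line in &bridge_lines {
        match check_bridge(line).await {
            Ok(()) => println!("{line}: online"),
            Err(e) => println!("{line}: offline ({e})"),
        }
    }
}
```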
Now, initially this code was all sequential, but in order to speed it up, I had to leverage Tokio tasks, which free up the main thread from blocking on I/O. At first I attempted to turn the entire process from step 2 onwards into a task; however, it turned out that TorClientBuilder did not implement the Send trait, the trait that allows data to be sent between threads safely. The absence of Send meant I was getting "future cannot be sent between threads safely" errors, and thus my code wouldn't compile.
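Switching to Tokio tasks means every check becomes its own task, and the future handed to tokio::spawn must be Send, which is exactly where the TorClientBuilder issue bit me. Here's a rough sketch of the task-per-bridge pattern, reusing the hypothetical check_bridge helper from the earlier sketch:

```rust
// One Tokio task per bridge. tokio::spawn requires the spawned future to be
// Send, because the runtime may move it between worker threads; holding a
// non-Send value (like the old TorClientBuilder) across an .await makes the
// whole future non-Send, and the code refuses to compile.
async fn check_all(bridge_lines: Vec<String>) {
    let mut handles = Vec::new();
    for line in bridge_lines {
        handles.push(tokio::spawn(async move {
            let online = check_bridge(&line).await.is_ok();
            (line, online)
        }));
    }
    // Await each handle in turn; the checks themselves run concurrently.
    for handle in handles {
        let (line, online) = handle.await.expect("task panicked");
        println!("{line}: {}", if online { "online" } else { "offline" });
    }
}
```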
I reported this bug and a fix was also created which includes the addition of a test to prevent this from happening again.
Discussion on ChanMgr also led nickm to request a method in ChanMgr that would let you immediately expire a channel (I opened a ticket on his behalf). The current behavior of ChanMgr around when a Channel is retired was not documented, and when I got my answers, I added this info, which was then elaborated upon.
After all this was done, I had the ability to spawn a task for each test, from creating a TorClient all the way to trying to make the Channel. There are still some antipatterns in the code that I have to fix, however. For now, a new TorClient is created every time, but the docs really want you to avoid this, since it causes problems with the shared state directories Arti uses to store different information related to the Tor network (e.g. directory info, which records each Tor relay so that circuits can be built). I'll be sure to fix this as well.
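For reference, the shape the arti-client docs nudge you toward is building one TorClient up front and handing out clones of it, since clones are cheap and share the same underlying state. A rough sketch of that pattern, assuming arti-client's default Tokio runtime (illustrative only; the per-bridge configuration in the checker makes this less straightforward in practice):

```rust
use arti_client::{TorClient, TorClientConfig};

async fn run_checks() -> anyhow::Result<()> {
    // Build a single client (and thus a single set of state directories)...
    let config = TorClientConfig::default();
    let client = TorClient::create_bootstrapped(config).await?;

    // ...and give each task a cheap clone instead of constructing a
    // brand-new TorClient, with its own state, for every check.
    for _ in 0..3 {
        let client = client.clone();
        tokio::spawn(async move {
            // ... use `client` for one unit of work ...
            drop(client);
        });
    }
    Ok(())
}
```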
The repeated issues stemming from this discouraged usage of the APIs prompted the opening of an issue to discuss how best to solve this.
Overall, I'd say my project is coming along nicely. I've ended up contributing (or causing contributions) to underdocumented parts of the Arti codebase while the devs have been hard at work getting onion services and key management working, and I've managed to break ground on a new example project that (hopefully) can also be used in production one day!