Nonchalant Guidance

Added on: Friday, 25 August, 2023 | Updated on: Monday, 21 October, 2024

GSoC 2023: Final Post

Hello there! This is the last in a series of blog posts detailing all the work I have been doing for the Tor Project as part of Google Summer of Code 2023.

This will serve as a complete summary of the work I did, the things I learned along the way, and the work that remains.

Brief Intro on My Project

The project I was selected for is titled “Arti API exploration to build example tools”.

Confused? Let’s break it down.

Arti is the Rust rewrite of the Tor software, which allows you to make TCP connections through the Tor network, run Tor relays and bridges, etc. You can read more about it in its repository.
Arti was built with the goal that other developers should be able to use it to build Tor-powered software, and that thinking is incorporated into its design.

So, it exports some crates that other developers can use in their Rust projects, and has some documentation on them, including some basic demonstration code snippets that you can follow along with at home.

However, Arti is a fairly bleeding-edge project. It hit version 1.0.0 not too long ago, and due to the breakneck speed of development, APIs are not set in stone. Another developer could potentially run into a lot of breakage.
Thus, in this project, I created sample programs in Rust using Arti’s APIs.

My goal was to build these sample programs and document any difficulties that came up.

Maybe certain APIs are hard to use, or undocumented, or certain operations cause Arti to fail (exposing a bug). All these issues were brought to the notice of the Arti team and fixes were discussed and implemented.

In this way, the project can get valuable feedback from an outsider who doesn’t have much knowledge of the codebase and the way the Arti proxy does things.

The Sample Projects

The sample projects I ended up creating were:

A download manager that uses Arti to open six simultaneous connections to the Tor Project’s website, downloads a hardcoded version of the Linux build of the Tor Browser Bundle, and verifies its SHA256 checksum.

This could also theoretically be distributed to people living in countries where the normal distribution methods for the Tor Browser may be banned or blocked by authorities.

A DNS resolver that uses a bespoke DNS over TCP implementation to connect to Cloudflare’s DNS server over the Tor network. This project mainly illustrates the usage of a custom TCP-based protocol over Tor, in the hopes that people wishing to tunnel arbitrary streams over Tor can see how much (or rather, how little) work would go into making this possible.
A connection checker, which is a small tool that tries to connect to Tor through various means, including pluggable transports. It was the first of these projects to use the pluggable transport portion of the Arti crates.
An obfs4 connection tool, which was not only a sample project, but also a vital new addition to the Tor Project’s arsenal for monitoring the health of the Tor network. This tool sets up a RESTful API which can continuously check on the health of various obfs4 bridges fed to it, and report these statuses. A client program can feed it bridges to monitor and poll an updates endpoint for getting the updated statuses of these bridges.

This project required exposing some APIs or attributes from Arti in order to create channels (which are direct, encrypted connections that only carry Tor protocol messages) to each bridge we need to test.

A small program called pt-proxy, which creates a SOCKS5 proxy server and client tunnelled over the obfs4 protocol in order to avoid censorship of the connection.

This project necessitated adding basic APIs to set up an obfs4 subprocess in server mode. I had to look at pt-spec.txt, which defines the spec for how Tor/Arti and pluggable transports interact with each other, and then modify the tor-ptmgr crate to perform the required operations. (more on this later)
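To give a feel for how little protocol work the DNS resolver project involves, here is a minimal sketch of the message framing that DNS over TCP uses: an ordinary DNS query prefixed with a two-byte, big-endian length. It is an illustration only, not the code from the sample project. The function name `build_tcp_dns_query` and the example hostname are assumptions made for this sketch, and only the standard library is used. In the real project the resulting bytes would be written to a stream opened through Arti rather than printed.

```rust
/// Build a DNS query for an A record, framed for DNS over TCP.
/// (Hypothetical helper for illustration; not taken from the sample project.)
fn build_tcp_dns_query(id: u16, hostname: &str) -> Result<Vec<u8>, String> {
    let mut msg = Vec::new();

    // 12-byte header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT.
    msg.extend_from_slice(&id.to_be_bytes());
    msg.extend_from_slice(&0x0100u16.to_be_bytes()); // flags: recursion desired
    msg.extend_from_slice(&1u16.to_be_bytes()); // one question
    msg.extend_from_slice(&[0u8; 6]); // no answer, authority or additional records

    // QNAME: each label is prefixed with its length, ending with a zero byte.
    for label in hostname.trim_end_matches('.').split('.') {
        if label.is_empty() || label.len() > 63 {
            return Err(format!("invalid label in {hostname:?}"));
        }
        msg.push(label.len() as u8);
        msg.extend_from_slice(label.as_bytes());
    }
    msg.push(0);

    msg.extend_from_slice(&1u16.to_be_bytes()); // QTYPE: A
    msg.extend_from_slice(&1u16.to_be_bytes()); // QCLASS: IN

    // Over TCP, the message is preceded by its length as a big-endian u16.
    let len = u16::try_from(msg.len()).map_err(|_| "query too long".to_string())?;
    let mut framed = Vec::with_capacity(msg.len() + 2);
    framed.extend_from_slice(&len.to_be_bytes());
    framed.extend_from_slice(&msg);
    Ok(framed)
}

fn main() {
    let framed = build_tcp_dns_query(0x1234, "torproject.org").unwrap();
    println!("{} bytes: {:02x?}", framed.len(), framed);
}
```

The response comes back with the same two-byte length prefix, so a client reads two bytes first and then exactly that many bytes of DNS message.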

Note that these projects have been reviewed multiple times by gabi and nickm (aka Nick, who was my mentor this year). Many capabilities, docs, code restructurings etc are a result of their suggestions. Thank you very much!

Bugs Found

Now, part of the reason these sample projects were created was to see what an external developer like me, with no prior experience or knowledge of Arti or its internals, would be able to accomplish using these APIs, and what hurdles they would run into along the way.

Some of those hurdles are documented below in the form of bug reports I filed:

SQLite error on making repeated requests: this was the first bug I filed, and I found it on the same day I started experimenting with Arti’s APIs! (This was before the GSoC period formally began, but I think it still counts). Essentially, you are not supposed to create two TorClients in Arti, because Arti has a shared state directory that it can’t manage very well under that configuration. This bug arises as a side effect of that, and while it isn’t best practice, it was still a valid bug in the code nonetheless. nickm would later fix this issue.
Add example code to use Snowflake based bridges in arti-client: this was filed because I myself did not find any good guidance through the existing docs on how I can launch a connection via the Snowflake pluggable transport. trinity would end up fixing this disparity.
Mention Pluggable Transports are not a part of Arti: this may sound very trivial, but the devil’s in the details. Someone who may wish to use Arti may not necessarily be aware that pluggable transports aren’t a part of Arti (or even the original C Tor codebase), but are rather maintained separately and are launched as subprocesses by the parent Arti/Tor process. trinity ended up fixing this in the same MR as the one linked for the previous bug.
TorClientBuilder doesn’t implement Send: yet another concurrency bug, this one was fairly straightforward: the TorClientBuilder needed to be marked as Send (it is a marker trait saying “whichever struct is marked Send can be sent across threads safely”) and not !Send (which says “don’t implement Send for this struct”). The fix involved just marking TorClientBuilder as Send + Sync and adding a test which fails if this is violated.
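The kind of test described in the last item can be sketched with a small compile-time check. This is an illustration using only the standard library: `ExampleBuilder` is a hypothetical stand-in, not Arti’s actual `TorClientBuilder`, and `assert_send_sync` is a helper name I made up for this sketch. In Rust, `Send` and `Sync` are derived automatically from a type’s fields, so a check like this mostly guards against someone later adding a field that silently removes them.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for a builder type; NOT Arti's TorClientBuilder.
struct ExampleBuilder {
    state_dir: Option<PathBuf>,
}

/// Compiles only if `T` is both `Send` and `Sync`.
fn assert_send_sync<T: Send + Sync>() {}

fn main() {
    // If a non-Send field (such as an `Rc`) were added to ExampleBuilder,
    // this line would stop compiling, which is what makes it a useful test.
    assert_send_sync::<ExampleBuilder>();

    // Because the type is Send, a value can be moved into another thread.
    let builder = ExampleBuilder {
        state_dir: Some(PathBuf::from("/tmp/example-state")),
    };
    let handle = std::thread::spawn(move || builder.state_dir);
    let dir = handle.join().unwrap();
    println!("{:?}", dir);
}
```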

There are some open issues I filed as well:

Merge Requests Created

I also ended up creating and getting many Merge Requests merged into Arti. These gradually transitioned from trivial to non-trivial. Some of these were code or doc fixes for bugs I filed myself, while others were feature requests for exposing a capability that a sample project would require.

These include:

Use humantime in tor-checkable and tor-guardmgr: my first MR to Arti involved just adjusting times used for testing to be human-readable in the RFC3339 format. The times used in this MR were all historic moments when people gained freedom. I found this quite touching.
Remove unnecessary warning from arti-hyper/README: another beginner level MR to Arti, this just involved removing a warning from arti-hyper (a crate from Arti) that referenced a TLS issue on macOS that was later worked around.
Change log levels of messages from INFO to others: this MR involved changing the log level of three log messages to debug and adding two warnings. This was done to avoid showing unnecessarily scary messages to a normal end user who runs Arti at the default info log level.
Expand arti-client docs to include error reporting section: during development, I was made aware that Arti had an ErrorReport trait implemented by all the Arti errors which allowed you to get a pretty-printed error message by calling the report() method. It would also generate error messages that would be easier to copy-paste into bug reports and to read, so I created an MR to inform any user reading arti-client docs about utilizing this functionality.
Add error if [[bridges.transports]] isn’t written in config file: while learning more about how PTs are configured in Arti, I noticed a bug which allowed invalid configuration files to be used. The bug is triggered when one specifies a bridge in the [bridges] section of arti.toml but comments out [[bridges.transports]] entirely. Arti would then try to connect to Tor but never succeed, and would output cryptic panic errors instead.

The MR, after being discussed and reviewed thoroughly by Diziet, would end up being merged, adding the fix as well as test cases which will help catch this bug if there is a regression in the future. (Thank you Diziet for all the suggestions!)

Remove message ‘For now, only direct channels are supported’ in tor-chanmgr docs: as I was exploring the tor-chanmgr crate, I found that the docs were not consistent with the progress that had been made in Arti. The docs erroneously said that only direct channels (i.e., channels to normal relays in the Tor network) were supported, when in fact the crate had since gained support for creating channels to bridges using any supported pluggable transport. I fixed this inconsistency.
Create chanmgr() method in TorClient: This small MR just exposes the ChanMgr (it helps create channels) from TorClient (which helps us avoid writing the instantiation code ourselves). It was done in order for obfs4 checker to work, since obfs4 checker just builds channels to bridges rather than building an entire circuit.
Add Channel expiry info in ChanMgr docs: after some discussion on IRC about how channels expire, I transcribed this info as best I could in the docs. nickm also created a similar MR based on this MR with more info.
Add initial support for running a PT in server mode: this MR adds low-level support for spawning a pluggable transport process with the required environment variables to operate in server mode, ie, it will listen for incoming connections made using the pluggable transport protocol, decode them and pass them onto a specified destination, where they can be processed accordingly.

This work enables the pt-proxy sample project to work. It is the largest MR I’ve made to Arti to date, and involved the most work to write.
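As a rough illustration of what “the required environment variables” means here, the sketch below builds the environment that a managed pluggable transport expects in server mode, using the variable names from pt-spec.txt as I understand them. It is not the tor-ptmgr API: the function `server_pt_env`, the `obfs4proxy` binary name, the addresses, and the state directory are all assumptions made for this example, and the process is deliberately not spawned.

```rust
use std::collections::BTreeMap;
use std::net::SocketAddr;
use std::path::Path;
use std::process::Command;

/// Build the environment for a pluggable transport running in server mode.
/// (Hypothetical helper for illustration; not part of tor-ptmgr.)
fn server_pt_env(
    transport: &str,
    bind: SocketAddr,
    forward_to: SocketAddr,
    state_dir: &Path,
) -> BTreeMap<&'static str, String> {
    let mut env = BTreeMap::new();
    // Version of the managed-transport protocol we speak.
    env.insert("TOR_PT_MANAGED_TRANSPORT_VER", "1".to_string());
    // Directory where the transport may keep its own state.
    env.insert("TOR_PT_STATE_LOCATION", state_dir.display().to_string());
    // Ask the transport to exit when our end of its stdin is closed.
    env.insert("TOR_PT_EXIT_ON_STDIN_CLOSE", "1".to_string());
    // Which transport to run as a server, and where it should listen.
    env.insert("TOR_PT_SERVER_TRANSPORTS", transport.to_string());
    env.insert("TOR_PT_SERVER_BINDADDR", format!("{transport}-{bind}"));
    // Where decoded traffic should be passed on to.
    env.insert("TOR_PT_ORPORT", forward_to.to_string());
    env
}

fn main() {
    let env = server_pt_env(
        "obfs4",
        "127.0.0.1:4200".parse().unwrap(),
        "127.0.0.1:9000".parse().unwrap(),
        Path::new("/tmp/pt-state"),
    );

    // Assumed binary name; calling `cmd.spawn()` would actually launch it.
    let mut cmd = Command::new("obfs4proxy");
    cmd.env_clear().envs(&env);

    for (key, value) in &env {
        println!("{key}={value}");
    }
}
```

Once launched, the transport reports what it is listening on by writing lines to its standard output, which the parent process has to read and parse.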

The following two MRs were merged but their work was later undone as it was deemed unnecessary:

Work Left

All that is currently left is to get final reviews from the Arti developers and open an MR to merge my work, which currently lives in a dedicated repo, into the main Arti repo.

After some discussion with the developers, it was decided that these example projects will exist in a top-level examples/ directory.

My Experience

I had an overwhelmingly positive experience in GSoC this year. Here’s a brief, non-exhaustive list why:

I got to learn about technical details from the Arti developers, everything from writing good docs, naming things properly, to code architecture. Code reviews were easily some of the best parts of GSoC for this reason.

Since I only picked up Rust recently, I’ve ended up learning a lot of advanced Rust-isms from working on this project as well.

I learned a lot about communicating across timezones, cultures etc. and using less ambiguous language in discussions to get my point across in a clearer manner. People tend to neglect the human aspect of open source development, but in my opinion, it is almost impossible for a good developer to not have good communication skills, especially proficiency in written English.
I was given a ton of autonomy when it came to accomplishing my goals, and whenever I needed help I would get really good advice. This helped me develop my own research and self-learning skills further, while also being able to learn from my mentors in case a problem was too specific to this project to reason about alone.

This didn’t mean I only popped up when things went wrong, though; I always made sure to update Nick about my plans for the week every Monday, which gave him the chance to suggest changes if my approach was flawed and to keep track of my progress.

It felt very good to see pieces of code I wrote merged into a big project, where they will end up doing some good in the world, especially for Arti, where this code will eventually be used in situations to guard freedom and the free flow of information, much to the chagrin of repressive regimes and iron-fisted ISPs alike.

Conclusion

I am extremely thankful for this opportunity, and I hope that my work has been helpful to Arti and the Tor Project. I’ve ended up learning a lot, and it is in no small part due to the help I’ve gotten from all the developers I ended up interacting with (special mentions to nickm, Diziet, gabi, trinity, arma, shelikhoo, eta, ahf, and meskio).

I’ve gained a ton of confidence in developing and contributing to open source projects of varying scales, and I hope to continue the good work started and outlined here in the future as well.

This website was made using Markdown, Pandoc, and a custom program to automatically add headers and footers (including this one) to any document that’s published here.