Hello there! This is the last in a series of blog posts detailing all the work I have been doing for the Tor Project as part of Google Summer of Code 2023.
This will serve as a complete summary of the work I did, the things I learned along the way, and the work that remains.
The project I was selected for is titled “Arti API exploration to build example tools”.
Confused? Let’s break it down.
Arti is the Rust rewrite of the Tor software, which lets you make TCP connections through the Tor network, run Tor relays and bridges, and so on. You can read more about it in its repository.
Arti was built with the goal that other developers should be able to use it to build Tor-powered software, and that thinking is incorporated into its design.
So, it exports some crates that other developers can use in their Rust projects, and has some documentation on them, including some basic demonstration code snippets that you can follow along with at home.
However, Arti is a fairly bleeding-edge project. It hit version 1.0.0 not too long ago, and due to the breakneck speed of development, APIs are not set in stone. Another developer could run into a lot of breakage.
Thus, in this project, I created sample programs in Rust using Arti’s APIs.
My goal was to build these sample programs and document any difficulties that came up.
In this way, the project can get valuable feedback from an outsider who doesn’t have much knowledge of the codebase and the way the Arti proxy does things.
The sample projects I ended up creating were:
This could also theoretically be distributed to people living in countries where the normal distribution methods for the Tor Browser may be banned or blocked by authorities.
A DNS resolver that uses a bespoke DNS over TCP implementation to connect to Cloudflare’s DNS server over the Tor network. This project mainly illustrates the usage of a custom TCP-based protocol over Tor, in the hopes that people wishing to tunnel arbitrary streams over Tor can see how much (or rather, how little) work would go into making this possible.
A connection checker, which is just a small tool that tries to connect to Tor through various means, including pluggable transports. It was the first project to use the pluggable transport portion of the Arti crates.
An obfs4 connection tool, which was not only a sample project, but also a vital new addition to the Tor Project’s arsenal for monitoring the health of the Tor network. This tool sets up a RESTful API which can continuously check on the health of various obfs4 bridges fed to it, and report these statuses. A client program can feed it bridges to monitor and poll an updates endpoint for getting the updated statuses of these bridges.
This project required exposing some APIs or attributes from Arti in order to create channels (which are direct, encrypted connections that only carry Tor protocol messages) to each bridge we need to test.
This project necessitated adding basic APIs to set up an obfs4 subprocess in server mode. I had to look at pt-spec.txt, which defines the spec for how Tor/Arti and pluggable transports interact with each other, and then modify the tor-ptmgr crate to perform the required operations. (more on this later)
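To give a feel for how little a custom TCP-based protocol like the DNS resolver’s needs, here is a minimal sketch of DNS-over-TCP framing: a query is an ordinary DNS message prefixed with a two-byte, big-endian length. This is not the code from the sample project. The function names (`encode_name`, `build_a_query`, `frame_for_tcp`) are made up for illustration, and the sketch uses only the standard library.

```rust
/// Encodes a hostname as DNS labels: each label is prefixed with its length,
/// and the name ends with a zero byte. Returns None for empty or oversized labels.
fn encode_name(name: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for label in name.trim_end_matches('.').split('.') {
        if label.is_empty() || label.len() > 63 {
            return None;
        }
        out.push(label.len() as u8);
        out.extend_from_slice(label.as_bytes());
    }
    out.push(0);
    Some(out)
}

/// Builds a minimal DNS query asking for the A record of `name`.
fn build_a_query(id: u16, name: &str) -> Option<Vec<u8>> {
    let mut msg = Vec::new();
    msg.extend_from_slice(&id.to_be_bytes()); // transaction ID
    msg.extend_from_slice(&0x0100u16.to_be_bytes()); // flags: recursion desired
    msg.extend_from_slice(&1u16.to_be_bytes()); // QDCOUNT = 1
    msg.extend_from_slice(&[0u8; 6]); // ANCOUNT, NSCOUNT, ARCOUNT = 0
    msg.extend_from_slice(&encode_name(name)?);
    msg.extend_from_slice(&1u16.to_be_bytes()); // QTYPE = A
    msg.extend_from_slice(&1u16.to_be_bytes()); // QCLASS = IN
    Some(msg)
}

/// Prefixes a DNS message with its length, as DNS over TCP requires.
fn frame_for_tcp(msg: &[u8]) -> Option<Vec<u8>> {
    if msg.len() > u16::MAX as usize {
        return None; // the length field is only 16 bits wide
    }
    let mut framed = Vec::with_capacity(msg.len() + 2);
    framed.extend_from_slice(&(msg.len() as u16).to_be_bytes());
    framed.extend_from_slice(msg);
    Some(framed)
}

fn main() {
    let query = build_a_query(0x1234, "example.com").expect("valid hostname");
    let framed = frame_for_tcp(&query).expect("message fits in 16 bits");
    println!("{} byte message, {} bytes on the wire", query.len(), framed.len());
}
```

In a real tool the framed bytes would be written to a stream opened through Tor rather than printed, and the reply would be read back the same way: two length bytes first, then that many bytes of DNS message.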
Note that these projects have been reviewed multiple times by gabi and nickm (aka Nick, who was my mentor this year). Many capabilities, docs, code restructurings etc are a result of their suggestions. Thank you very much!
Now, part of the reason these sample projects were created was to see what an external developer like me, with no prior experience or knowledge of Arti or its internals, would be able to accomplish using these APIs, and what hurdles they would run into along the way.
Some of those hurdles are documented below in the form of bug reports I filed:
SQLite error on making repeated requests: this was the first bug I filed, and I found it on the same day I started experimenting with Arti’s APIs! (This was before the GSoC period formally began, but I think it still counts). Essentially, you are not supposed to create two TorClients in Arti, because Arti has a shared state directory that it can’t manage very well under that configuration. This bug arises as a side effect of that, and while it isn’t best practice, it was still a valid bug in the code nonetheless. nickm would later fix this issue.
Add example code to use Snowflake based bridges in arti-client: this was filed because I myself did not find any good guidance through the existing docs on how I can launch a connection via the Snowflake pluggable transport. trinity would end up fixing this disparity.
Mention Pluggable Transports are not a part of Arti: this may sound very trivial, but the devil’s in the details. Someone who may wish to use Arti may not necessarily be aware that pluggable transports aren’t a part of Arti (or even the original C Tor codebase), but are rather maintained separately and are launched as subprocesses by the parent Arti/Tor process. trinity ended up fixing this in the same MR as the one linked for the previous bug.
TorClientBuilder doesn’t implement Send: yet another concurrency bug, this one was fairly straightforward: the TorClientBuilder needed to be marked as Send (it is a marker trait saying “whichever struct is marked Send can be sent across threads safely”) and not !Send (which says “don’t implement Send for this struct”). The fix involved just marking TorClientBuilder as Send + Sync and adding a test which fails if this is violated.
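A test of the kind described for the last bug can be very small. The sketch below shows one common way to write it: a generic function with `Send + Sync` bounds that only compiles when the type satisfies them. `ExampleBuilder` is a hypothetical stand-in, not Arti’s real `TorClientBuilder`, and this is not necessarily how the test in Arti is written.

```rust
use std::thread;

/// Hypothetical stand-in for a builder type (not Arti's real `TorClientBuilder`).
struct ExampleBuilder {
    state_dir: Option<String>,
}

/// Compiles only if `T` is both `Send` and `Sync`; does nothing at runtime.
fn assert_send_sync<T: Send + Sync>() {}

/// Moves a builder into another thread and back, which requires `Send`.
fn round_trip_through_thread(builder: ExampleBuilder) -> ExampleBuilder {
    thread::spawn(move || builder)
        .join()
        .expect("worker thread panicked")
}

fn main() {
    // If `ExampleBuilder` ever gains a field that is not `Send` (an `Rc`, say),
    // this line stops compiling, so the regression is caught at build time.
    assert_send_sync::<ExampleBuilder>();

    let builder = ExampleBuilder {
        state_dir: Some("/tmp/example-state".to_string()),
    };
    let returned = round_trip_through_thread(builder);
    println!("{:?}", returned.state_dir);
}
```

The nice property of this pattern is that the check costs nothing at runtime: the failure mode is a compiler error, not a failing assertion.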
There are some open issues I filed as well:
I also ended up creating and getting many Merge Requests merged into Arti. These gradually transitioned from trivial to non-trivial. Some of these were code or doc fixes for bugs I filed myself, while others were feature requests for exposing a capability that a sample project would require.
These include:
Use humantime in tor-checkable and tor-guardmgr: my first MR to Arti involved just adjusting times used for testing to be human-readable in the RFC3339 format. The times used in this MR were all historic moments when people gained freedom. I found this quite touching.
Remove unnecessary warning from arti-hyper/README: another beginner level MR to Arti, this just involved removing a warning from arti-hyper (a crate from Arti) that referenced a TLS issue on macOS that was later worked around.
Change log levels of messages from INFO to others: this MR involved changing the log level of three log messages to debug and adding two warnings. This was done to avoid showing unnecessarily scary messages to a normal end user who runs Arti at the default info log level.
Expand arti-client docs to include error reporting section: during development, I was made aware that Arti had an ErrorReport trait implemented by all the Arti errors which allowed you to get a pretty-printed error message by calling the report() method. It would also generate error messages that would be easier to copy-paste into bug reports and to read, so I created an MR to inform any user reading arti-client docs about utilizing this functionality.
Add error if [[bridges.transports]] isn’t written in config file: while learning more about how PTs are configured in Arti, I noticed a bug which allowed invalid configuration files to be used. The bug is triggered when one specifies a bridge in the [bridges] section of arti.toml but comments out [[bridges.transports]] entirely. Arti would end up trying to connect to Tor but never succeed, and would output cryptic panic errors instead.
The MR, after being discussed and reviewed thoroughly by Diziet, would end up being merged, adding the fix as well as test cases which will help catch this bug if there is a regression in the future. (Thank you Diziet for all the suggestions!)
Remove message ‘For now, only direct channels are supported’ in tor-chanmgr docs: As I was exploring the tor-chanmgr crate, I found that the docs were not consistent with the progress that had been made in Arti. The docs erroneously said that only direct channels (i.e., channels to normal relays in the Tor network) were supported, when in fact the crate had since gained support for creating channels to bridges using any supported pluggable transport. I fixed this inconsistency.
Create chanmgr() method in TorClient: This small MR just exposes the ChanMgr (it helps create channels) from TorClient (which helps us avoid writing the instantiation code ourselves). It was done in order for obfs4 checker to work, since obfs4 checker just builds channels to bridges rather than building an entire circuit.
Add Channel expiry info in ChanMgr docs: after some discussion on how channels expire on IRC, I transcribed this info as best I could in the docs. nickm also created a similar MR based on this MR with more info.
Add initial support for running a PT in server mode: this MR adds low-level support for spawning a pluggable transport process with the required environment variables to operate in server mode, ie, it will listen for incoming connections made using the pluggable transport protocol, decode them and pass them onto a specified destination, where they can be processed accordingly.
This work enables the pt-proxy sample project to work. It is the largest MR I’ve made to Arti to date, and involved the most work to write.
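For readers curious what “the required environment variables” look like, here is a rough sketch based on pt-spec.txt, which has the parent process configure a managed server-side transport through `TOR_PT_*` variables. This is not the tor-ptmgr code: `ServerPtParams`, `server_env` and `server_command` are hypothetical names, the paths and addresses are placeholders, and only a subset of the variables in the spec is shown. The command is built but deliberately not spawned.

```rust
use std::process::Command;

/// Hypothetical parameters for one server-side transport (not tor-ptmgr's types).
struct ServerPtParams<'a> {
    transport: &'a str, // transport name, e.g. "obfs4"
    bind_addr: &'a str, // where the PT listens for obfuscated traffic
    or_port: &'a str,   // where decoded traffic is forwarded
    state_dir: &'a str, // directory the PT may use for persistent state
}

/// Builds the environment that pt-spec.txt describes for a managed server PT.
fn server_env(p: &ServerPtParams<'_>) -> Vec<(&'static str, String)> {
    vec![
        ("TOR_PT_MANAGED_TRANSPORT_VER", "1".to_string()),
        ("TOR_PT_STATE_LOCATION", p.state_dir.to_string()),
        ("TOR_PT_EXIT_ON_STDIN_CLOSE", "1".to_string()),
        ("TOR_PT_SERVER_TRANSPORTS", p.transport.to_string()),
        // The bind address is written as "<transport>-<address>:<port>".
        ("TOR_PT_SERVER_BINDADDR", format!("{}-{}", p.transport, p.bind_addr)),
        ("TOR_PT_ORPORT", p.or_port.to_string()),
    ]
}

/// Prepares (but does not spawn) the PT process with that environment.
fn server_command(binary: &str, p: &ServerPtParams<'_>) -> Command {
    let mut cmd = Command::new(binary);
    cmd.envs(server_env(p));
    cmd
}

fn main() {
    let params = ServerPtParams {
        transport: "obfs4",
        bind_addr: "127.0.0.1:4200",
        or_port: "127.0.0.1:9001",
        state_dir: "/tmp/pt-state",
    };
    for (key, value) in server_env(&params) {
        println!("{}={}", key, value);
    }
    // Placeholder binary path; calling `.spawn()` on this would start the PT.
    let _cmd = server_command("/usr/bin/obfs4proxy", &params);
}
```

After launch, the transport reports what it actually bound to by printing lines on its standard output, which the parent has to parse; that part is left out of the sketch.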
The following two MRs were merged but their work was later undone as it was deemed unnecessary:
All that is currently left is to get final reviews from the Arti developers and open an MR to merge my work, which currently lives in a dedicated repo, into the main Arti repo.
After some discussion with the developers, it was decided that these example projects will exist in a top-level examples/ directory.
I had an overwhelmingly positive experience in GSoC this year. Here’s a brief, non-exhaustive list why:
Since I only picked up Rust recently, I’ve ended up learning a lot of advanced Rust-isms from working on this project as well.
I learned a lot about communicating across timezones, cultures etc. and using less ambiguous language in discussions to get my point across in a clearer manner. People tend to neglect the human aspect of open source development, but in my opinion, it is almost impossible for a good developer to not have good communication skills, especially proficiency in written English.
I was given a ton of autonomy when it came to accomplishing my goals, and whenever I needed help I would get really good advice. This helped me develop my own research and self-learning skills further, while also being able to learn from my mentors in case a problem was too specific to this project to reason about alone.
This didn’t mean I only popped up when things went wrong, though; I always made sure to update Nick about my plans for the week every Monday, which gave him the chance to suggest changes if my approach was flawed and to keep track of my progress.
I am extremely thankful for this opportunity, and I hope that my work has been helpful to Arti and the Tor Project. I’ve ended up learning a lot, and it is in no small part due to the help I’ve gotten from all the developers I ended up interacting with (special mentions to nickm, Diziet, gabi, trinity, arma, shelikhoo, eta, ahf, and meskio).
I’ve gained a ton of confidence in developing and contributing to open source projects of varying scales, and I hope to continue the good work started and outlined here in the future as well.
Copyright © 2024 Saksham Mittal. All rights reserved. Unless otherwise stated, all content on this website is licensed under the CC BY-SA 4.0 International License