Nonchalant Guidance



Added on: Sunday, 04 June, 2023 | Updated on: Sunday, 04 June, 2023

GSoC 2023 Blog 1

Hello there! This will be the first of many blog posts detailing the work I will be doing for the Tor Project as part of Google Summer of Code 2023.

In each post I will cover the work I did over a given period, the challenges I faced, and the outcomes of that work.

Brief Intro on My Project

The project I was selected for is titled “Arti API exploration to build example tools”.

Confused? Let’s break it down.

Arti is the Tor Project's implementation of Tor in Rust. It exports some crates that other developers can use in their Rust projects, and has some documentation on them, including some basic demonstration code snippets that you can follow along with at home.

My goal will be to build my own sample programs on top of these crates and document any difficulties that come up.

In this way, the project can get valuable feedback from an outsider who doesn’t have much knowledge of the codebase and the way the Arti proxy does things.

Reorganizing work repository

A bit before the GSoC contributor applications opened up, I created a repo which housed my attempt at building something using arti-client and arti-hyper: a download manager which would download Tor Browser from the Tor Project website.

I only started this to play around with the existing APIs, but I actually found a bug on my first day.

I continued to hack on this project and even included it in my proposal, both to link to and to continue to work on in the GSoC work period.

I have worked on the project for a bit, and it is now able to create six different circuits through the Tor network (so requests can go through up to six different exit nodes) and download the Linux version of Tor Browser through Tor.

Now that I have been selected, it would probably be a good idea to use this repo to house all my work for this summer, not just the download manager.

To do this, I looked to the Arti repo for inspiration.

Arti’s main repo houses all the crates that Arti needs or exports. The root directory of the repository has a crates/ folder which has all the different crates in it.

The Cargo.toml in the root directory is configured to create a cargo workspace, which means that it tells cargo that there are multiple crates inside this one repo.

Once I added my projects inside the crates folder, I was able to declare which one I wanted to run using cargo run --bin <crate-name>.
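As a rough sketch of what such a root Cargo.toml can look like: the three member names below are hypothetical placeholders for the sample programs, not the actual crate names in my repo.

```toml
# Root Cargo.toml: declares the workspace and lists its member crates.
# The crate directory names are illustrative assumptions.
[workspace]
members = [
    "crates/download-manager",
    "crates/connection-checker",
    "crates/dns-resolver",
]
```

With a layout like this, running cargo run --bin dns-resolver from the repository root would build and run only that binary, assuming its binary name matches the crate name.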

Making connections through bridges

At this point, the download manager was working well, and I wanted to enhance it further by adding pluggable transport support. However, I was not too familiar with the APIs, so I pivoted to one of the smaller projects in my proposal: a connection checker tool which just tries to connect to a website through Tor via a normal Tor connection, a plain bridge, an obfs4 bridge, or a Snowflake bridge.

I chose to build this tool specifically to get a better feel for Arti's bridge APIs, and while developing it I did come up with some useful feedback for the Arti devs.

Since there were some issues with getting bridges working, I shifted to something else.

Making error reporting easier for new developers

When I reported this issue over IRC, nickm suggested using tor_error::Report to generate an error message instead of copy-pasting panic output or even the Display trait's output.

This was the first I saw of this trait, and it took some delving into tor-error's docs to figure out how to use it.

Essentially, tor_error::Report implements the report() method, which generates a nice, easy-to-understand error message from the error that has been caught.

So, instead of writing

function_which_fails().unwrap();

and looking at a complicated panic, you can write

match function_which_fails() {
  Ok(_) => {
    // some code here
  },
  Err(e) => {
    println!("{}", e.report());
  }
}
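To show why a report is more useful than plain Display output, here is a minimal std-only sketch that walks an error's source() chain and joins the messages. It is not tor_error's implementation, and the Inner and Outer error types and the describe_chain helper are hypothetical names made up for this illustration.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical low-level error, standing in for something like a network failure.
#[derive(Debug)]
struct Inner;

impl fmt::Display for Inner {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "connection refused")
    }
}

impl Error for Inner {}

// Hypothetical high-level error that wraps the low-level one.
#[derive(Debug)]
struct Outer(Inner);

impl fmt::Display for Outer {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "could not reach bridge")
    }
}

impl Error for Outer {
    // Expose the wrapped error so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}

/// Join an error and all of its sources into a single line.
fn describe_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut current = err.source();
    while let Some(cause) = current {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        current = cause.source();
    }
    out
}

fn main() {
    let err = Outer(Inner);
    println!("{}", err); // top-level message only
    println!("{}", describe_chain(&err)); // message plus its causes
}
```

Display alone prints only the outermost message, which is often the least informative part when an error has passed through several layers.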

Had I known this trait existed for Arti's APIs, debugging would have been much easier. So, I created an MR to add a section on Error Reporting to the docs for arti-client.

Now that this was done, I worked on another project under the proposal.

Working on the DNS resolver

Since the download manager had gotten enough work for a while and the connection checker was stalled, I decided to work on the DNS resolver, which was a sample program I chose specifically to highlight how non-HTTP(S) TCP-based protocols might utilize Arti to make their connections anonymous.

The DNS resolver will use DNS over TCP to make a query to a DNS server for a particular domain.

I researched the protocol and found that DNS over TCP was virtually identical to regular UDP-based DNS.

This teaching resource helped me understand the DNS header and payload generation, and even provided some dummy values I could use to validate my DNS request.

The first thing I did was write the structs according to the definitions given in the above teaching resource, which I cross-referenced across various other sources.

After that, I looked for a crate which could directly serialize and deserialize the structs into Vec<u8>. However, after trying serde and bincode, I realized that these crates serialize to their own formats rather than the DNS wire format, so I would have to write the serialization and deserialization code by hand.

To do that, I defined a trait AsBytes with an as_bytes() method, implemented by both the Header and Query structs (which represent the DNS header and query message respectively).
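A minimal sketch of what such a trait could look like is below. The field names and layout follow the standard DNS message format from RFC 1035 and are assumptions; they are not copied from my actual structs, and build_request is a hypothetical helper added only for the example.

```rust
/// Serialize a DNS structure into network byte order (big-endian).
trait AsBytes {
    fn as_bytes(&self) -> Vec<u8>;
}

/// Fixed 12-byte DNS header (field layout assumed from RFC 1035).
struct Header {
    id: u16,
    flags: u16,
    qdcount: u16,
    ancount: u16,
    nscount: u16,
    arcount: u16,
}

/// Question section: domain name, record type, and class.
struct Query {
    qname: String,
    qtype: u16,
    qclass: u16,
}

impl AsBytes for Header {
    fn as_bytes(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(12);
        let fields = [
            self.id, self.flags, self.qdcount,
            self.ancount, self.nscount, self.arcount,
        ];
        for field in fields {
            out.extend_from_slice(&field.to_be_bytes());
        }
        out
    }
}

impl AsBytes for Query {
    fn as_bytes(&self) -> Vec<u8> {
        let mut out = Vec::new();
        // Each label is a length byte followed by its bytes.
        // This sketch does not enforce the 63-byte label limit.
        for label in self.qname.split('.').filter(|l| !l.is_empty()) {
            out.push(label.len() as u8);
            out.extend_from_slice(label.as_bytes());
        }
        out.push(0); // zero-length root label terminates the name
        out.extend_from_slice(&self.qtype.to_be_bytes());
        out.extend_from_slice(&self.qclass.to_be_bytes());
        out
    }
}

/// Hypothetical helper: build a one-question request for an A record.
fn build_request(id: u16, name: &str) -> Vec<u8> {
    // flags 0x0100 sets only the "recursion desired" bit.
    let header = Header { id, flags: 0x0100, qdcount: 1, ancount: 0, nscount: 0, arcount: 0 };
    let query = Query { qname: name.to_string(), qtype: 1, qclass: 1 };
    let mut bytes = header.as_bytes();
    bytes.extend(query.as_bytes());
    bytes
}

fn main() {
    let request = build_request(0x1234, "example.com");
    println!("{} bytes: {:02x?}", request.len(), request);
}
```

Writing the bytes by hand keeps every field in big-endian order at a known offset, which makes it easy to compare the result against a capture in Wireshark.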

After this was done and I was able to verify that the method worked as intended, I ran into another roadblock: I was getting a response of 0 bytes every time.

While at first I did directly send these bytes over Tor, I later resorted to using tokio::net::UdpSocket or tokio::net::TcpStream in order to validate my crafted request. This was a good step, since Wireshark would reveal that my packet was not, in fact, valid.

I’d first dropped down from Tor to TCP, but when even that didn’t really work I went down to UDP. It was here that Wireshark revealed that my packet was a mangled UDP packet.

Apparently, some values weren't set right, so after fixing that and verifying in Wireshark, I went up to TCP. Here is where the statement "DNS over TCP was virtually identical to regular UDP-based DNS" falls apart.

The only real differences I could see between my packet and what dig generates for DNS over TCP were the following:

Now, I don’t really know why this happens. Keep in mind, I was sending the same exact payload over UDP and TCP, yet the previous iteration (without these hacks) only worked in UDP and not in TCP.
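For reference, one well-documented difference between the two transports is that RFC 1035 (section 4.2.2) requires every DNS message sent over TCP to be prefixed with a two-byte length field. Whether that is exactly what the hacks above amounted to is an assumption. The frame_for_tcp helper below is a hypothetical name, and the sketch only shows the framing itself.

```rust
use std::convert::TryFrom;

/// Prefix a DNS message with its length as a big-endian u16, as the
/// TCP transport requires. Returns None if the message is too long
/// for the length to fit in two bytes.
fn frame_for_tcp(message: &[u8]) -> Option<Vec<u8>> {
    let len = u16::try_from(message.len()).ok()?;
    let mut framed = Vec::with_capacity(2 + message.len());
    framed.extend_from_slice(&len.to_be_bytes());
    framed.extend_from_slice(message);
    Some(framed)
}

fn main() {
    // Placeholder payload standing in for a serialized DNS request.
    let message = [0xAAu8; 29];
    let framed = frame_for_tcp(&message).expect("message fits in a u16 length");
    println!("first two bytes: {:02x?}", &framed[..2]);
}
```

UDP preserves datagram boundaries, so the receiver knows where a message ends. TCP is a byte stream, so the length prefix is what tells the server how many bytes make up one message.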

Even after doing this, I could see that although my machine got a response back, all my Rust program saw was zeroes. I later figured out that once I fixed the size of the buffer I used to store the response, the response came through.
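One common Rust pitfall that produces this kind of symptom is passing a buffer with capacity but zero length to read(), since read() only fills the slice it is given. The sketch below is an assumption about the kind of bug involved, not the resolver's actual code. It uses an in-memory Cursor instead of a real socket, and both function names are hypothetical.

```rust
use std::io::{Cursor, Read};

/// Read once into a Vec created with `with_capacity`: its length is 0,
/// so the slice handed to `read` is empty and nothing can be stored.
fn read_into_empty(mut source: impl Read) -> usize {
    let mut buf: Vec<u8> = Vec::with_capacity(512);
    source.read(&mut buf).unwrap()
}

/// Read once into a Vec that really has 512 bytes of length,
/// then keep only the bytes that were actually read.
fn read_into_sized(mut source: impl Read) -> Vec<u8> {
    let mut buf = vec![0u8; 512];
    let n = source.read(&mut buf).unwrap();
    buf.truncate(n);
    buf
}

fn main() {
    // Stand-in for a response arriving on a socket.
    let response = vec![0x12u8, 0x34, 0x81, 0x80];
    println!("empty buffer read {} bytes", read_into_empty(Cursor::new(response.clone())));
    println!("sized buffer read {:02x?}", read_into_sized(Cursor::new(response)));
}
```

With a real TCP stream, a single read() may also return fewer bytes than the full message, so reading the two-byte length first and then using read_exact for the body is a more robust pattern.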

Conclusion

This week was just the start, and I’ve been learning something new almost every day. Here’s to more progress in the coming weeks and more improvements to Arti!



Copyright © 2023 Saksham Mittal. All rights reserved. Unless otherwise stated, all content on this website is licensed under the CC BY-SA 4.0 International License