zabka.it

Introduction to my Bachelor's Thesis Project

The aim of my bachelor’s thesis is to introduce a new service into Firefox that allows instrumenting arbitrary WebAPIs to record which parameters they are called with and in which context.

Motivation

The World Wide Web and its technology constitute the most diverse platform on the planet. This diversity means the web platform is used in many different and interesting ways. Which WebAPIs are used, and how, is a question that matters to both researchers and browser maintainers. My project aims to produce a tool that helps answer this question, to document its features, and to enable new types of research in this space.

Prior Work

OpenWPM, a tool to collect data about tracking on the web, currently injects a script into the website's JavaScript context. The script shadows the APIs to be instrumented so that it can capture the data before passing the call on to the browser. This approach is too slow to instrument the entirety of the WebAPI and also causes some websites to break. Below is an image of the current architecture.

Architecture diagram of the current system
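To make "shadowing" concrete, here is a minimal sketch of the general technique in plain JavaScript. It is not OpenWPM's actual code: `FakeNavigator`, `shadowGetter` and `records` are hypothetical names used only for illustration, and `FakeNavigator` stands in for a browser-provided object so that the snippet runs outside a browser.

```javascript
// Stand-in for a browser-provided object (assumption for illustration);
// in a real page the target would be something like Navigator.prototype.
class FakeNavigator {
  get userAgent() { return "ExampleAgent/1.0"; }
}

// Captured accesses end up here (hypothetical storage).
const records = [];

// Replace a getter on `proto` with a wrapper that records the access
// and then delegates to the original getter.
function shadowGetter(proto, name, apiLabel) {
  const descriptor = Object.getOwnPropertyDescriptor(proto, name);
  const originalGet = descriptor.get;
  Object.defineProperty(proto, name, {
    ...descriptor,
    get() {
      const value = originalGet.call(this);
      records.push({ api: apiLabel, operation: "get", value });
      return value;
    },
  });
}

shadowGetter(FakeNavigator.prototype, "userAgent", "window.navigator.userAgent");

const nav = new FakeNavigator();
const ua = nav.userAgent; // goes through the wrapper, which records the access
```

Every instrumented access now runs extra JavaScript inside the page's own context, which is where both the overhead and the risk of breaking pages come from.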

Umar Iqbal has previously implemented similar hooks for instrumentation for WebKit/Safari.

PageGraph (based on initial work done in AdGraph) implements a graph view of all website behavior in Brave, V8 and Blink.

Proposal

I suggest writing a component in Rust and exposing interfaces both to C++ to get events from the code that implements the WebAPI and to JavaScript so consumers can dynamically process the events they are interested in. My goal would be to land this code in Mozilla Central so it becomes part of the canonical Firefox codebase. A simplified view of the new architecture would be:

Architecture diagram of the new system

This approach has a number of benefits over the current implementation in OpenWPM.

  1. By removing the instrumentation from the JS context of the website, it becomes invisible to page scripts. This prevents us from accidentally breaking websites, and it prevents tracking scripts from changing their behavior when they detect us.

  2. By implementing the instrumentation in Rust we gain two benefits:

    • As a systems programming language, Rust should be fast enough to instrument the entirety of the WebAPI without freezing the browser.
    • Rust is a language created to “empower[ing] everyone to build reliable […] software”, so I think it will make it easier to maintain the code over time as Firefox changes around it.
  3. By landing this feature in Mozilla Central, other parts of the organization, such as telemetry or WebCompat, can use it to capture usage of any WebAPIs they are interested in by simply writing a bit of JS that registers an observer for those APIs.
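The detection problem from the first point can be illustrated with a small sketch. It shows one common heuristic a page script could use, under the assumption that the instrumentation replaces a native function with a JavaScript wrapper. `looksNative` and `target` are hypothetical names, and `Date.now` merely stands in for an instrumented WebAPI function.

```javascript
// Heuristic: built-in functions stringify to a body containing "[native code]",
// while functions written in JavaScript stringify to their source.
function looksNative(fn) {
  return Function.prototype.toString.call(fn).includes("[native code]");
}

// Stand-in for a WebAPI object with one native method (assumption).
const target = { now: Date.now };
const nativeBefore = looksNative(target.now);

// In-page instrumentation replaces the method with a wrapper...
const original = target.now;
target.now = function (...args) {
  return original.apply(this, args);
};

// ...and the wrapper no longer looks like a built-in.
const nativeAfter = looksNative(target.now);
```

Instrumentation that lives in the browser's native code never replaces the function object the page sees, so a check of this kind has nothing to find.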

Implementation

The component will use XPCOM (the Cross-Platform Component Object Model, Firefox's internal component system) to expose an interface across language boundaries, making it look similar to other services in the Firefox codebase.

I expect the instrumented native code to look like this:

#ifdef WEB_API_INSTRUMENTATION
    // Only do any work when the feature is enabled via a preference.
    if (Preferences::GetBool("experimental.web_api_instrumentation")) {
        auto const& webApiInstrumentationService = getApiInstrumentationService();
        // Skip building the event if no consumer is interested in this API.
        if (webApiInstrumentationService.shouldSubmitEvent("window.navigator")) {
            webApiInstrumentationService.submitEvent(
                "window.navigator"
                // Further arguments would attach more context, such as the
                // call stack or the arguments passed.
            );
        }
    }
#endif

And the consumer code to look like this:

const event_handler = (event) => {
    // Optionally transform the event before storing it
    database.saveRecord("javascript", event);
};

const subscriber_id = browser.js_instrument.subscribe(settings, event_handler);
// visit the page
browser.js_instrument.unsubscribe(subscriber_id);

More information can be found here.
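To show how the two snippets above could fit together, here is a minimal in-memory sketch of the dispatch logic in plain JavaScript. It is only a model of the idea, not the planned Rust/XPCOM implementation. The method names follow the snippets above, while the class name `InstrumentationService` and the `settings.apis` shape are assumptions made for illustration.

```javascript
// Hypothetical in-memory model of the proposed service.
class InstrumentationService {
  constructor() {
    this.nextId = 1;
    this.subscribers = new Map(); // id -> { apis: Set<string>, handler }
  }

  // `settings.apis` (a list of API names) is an assumed shape for `settings`.
  subscribe(settings, handler) {
    const id = this.nextId++;
    this.subscribers.set(id, { apis: new Set(settings.apis), handler });
    return id;
  }

  unsubscribe(id) {
    this.subscribers.delete(id);
  }

  // Cheap check so that instrumented code can skip building event
  // payloads that nobody would receive.
  shouldSubmitEvent(api) {
    for (const { apis } of this.subscribers.values()) {
      if (apis.has(api)) return true;
    }
    return false;
  }

  // Deliver the event to every subscriber interested in this API.
  submitEvent(api, details) {
    for (const { apis, handler } of this.subscribers.values()) {
      if (apis.has(api)) handler({ api, ...details });
    }
  }
}

const service = new InstrumentationService();
const received = [];
const id = service.subscribe({ apis: ["window.navigator"] }, (e) => received.push(e));

// What an instrumented call site would do:
if (service.shouldSubmitEvent("window.navigator")) {
  service.submitEvent("window.navigator", { args: [] });
}

service.unsubscribe(id);
const activeAfterUnsubscribe = service.shouldSubmitEvent("window.navigator");
```

Separating `shouldSubmitEvent` from `submitEvent` keeps the common case cheap: when nobody is subscribed to an API, the call site pays for one lookup and never collects a call stack or arguments.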

Outlook

This blog post is mostly aspirational and aims to outline the basic structure of the project ahead. I will write follow-up blog posts that go into more detail on different topics relevant to this work. All of these posts will have the tag “Bachelors Thesis”.