Implementation Ideas

This is a loose collection of my current ideas about the implementation.

General

As discussed with @ckerschb, there can’t be any overhead for release Firefox builds, so all changes to the DOM code, where the WebAPIs are implemented, will have to be both ifdef-ed out and placed behind a preference. This should keep the overhead minimal for non-release builds and eliminate it entirely for release builds.

Producers

All relevant call sites will have to obtain our service and submit an event to it that contains all relevant information.

#ifdef CALLMONITOR
    if (Preferences::GetBool("experimental.callmonitor")) {
        // Obtain the (proposed) event manager service
        auto const& callMonitorEventManager = getCallMonitorEventManager();
        // Check first whether this event should be submitted at all
        if (callMonitorEventManager.shouldSubmitEvent("window.navigator")) {
            callMonitorEventManager.submitEvent(
                "window.navigator"
                // Attach more context such as callstack or arguments passed
            );
        }
    }
#endif

Consumers

Current behavior for consumers

The OpenWPM WebExtension submits a rather complicated settings object that specifies which APIs are to be instrumented and in which way. For each object it instruments, the WebExtension can specify the following:

  • Which existing and non-existing properties to instrument, and which properties to exclude
  • Whether function calls should be serialized to strings
  • Whether “get” operations on properties that are functions should be logged
  • Whether overriding of properties should be prevented, so that nested objects or functions cannot be changed
  • Whether a property should be instrumented recursively, and how many levels deep the recursion should go

The instrument obtains a reference to the logger and writes out the captured calls to it.
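
To make the options above more concrete, here is a minimal sketch of what one entry in such a settings object might look like, together with a small check that decides whether a property falls under it. The field names are assumptions chosen to mirror the bullet points; they should not be read as the exact OpenWPM schema, and `shouldInstrument` is a hypothetical helper.

```javascript
// Hypothetical settings entry for one instrumented object.
// Field names are illustrative assumptions, not the exact OpenWPM schema.
const navigatorSettings = {
  object: "window.navigator", // which object to instrument
  logSettings: {
    propertiesToInstrument: ["userAgent", "platform"],
    nonExistingPropertiesToInstrument: ["deviceMemory"],
    excludedProperties: ["plugins"],
    logFunctionsAsStrings: true, // serialize functions to strings
    logFunctionGets: true,       // log "get" on properties that are functions
    preventSets: false,          // do not block overriding of properties
    recursive: true,             // instrument nested properties as well
    depth: 2,                    // how many levels deep to recurse
  },
};

// Hypothetical helper: decide whether a property is covered by the settings.
// Exclusions take precedence over both inclusion lists.
function shouldInstrument(settings, property) {
  const s = settings.logSettings;
  if (s.excludedProperties.includes(property)) return false;
  return (
    s.propertiesToInstrument.includes(property) ||
    s.nonExistingPropertiesToInstrument.includes(property)
  );
}
```

Giving exclusions precedence is only one possible reading of how the three property lists interact.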

New suggested behavior

When it comes to configuration, I think we should follow the precedent set by the existing settings object. However, I think an XPCOM-based instrumentation doesn’t need to implement the freezing of nested objects and functions, since we don’t lose our instrumentation when such an object is overwritten. I also think that, in the interest of keeping the Rust code simple, functions should always be serialized and “get” operations should always be logged.

Users should be able to write JavaScript along the lines of:

const event_handler = (event) => {
    //Maybe some transformation
    logging_db.saveRecord("javascript", event)
}

let subscriber_id = browser.callMonitor.subscribe(settings, event_handler)
// visit the page
browser.callMonitor.unsubscribe(subscriber_id)
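
The subscribe/unsubscribe lifecycle above can be sketched with a small in-memory stand-in. This is only an illustration under assumptions: `CallMonitorRegistry` and the `settings.symbols` field are hypothetical, and the real `browser.callMonitor` would be backed by native code rather than a JavaScript class. The method names `shouldSubmitEvent` and `submitEvent` are borrowed from the producer snippet.

```javascript
// In-memory stand-in for the proposed browser.callMonitor API.
// Everything here is hypothetical and only illustrates the lifecycle.
class CallMonitorRegistry {
  constructor() {
    this.nextId = 1;
    this.subscribers = new Map(); // subscriber_id -> { settings, handler }
  }

  // Register a handler and return an id that can later be used to unsubscribe.
  subscribe(settings, handler) {
    const id = this.nextId++;
    this.subscribers.set(id, { settings, handler });
    return id;
  }

  // Returns true if a subscriber with that id existed and was removed.
  unsubscribe(id) {
    return this.subscribers.delete(id);
  }

  // Producer-side check: is any subscriber interested in this symbol?
  shouldSubmitEvent(symbol) {
    for (const { settings } of this.subscribers.values()) {
      if (settings.symbols.includes(symbol)) return true;
    }
    return false;
  }

  // Deliver an event to every subscriber whose settings include its symbol.
  submitEvent(event) {
    for (const { settings, handler } of this.subscribers.values()) {
      if (settings.symbols.includes(event.symbol)) handler(event);
    }
  }
}

const callMonitor = new CallMonitorRegistry();
const received = [];
const id = callMonitor.subscribe(
  { symbols: ["window.navigator"] },
  (event) => received.push(event)
);
callMonitor.submitEvent({ symbol: "window.navigator", operation: "get" });
callMonitor.unsubscribe(id);
// After unsubscribing, this event is no longer delivered.
callMonitor.submitEvent({ symbol: "window.navigator", operation: "get" });
```

Returning an opaque id from `subscribe` keeps the consumer from having to hold on to the handler function itself in order to unsubscribe.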

Current data being captured

Some of these values are self-explanatory, but for others I still need to ask for explanations.

Column Name                 Type    Optional  Description
incognito                   int32
crawl_id                    uint32            OpenWPM internal identifier
visit_id                    int64             OpenWPM internal identifier
instance_id                 uint32  True      OpenWPM internal identifier
extension_session_uuid      string
event_ordinal               int64
page_scoped_event_ordinal   int64
window_id                   int64
tab_id                      int64
frame_id                    int64
script_url                  string
script_line                 string
script_col                  string
func_name                   string
script_loc_eval             string
document_url                string
top_level_url               string
call_stack                  string
symbol                      string
operation                   string
value                       string
arguments                   string
time_stamp                  string  True

    const msg = {
      operation,
      symbol: instrumentedVariableName,
      value: serializeObject(value, logSettings.logFunctionsAsStrings),
      scriptUrl: callContext.scriptUrl,
      scriptLine: callContext.scriptLine,
      scriptCol: callContext.scriptCol,
      funcName: callContext.funcName,
      scriptLocEval: callContext.scriptLocEval,
      callStack: callContext.callStack,
      ordinal: ordinal++,
    };
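
The message fields above are camelCase, while the column names in the table are snake_case. A minimal sketch of that purely mechanical conversion follows. `camelToSnake` and `toRecord` are hypothetical helpers, and the sketch deliberately does not map `ordinal` to either of the ordinal columns or fill in columns such as `visit_id`, since the snippet above does not show where those come from.

```javascript
// Convert a camelCase key such as "scriptUrl" to snake_case ("script_url").
function camelToSnake(key) {
  return key.replace(/[A-Z]/g, (ch) => "_" + ch.toLowerCase());
}

// Hypothetical helper: copy a message into a flat record with snake_case keys.
function toRecord(msg) {
  const record = {};
  for (const [key, value] of Object.entries(msg)) {
    record[camelToSnake(key)] = value;
  }
  return record;
}

// Example message with made-up values.
const record = toRecord({
  operation: "get",
  symbol: "window.navigator.userAgent",
  scriptUrl: "https://example.com/a.js",
  funcName: "probe",
});
```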