18 March 2022
by Jack Fox Keen
In our Lightning Bug blog post, we tackled the macrocosmic question, “What is the zenith of all possibilities for analysis with ProofMode?” We are now zooming into the microcosm of a single dataset, working to answer the following question: how do we check the data against itself? From the chorus of synchronicity, we are homing in on the tune of an individual.
The first and foremost layer of the ProofMode data is Integrity. ProofMode uses OpenPGP cryptography to sign every media file and ProofMode data file, and includes SHA-256 hash values of the media files in the proof data itself. That hash value can also be shared with third-party notary services such as OpenTimestamps and Google SafetyNet, sent privately through a messaging app or email, or posted more publicly to services like Twitter or Facebook. These methods all establish that this set of bytes existed as of the time the hash value was shared or published.
Verification of the OpenTimestamps signature as outlined in our recent ProofMode release
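To make that integrity check concrete, here is a minimal Python sketch that recomputes a media file’s SHA-256 hash and compares it against the value recorded in the accompanying proof CSV. The file names and the “File Hash SHA256” column name are assumptions for illustration; check the header row of your own proof files for the exact field name.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hex digest of a file, reading in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_integrity(media_path: Path, proof_csv: Path,
                    hash_column: str = "File Hash SHA256") -> bool:
    """Compare a media file's hash against the value recorded in the proof CSV.

    The column name is an assumption for illustration only; verify it against
    the header row of the proof data produced by your own device.
    """
    actual = sha256_of(media_path).lower()
    with proof_csv.open(newline="") as f:
        for row in csv.DictReader(f):
            recorded = row.get(hash_column, "").strip().lower()
            if recorded and recorded == actual:
                return True
    return False


if __name__ == "__main__":
    ok = check_integrity(Path("IMG_0001.jpg"), Path("IMG_0001.jpg.proof.csv"))
    print("hash matches proof data" if ok else "hash mismatch: file may have been altered")
```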
The second layer is Consistency.
Such a simple graph can actually tell us a lot. As we can see, three of the devices did not provide any information for the Location Provider (dark green bar, per the legend).
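For readers who want to reproduce this kind of check on their own data, the sketch below tallies, per device, how many proof rows leave the Location Provider field empty. It assumes the proof rows from all devices have been merged into one CSV; the “DeviceID” and “Location.Provider” column names are illustrative and may differ from the actual ProofMode headers.

```python
import pandas as pd

# Load an aggregate of proof CSV rows collected from several devices.
# The file name and column names are placeholders for illustration.
proof = pd.read_csv("all_devices_proof.csv")

# Treat empty strings and NaN alike as "no information provided".
provider = proof["Location.Provider"].replace("", pd.NA)

# Count, per device, how many rows carry a Location Provider value and how many do not.
summary = (
    proof.assign(has_provider=provider.notna())
         .groupby("DeviceID")["has_provider"]
         .agg(total="size", with_provider="sum")
)
summary["missing"] = summary["total"] - summary["with_provider"]
print(summary.sort_values("missing", ascending=False))
```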
The third layer is Synchrony.
As we work through our evaluation of these three layers (integrity, consistency, and synchrony), we of course run into more questions. How does one rank these three layers? Integrity is a binary evaluation: either a hash matches or it doesn’t. Consistency introduces a layer of qualitative evaluation. As users add and remove data collected in their settings, how should this impact the overall metric? For example, as we saw in the histogram above, we have three users who did not include location information. Likewise, we had one user who included information about Networks; had they not included this information, we would not have realized that this device’s cell info includes data for multiple cell towers. How should the overall result be interpreted? How does the “proof” aspect of ProofMode change?
Finally, what is the best and most intuitive way to summarize this information for the end user? Our first iteration of a data summary visual has been a radar chart. Each parameter of evaluation is ranked on a scale from 1 to 5, and each parameter can be evaluated for different attributes of the data. For example, in the following radar chart, we have our three parameters of evaluation performed for the hardware aspects of proof and the software aspects.
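For anyone who wants to experiment with this kind of summary, below is a minimal matplotlib sketch of a radar chart over the three evaluation parameters. The hardware and software scores are placeholder values chosen for illustration, not measurements from our dataset.

```python
import numpy as np
import matplotlib.pyplot as plt

# Three evaluation parameters, each scored from 1 to 5.
# The scores below are placeholder values, not real results.
labels = ["Integrity", "Consistency", "Synchrony"]
hardware = [5, 3, 4]
software = [5, 4, 2]

# One axis per parameter; repeat the first point to close the polygon.
angles = np.linspace(0, 2 * np.pi, len(labels), endpoint=False).tolist()
angles += angles[:1]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
for name, scores in (("Hardware", hardware), ("Software", software)):
    values = scores + scores[:1]
    ax.plot(angles, values, label=name)
    ax.fill(angles, values, alpha=0.15)

ax.set_xticks(angles[:-1])
ax.set_xticklabels(labels)
ax.set_ylim(0, 5)
ax.legend(loc="upper right")
plt.show()
```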
We intend to improve by thinking about how to visualize integrity, a binary result, in a binary format. How do we coalesce binary results with qualitative results? Perhaps one summary visual is simply unrealistic. These scales cannot easily coexist on one axis.
Other questions that have arisen include: is it possible to show when one individual component is unreliable? What would happen if we collected dimensions of data that were less reliable individually, but more reliable in aggregate? Where can one weak variable support another weak variable to create a stronger metric together? Among the data in our ProofMode legend, which variables could feasibly contribute to these hypothetical scenarios?
Par for the course, answering these questions will inevitably generate a hydra of new questions. In addition to our preliminary analytics stage, we are exploring threat models and working towards coalescing years of conversations and documentation around safety. We are excited for the journey, and encourage feedback on these questions and more via our Contact Page. We also welcome feedback on our GitHub, including the code for our Consistency Check. If you would like to be a part of the active conversation around ProofMode’s development, please join the ProofMode Public Matrix Room!