Data privacy for at-risk species

Elizabeth Bondi and I were concerned by the assumption that geo-obfuscating the GPS location of data (i.e., adding random jitter within some range) would be sufficient to preserve location privacy, particularly when publishing camera trap data. We dug into it a bit and found that some very simple methods could massively reduce the search area, and wrote it up for the AI for Animals workshop at CVPR this year (attached to this thread). I'd love to have a conversation about best practices for publishing data on at-risk species, and to brainstorm how we can still allow de-siloed, cross-organization research without putting sensitive species at risk.
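For anyone unfamiliar with what "random jitter within some range" looks like in practice, here is a minimal Python sketch. It is not the method from the workshop paper; it only illustrates one generic weakness, namely that if the same true location is re-jittered independently for many records, simply averaging the published points recovers it far more precisely than the jitter radius suggests. The coordinates, the 5 km radius, the 200 records, and the function names `jitter_location` and `offset_m` are all assumptions made up for the illustration.

```python
import math
import random

METERS_PER_DEGREE_LAT = 111_320.0  # approximate; assumes a spherical Earth


def jitter_location(lat, lon, max_radius_m, rng):
    """Return (lat, lon) displaced uniformly at random within a disk."""
    # The square root spreads points evenly over the disk's area
    # instead of clustering them near the center.
    distance = max_radius_m * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    d_north = distance * math.cos(bearing)
    d_east = distance * math.sin(bearing)
    new_lat = lat + d_north / METERS_PER_DEGREE_LAT
    new_lon = lon + d_east / (METERS_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return new_lat, new_lon


def offset_m(lat_a, lon_a, lat_b, lon_b):
    """Approximate distance in meters between two nearby points."""
    d_north = (lat_b - lat_a) * METERS_PER_DEGREE_LAT
    d_east = (lon_b - lon_a) * METERS_PER_DEGREE_LAT * math.cos(math.radians(lat_a))
    return math.hypot(d_north, d_east)


rng = random.Random(0)
true_lat, true_lon = -2.5, 34.8  # hypothetical camera location
radius_m = 5000.0                # hypothetical jitter radius

# One jittered release: the error can be anything up to radius_m.
single = jitter_location(true_lat, true_lon, radius_m, rng)
single_error_m = offset_m(true_lat, true_lon, *single)

# The same camera, re-jittered independently for each of 200 records,
# then averaged by someone trying to undo the obfuscation.
releases = [jitter_location(true_lat, true_lon, radius_m, rng) for _ in range(200)]
mean_lat = sum(p[0] for p in releases) / len(releases)
mean_lon = sum(p[1] for p in releases) / len(releases)
averaged_error_m = offset_m(true_lat, true_lon, mean_lat, mean_lon)

print(f"single release error: {single_error_m:.0f} m")
print(f"error after averaging 200 releases: {averaged_error_m:.0f} m")
```

Applying one fixed offset per camera avoids this particular averaging attack, but it does nothing against other ways of narrowing the search area.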




Rob Appleby
@Rob_Appleby  | He/him
Wild Spy
Whilst I love everything about WILDLABS and the conservation tech community I am mostly here for the badges!!

These topics are blowing my mind, and not in a good way! I just woke up with night sweats after watching Doug, Trishant, Laure and Koustubh's Tech Tutor talk, and now this, Sara!

I love open-source and open access so much, but I am super concerned, as I am sure everyone else is, about how to handle these issues. It's all really tricky. Maybe there needs to be (and there probably already is in many cases) a kind of "chain-of-custody" approach to open data, so that any and all uses can be traced back to their sources? For example, a single, central repository for accessing data sets, where people have to provide a reasonable amount of personal/organisational information to gain access and agree not to share the data by any means other than the repository. Urgh, even writing these words is making me cringe!
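To make the "chain-of-custody" idea a little more concrete, here is a rough Python sketch of a tamper-evident access log, where each entry stores a hash of the previous one so that later edits to the history can be detected. This is only one possible reading of the idea, not an existing repository's design; the field names, the example users and dataset ID, and the functions `append_access` and `verify_log` are all hypothetical. It records who accessed what through the repository, but it cannot by itself stop someone from sharing the data elsewhere.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def record_hash(record):
    """Hash a record deterministically (keys sorted before serializing)."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_access(log, user, dataset_id, purpose):
    """Append an access entry that is linked to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {
        "user": user,
        "dataset_id": dataset_id,
        "purpose": purpose,
        "prev_hash": prev_hash,
    }
    entry = dict(body, hash=record_hash(body))
    log.append(entry)
    return entry


def verify_log(log):
    """Return True if no entry has been altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or record_hash(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_access(log, "researcher@example.org", "camera-traps-2021", "occupancy modelling")
append_access(log, "ngo@example.org", "camera-traps-2021", "ranger patrol planning")
intact = verify_log(log)

# Quietly rewriting an earlier entry breaks the chain.
log[0]["purpose"] = "unknown"
tampered_detected = not verify_log(log)
```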

I may never sleep again!

Interesting question and kudos for questioning assumptions!

It seems clear that obfuscation only works to the extent that the cost of de-obfuscation is greater than the expected reward.

For high-value, highly endangered species, a better approach might be to actually encrypt the data and establish a public key registry of trusted orgs and researchers. Only those whose public keys have been accepted into the chain of trust could decrypt the data (using their corresponding private keys).

This would be a semi-decentralized solution, based on the chain of trust.

 

Arshad Noor
@arshadnoor  | He
StrongKey
Specialized in security and privacy; creator and supporter of open-source solutions.

These are real-world problems that are not only complex, but also challenging to address because of a general unwillingness of most users to give up convenience for security.

I work for a company that solves some of the most complex data-security/privacy problems and am happy to have a discussion around this if there is serious interest.

If you haven't read them, these two papers propose decision-making frameworks for sensitive animal occurrence data:

Tulloch AIT, Auerbach N, Avery-Gomm S, Bayraktarov E, Butt N, Dickman CR, Ehmke G, Fisher DO, Grantham H, Holden MH, et al. 2018. A decision tree for assessing the risks and benefits of publishing biodiversity data. Nature Ecol. Evol. 2: 1209-1217. https://doi.org/10.1038/s41559-018-0608-1

Lennox RJ, Harcourt R, Bennett JR, Davies A, Ford AT, Frey RM, Hayward MW, Hussey NE, Iverson SJ, Kays R. 2020. A novel framework to protect animal data in a world of ecosurveillance. BioScience 70(6): 468–476. https://doi.org/10.1093/biosci/biaa035

The Research Data Alliance has 2 relevant interest groups:

Sensitive Data IG: This is just getting started.

Data Policy Standardization and Implementation IG: For those publishing results, this group is working on advising more consistent data-access policies. In the paper below resulting from this group's work, they give some general suggestions for defining exceptions to open data policies that include sensitive species data:

Hrynaszkiewicz I, Simons N, Hussain A, Grant R, Goudie S. 2020. Developing a research data policy framework for all journals and publishers. Data Science Journal 19(1): 5. http://doi.org/10.5334/dsj-2020-005

If others know of relevant resources please share!