OK, you asked for it..
Anonymity in New Zealand and Australia for...
Primary research data:
As a university researcher, I always promise to store securely, limit access to a small group of named researchers, analyse and summarise so that individual participants cannot be identified, and eventually destroy after a fixed number of years all the primary survey and qualitative data I collect. As far as I know, this is mandatory if I want to get ethical approval for my research from my university, and ethical approval is in turn mandatory for gathering primary research data. So nobody outside the prior-nominated researcher group gets to see the data, anonymised or otherwise. Sometimes, to remove potential researcher bias, the data is also anonymised for some of the researchers.
Secondary internal data:
There are commercial customer privacy laws that prevent firms from collecting and retaining customer data unless they can demonstrate that it is part of their legitimate business activity, where legitimate business activity is understood to include any analysis designed to generate improvements in customer value and experience. Generally, improvements in customer experience and value can in turn be tied to improvements in business efficiency and profitability. Analysing internal secondary data for anti-competitive and market-dominance-maintaining reasons is explicitly prohibited on penalty of severe fines, as is lax data firewalling and handling security. For this type of data to be made publicly available for general learning, an Australasian firm would have to assume responsibility for the effectiveness of any anonymising process, to ensure its customers can never be identified.
So I can sort of understand why corporate legal counsel insist on data anonymising protocols that err on the conservative side, i.e. go over the top, before any of it is released into the wild.
Secondary external data:
This is where it gets a little tricky for me.
Data and metadata that are largely or primarily machine-generated and do not explicitly identify individual users, like Google page stats for selected government public websites, should be OK to analyse and reproduce unless explicitly protected by an agreed protocol (not that any come to mind apart from Creative Commons variations). However, I'm never entirely sure that someone smarter than me couldn't identify users somehow.
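To make that worry a bit more concrete, here is a minimal sketch (my own illustration, not anything from the sources above) of a k-anonymity check on a hypothetical "anonymised" page-stats export. The field names and records are invented; the point is only that a combination of coarse, non-identifying fields can still single one visitor out.

```python
# Sketch only: k-anonymity check on hypothetical, already-"anonymised" records.
# Column names (region, device, visit_hour) are invented for illustration.
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Smallest group size when records are grouped by the quasi-identifier
    fields; k = 1 means at least one person is uniquely re-identifiable."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values())

# No names anywhere, yet the third visitor is unique on these three fields.
records = [
    {"region": "Wellington",   "device": "mobile",  "visit_hour": 9},
    {"region": "Wellington",   "device": "mobile",  "visit_hour": 9},
    {"region": "Invercargill", "device": "desktop", "visit_hour": 3},
]

k = k_anonymity(records, ["region", "device", "visit_hour"])
print(f"k-anonymity = {k}")  # prints k-anonymity = 1
```

So even a dataset with no names in it can have k = 1, which is exactly the "someone smarter than me" re-identification risk.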
Any data in the public domain generated by someone typing something, like I am doing right now, is generally protected by author copyright, which means, at a minimum, that any explicit reproduction should acknowledge authorship during the period of copyright and, as in the case of Creative Commons, adhere to whatever other protocols apply. Reproduction and explicit CC or other protocols aside, the data and metadata should be fairly available for analysis, which generally summarises the heck out of what are usually big corpora and structured data and metadata. But when we type, do we ever stop and think about how someone might be openly and honestly attempting to analyse what we write, either identifying us specifically or in bulk with others' creative output? As an academic, I do**, but how widespread is that consciousness? If writers are generally not aware, or limit their awareness to specific domains, how do we, as analysts, ensure their moral rights are fairly protected?
Who analyses the analysts?
If you were tired before reading this, you'll be asleep by now.
Nitey-nite.
** I watch my own bibliographic metadata like a hawk to see who is citing me and to what extent!