When Arthur Dent complained that he had not been informed of Council’s plans to bulldoze his house for a bypass, Mr Prosser, the Council officer, calmly told him that the plans had been on display for months - in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’. Arthur found the plans there the day before the bulldozers showed up at his door.
A lot of New Zealand’s government data feels about as inaccessible as the council plans that Arthur Dent eventually found in the classic Hitchhiker’s Guide to the Galaxy. And, just like those ‘open’ council plans for Arthur’s house, government data can have consequences in the real world. Opening the data up could do a fair amount of good in building trust and enabling better policy.
Last week, InternetNZ held its annual NetHui at Te Papa. The event brings together a motley assortment of tech geeks and aficionados, policy wonks, social justice warriors and tech law experts. The sessions at NetHui can be rather wide-ranging, and you get the chance to chat with a lot of people you might not otherwise run into.
One theme running through quite a few sessions was a feeling that Kiwis are subject to Big Data rather than participants in it, able to use data to understand and shape the communities in which they live. That feeling builds resentment of the use of data in policymaking.
And it isn’t hard to understand why.
The formal barriers to accessing a lot of government data are rather substantial. New Zealand’s most sensitive data is held in secure data labs under highly restricted access. Researchers wanting to use it must prove that they have sufficient training in statistics and can only use the data within approved research projects. Very sensitive personal data held in the data lab, even though it is anonymised, has to be well-protected.
Even confidentialised random samples from larger datasets, where identifying details have been stripped out, might as well be behind signs warning to Beware of the Leopard. Access to that data is rather difficult.
But even for data that is properly open, data literacy can be a substantial barrier to participation. You have to know what data you need for any particular question, where to find it, how to get it, and then how to analyse it. Not everyone has a copy of Excel, let alone more powerful statistical software like Stata (expensive, powerful and simple to use) or R (free, powerful, and with a big learning curve).
The combined barriers mean that, for a lot of people, government data is something that’s done to them rather than something they can really use. It is even true within government. Our think tank has done a fair bit of work on education; a fairly regular, and accurate, complaint from school leaders is that while they spend countless hours in submitting data up to the Ministry of Education, they receive little back from the Ministry that could help them in improving their schools. The view from those outside of the system can be bleaker.
It all makes for a difficult problem. Social licence for data like that held in the Statistics New Zealand Integrated Data Infrastructure would disappear if anyone’s personal details were ever compromised. But social licence for that data can also disappear when the people whose details make up that data see little benefit in it and are locked out of it.
But there is a way through it.
Well over a decade ago, when I was Senior Lecturer in Economics at Canterbury, I assigned projects using sensitive microdata. The students in my course on Public Choice had to check whether survey respondents’ political policy preferences tended to line up with their personal interests and whether, broadly, policy tended to match public opinion.
We were able to do that, using American General Social Survey data, because of a wonderful web interface hosted by the University of California at Berkeley. The survey includes a lot of very sensitive personal questions, ranging from income and health to sexuality and policy preferences. Berkeley’s web interface allows anyone in the world to run simple statistical tests without ever having to see the confidential data that sits in the database. The students could check whether survey respondents’ income, or those respondents’ education, or their scores on a vocabulary test, were stronger predictors of policy preferences.
But nothing comparable exists in New Zealand, even a decade later. Anyone wanting to run even simple checks on important policy questions must jump through hoops that most people cannot clear. Meanwhile, over 100 other countries have signed up with the University of Minnesota’s Integrated Public Use Microdata Series (IPUMS), which provides the same kind of simple web tool as Berkeley to enable people to use their own data. It is easier for most Kiwis to access and use other countries’ data than it is to access their own.
That builds resentment among those who, unfortunately correctly, see data as something done to them rather than something that enables their own civic participation. When only the anointed few are allowed the key to the proverbial locked filing cabinet in the basement, and the plans they work on there matter for policies that affect people’s lives, it is not that surprising that the subjects of that data would prefer to layer on even more controls restricting access, or blow up the filing cabinet entirely.
Opening up New Zealand’s locked data filing cabinets more substantially, so that people can use their own data, would not just help ensure social licence for that data; it would also strengthen government accountability. When anyone with a web browser can run simple checks on whether changes in policy improve outcomes, it is easier to avoid the kind of surprise that Arthur Dent found lurking in the planner’s disused lavatory.