Today’s presentations on big data research at Compromised Data? raised some important questions about the role that big data is playing in academic research and government policy, as well as about the methodological challenges faced by big data researchers.
Greg Elmer‘s opening remarks positioned the ‘compromised data?’ theme in the broader context of neoliberal policies and the Canadian government’s anti-environmental policies. Joanna Redden‘s work on the increasing incorporation of big data research into Canadian policy-making and government service provision expanded on this theme. Redden pointed out that the turn towards big data is framed in the language of efficiency and money-saving, but that we should be concerned about the quality of the data being used: poverty risks being erased as those who are not online (or online less) become invisible, and as services which generate oppositional forms of knowledge have their funding cut. We should also remain aware of the ways in which a reliance on big data research can reshape government processes, altering the role of bureaucrats and the relationship between citizens and the state. We need to recognise that neoliberalism is not just a political project, but also one which aims to change how we think: big data is not neutral, but rather is easily incorporated within this system.
Taina Bucher‘s exploration of shifts in the Twitter APIs complemented this well, inviting us to look more deeply at the role of APIs in shaping how we interact with data. Bucher argues that while there’s a risk of seeing APIs as just another convenient tool to gather data, we need to critically analyse software tools and understand the power relations embedded in their design. Her empirical research in 2010 and 2011 focused on shifts in the Twitter APIs, tracing how the initial openness that helped Twitter grow was progressively closed off.
Jean Burgess and Axel Bruns also touched on the consequences of Twitter’s API as they discussed Twitter research and the politics of data access. To begin with, they point out, there’s a disproportionate focus on Twitter in academic research because its data is the easiest social media data to access. At the same time, much of the work is biased by limitations in the software tools used to study the platform: key tools like TwapperKeeper and DataSift were constrained in important ways by the changes to Twitter APIs. There are also biases that come from reaching for the low-hanging fruit: hashtags, for example, rather than more complex layers of interaction like follower networks and @replies. Burgess and Bruns argue that we need to reach beyond the easily-available data in order to build a better picture of how people are using Twitter.
Carolin Gerlitz provided one model for doing this, outlining an approach based on a model of social media as multivalent: producing data that is both standardised and vague, and therefore open to multiple readings. Gerlitz argued that more research needs to be open to the multiple use practices involved in social media. Frauke Zeller‘s work also provided useful templates for research which is open to the multiple meanings of social media texts, suggesting that there are benefits to an iterative approach in which qualitative and quantitative analysis mutually inform each other.
Daniel Pare and Mary Francoli‘s research raised concerns about existing approaches in big data research, particularly focusing on the literature on political engagement and mobilisation. Like others, they pointed out that the data which is most easily available is not necessarily the most accurate; big data research on social media becomes problematic when it’s treated as a simple proxy for broader political trends. There’s also far too little recognition of the ways in which assumptions about what ‘democracy’ means shape research on political mobilisation and engagement online, and of the inherently political nature of social media platforms.
Asta Zelenkauskaite’s work on mainstream media’s approaches to big data also highlighted the contested nature of these platforms, inviting us to consider the difference between social media engagement as a top-down process and what it might look like if it were driven by consumer interests. Sidneyeve Matrix’s presentation served as a useful complement to this, examining the shift towards niche social networks—often paid, gated communities—that support consumers’ use of their geolocative data.
The day’s presentations opened up some vital questions that are being asked far too infrequently in big data research, and in the broader big data community, about the political and methodological issues involved in the push towards big data as a magical cure-all. I’m looking forward to tomorrow’s presentations, as well as to talking about how these concerns relate to the research Tim and I are doing.
For more see: