Big data research, notes a new report, is resource intensive “in both obvious and less immediately apparent ways,” already absorbing financial capital, human capital and electricity at sometimes prodigious scale.
So what does tomorrow hold?
“As the ready availability of data grows, and as incentive structures push research towards big data, the demands on research offices, university libraries, high-performance computing centers, graduate programs, individual labs, and other university units seem poised to accelerate. Supporting this research is now central to the mission of research universities. Assessing the efficacy of existing infrastructures and identifying key needs of researchers is essential as universities develop plans to support big data research over the long term.”
Given all that, the report from Ithaka S+R asks how academic institutions can best support big data research.
To find out, the research arm of the ITHAKA not-for-profit joined with librarians from 21 colleges and universities in the United States to interview more than 200 faculty members. (The report includes local reports from 15 partnering institutions.) The synthesis of their input, “Big Data Infrastructure at the Crossroads: Support Needs and Challenges for Universities,” grouped their findings into six areas:
- Tension and interplay between disciplinary and interdisciplinary perspectives
- Managing complex data
- Structures for collaboration
- Sharing knowledge
- Ethical challenges
- Support and training
Looking just at one of those areas – how different disciplines share the big data commons – illustrates how the real-world focus of the report meshes with its understanding of broader theoretical contexts. This excerpt, for example, addresses a dynamic familiar in the social sciences:
“The resulting tensions may be especially acute in fields oriented towards qualitative research, whose disciplinary ways of knowing risk further marginalization in already hostile academic climates as the normativity of quantitatively-focused big data research grows. Many of those fields have access to fewer financial resources to support data-intensive research. Faculty from the humanities and qualitative social sciences reported feeling particularly burdened by the necessarily cross-disciplinary features of big data research. Because they continued to be evaluated under disciplinary standards that don’t make room for data science or computer science expertise, it was especially difficult for them to justify investing the time to develop those skills. Moreover, big data research projects in these fields often developed without robust grant funding and/or staff support, making barriers to entry especially high, and the temptation to outsource the analytical parts of their work potentially acute.”
The report’s recommendations focus on organizations, not individuals. Specific calls for action target university research offices, departments, libraries, funders, scholarly societies and vendors, and range from the broad (for funders to “continue to support the robust development of data repositories” or departments to “invest in strategic hires designed to further embed data science, data management, statistical, and computational staff to provide researchers with relevant expertise to assist in big data research”) to the fairly detailed (for vendors to “enhance metadata of subscription databases” or libraries to “develop staff expertise in metadata creation, data curation, and data management, as well as data analytics and data visualization”).
The report lists as co-authors all those who participated in local projects, but the synthesis report was written by Ithaka S+R analyst Dylan Ruediger, associate director Danielle Cooper, and analyst Darnell Epps. In a blog post accompanying the report’s release, Ruediger noted the very real-world impediments organizations create in the big data space. “Unsurprisingly,” he writes, “on many campuses, the infrastructure to support big data research is highly fragmented, creating barriers to economies of scale and to the efficacy of support provided by departments and research centers, libraries, computing centers, and IT and information professionals. These challenges are exacerbated by the fact that supporting big data research on campus requires coordination with actors beyond the university but central to the research ecosystem.”
Ruediger added that this report, which is available for free and carries a Creative Commons license, is part of a series of Ithaka S+R research projects exploring data-intensive research communities and instructional practices with data. Next year they plan to publish a report on “Teaching with Data in the Social Sciences” based on another cohort-based project.
Ithaka S+R is part of ITHAKA, a not-for-profit organization helping the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways.