Can artificial intelligence make research more open?

The past decade revealed how open information exchange can dramatically accelerate scientific progress. The surge in sharing of preliminary results, datasets and protocols during the COVID-19 pandemic arguably hastened the development of vaccines, treatments and effective public health measures.

It was a key moment for open science, highlighting in practical terms how access to a diverse range of research outputs, not just the final article, fuels breakthroughs.

Yet, significant hurdles remain. Despite a proliferation of high-quality data repositories and an increasing number of funder and institutional mandates, many researchers still lack consistent guidance on how to share data in ways that add value – by aligning with FAIR (Findable, Accessible, Interoperable and Reusable) standards.

The existing maze of overlapping sharing policies, moreover, leaves authors unsure where, when and in what format to deposit their research materials. Adding to these practical challenges are substantial cultural barriers. Data sharing, code publication and detailed protocol documentation are yet to be fully recognised or rewarded in many academic circles.

A researcher-centric approach to AI

Emerging technologies (particularly those built on generative AI) could be part of the answer – helping resolve the bottlenecks that keep many results locked behind institutional walls.

AI is already reshaping the research ecosystem and has the potential to transform how we think about open science. Notably, it could help shift these practices from being another directive from funders or journals to being a carefully designed ‘product’ that aligns with how researchers work.

In turn, this requires shifting mindsets from top-down policy enforcement to a service-oriented approach that places researchers’ needs and goals at the centre.

In practice, adopting a product mindset begins with empathy for the realities of researchers’ day-to-day workflows. From data collection and experiment design to code development and protocol sharing, these steps are too often fragmented by time-consuming administrative tasks.

When open science is framed as a ‘policy imperative’, it can feel like a burden to already overstretched scholars.

At Springer Nature, we recently conducted a pilot study simply requiring authors to explain why any unshared data hadn’t been deposited in a public repository before final acceptance. This request alone raised data-sharing compliance from 51% to 87% in participating journals.

However, while such editorial engagement clearly works, scaling it across hundreds of titles poses a challenge if it depends solely on manual oversight.

This is where generative AI has promise. Automating metadata creation, flagging overlooked requirements and suggesting best-practice workflows can all free researchers to focus on discovery rather than documentation.
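
As a purely illustrative sketch of what this kind of automation might look like, the short Python example below assembles a DataCite-style metadata record from whatever details an author supplies and flags required fields that are still missing. The field names and the checking logic are assumptions made for illustration, not a description of any existing Springer Nature tool.

```python
# Illustrative sketch: build a DataCite-style metadata record from
# author-supplied details and flag required fields that are still missing.
# Field names follow the spirit of the DataCite schema but are assumptions,
# not the output of any real publisher or repository tool.

REQUIRED_FIELDS = ["title", "creators", "publisher", "publicationYear", "resourceType"]

def build_metadata(**supplied):
    """Merge author-supplied values into a metadata scaffold."""
    record = {field: supplied.get(field) for field in REQUIRED_FIELDS}
    record["descriptions"] = supplied.get("descriptions", [])  # optional free-text summary
    return record

def missing_requirements(record):
    """Return the required fields the author still needs to provide."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]

if __name__ == "__main__":
    draft = build_metadata(
        title="RNA-seq of example tissue under drug treatment",  # fictitious example
        creators=["Doe, J.", "Roe, R."],
        publicationYear=2024,
    )
    print("Still missing:", missing_requirements(draft))
    # prints: Still missing: ['publisher', 'resourceType']
```

In a real workflow, a generative model would draft the free-text descriptions and suggest values for the missing fields, with the author confirming everything before deposit.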

Crucially, these tools can connect researchers more directly with the benefits of openness (tracking citations of datasets, code usage or protocol adoption) to better reflect their full range of contributions.

We are currently conducting a pilot with authors on a small number of our open access journals to see whether generative AI can identify promising datasets buried in traditional articles and help transform them into data manuscripts.

Importantly, authors can then review and edit these drafts, ensuring the final text accurately represents their work and meets community standards. This human-in-the-loop approach is vital to ensure the accuracy and integrity of the generated content.

Ultimately, a researcher-centric approach to AI makes generative tools part of the process (rather than the whole process) and encourages and supports detailed documentation and better data stewardship.

Instead of viewing open science as ‘extra’, we can embed it into the infrastructure of academic work. This could ensure that openness, equity and innovation become the norm rather than the exception. Further, once a dataset is shared, it becomes more likely that protocols, code and supplementary assets will be similarly recognised.

Tools alone can’t deliver change

The process of research is currently grounded in a mutual exchange of trust: researchers receive support from institutions, funders and society, and they share the outcomes of their work in return. Until now, this workflow has largely revolved around publishing research articles.

However, in today’s interconnected and data-driven world, this model arguably no longer fully captures the multifaceted nature of modern research outputs. A framework that values only the final manuscript misses these broader contributions and stifles the culture of openness required to reap the benefits of collaboration, usability and re-usability.

As a sector we could better recognise, enable and support all components of the research lifecycle – as could research institutions through academic promotions, grant decisions and professional evaluations.

In scholarly communication, changing incentives always feels just over the horizon. But if we acknowledged researchers who share reproducible datasets, publish well-documented code or refine and disseminate experimental methods, we could build a system in which creating a high-quality dataset or a widely used software tool earns credit on a par with writing a journal article.

Technology remains crucial to this vision, providing the infrastructure for measuring and surfacing these contributions. As collaborative projects such as the DataCite initiative and our own AI pilot work demonstrate, AI can be deployed to detect references to specific datasets, code snippets or methodological protocols in publications.

This granular tracking and linking creates robust evidence for how shared research objects influence future work and broadens the scope of what can be cited, counted or rewarded.
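
As a hedged illustration of what such detection might involve at its simplest, the sketch below scans article text for candidate dataset identifiers (DOIs and GEO-style accession numbers) using regular expressions. The patterns and the example sentence are simplified assumptions; production systems would combine rules like these with trained models and curated identifier registries.

```python
# Illustrative sketch: a rule-based pass over article text that picks out
# candidate dataset identifiers. The patterns are deliberately simplified.

import re

PATTERNS = {
    "doi": re.compile(r'\b10\.\d{4,9}/[^\s"<>]+'),
    "geo_accession": re.compile(r'\bGSE\d{3,}\b'),
}

def find_dataset_mentions(text):
    """Return (identifier_type, matched_string) pairs found in the text."""
    mentions = []
    for label, pattern in PATTERNS.items():
        mentions.extend((label, match.group(0)) for match in pattern.finditer(text))
    return mentions

# Fictitious example sentence for demonstration only.
sample = ("Raw reads are available from GEO under accession GSE123456, and "
          "processed tables are archived at https://doi.org/10.5281/zenodo.1234567 (Zenodo).")
print(find_dataset_mentions(sample))
# prints: [('doi', '10.5281/zenodo.1234567'), ('geo_accession', 'GSE123456')]
```

Linking each match back to a persistent identifier registry is what turns these detections into citable, countable evidence of reuse.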

While citation metrics have dominated evaluation of research articles, AI-driven analytics might illuminate a broader range of markers for the value of underlying research components. Used carefully within university or funder evaluation systems, these metrics could help researchers who prioritise openness.

Equally important is reducing the practical friction of sharing.

AI-based platforms can guide researchers through publisher or funder requirements, auto-generate metadata and highlight relevant repositories. This saves time while encouraging thorough documentation and broad accessibility, strengthening the transparency and trust at the heart of any social contract.

In turn, once open science becomes less cumbersome and time-consuming, it is more likely to be embraced by researchers.

Realising the benefits of openness

When designed around researchers’ real-world needs, AI systems and tools can streamline data-sharing requirements, automate the tedious parts of compliance and elevate the visibility of otherwise hidden contributions.

This lowers the barriers to openness, making it simpler and more appealing to deposit data, code and protocols in ways that others can easily find and reuse.

Ultimately, this collaboration between human-centric policy and AI-driven facilitation benefits not only researchers, but everyone who relies on scientific progress. Policy-makers and practitioners have more data at their fingertips, driving more informed decisions; funders can ensure that investments lead to broadly accessible resources; and the public gains greater transparency into the research it helps to underwrite.

While current visions of AI in research often seem to include researchers as an afterthought, aligning technology and incentives under a broader view of scholarship can help the research ecosystem evolve into one that is open, equitable and primed for new discoveries.

Niki Scaplehorn is director, content innovation, at Springer Nature. A cell biologist and neuroscientist by training, Scaplehorn started his editorial career at Cell before moving to Nature Communications in 2012. He became chief life sciences editor in 2015, before taking on a portfolio of Nature Research journals in the life sciences in 2019. In 2023, he joined Springer Nature’s newly formed content innovation team, where he is applying his editorial experience to develop new tools to make communicating and sharing research faster and easier.

Henning Schönenberger is vice president, content innovation, at Springer Nature. He has a strong background in product management and digital innovation and a track record of early adopting artificial intelligence in scholarly publishing. He pioneered the first machine-generated research book published at Springer Nature.

This article was first published on the Impact of Social Science blog of the London School of Economics and Political Science. The article gives the views and opinions of the authors and does not reflect the views and opinions of the Impact of Social Science blog nor of the London School of Economics and Political Science.

This article is a commentary. Commentary articles are the opinion of the author and do not necessarily reflect the views of University World News.