Case Study

Using DataSeer’s NLP to pre-fill the MDAR checklist lowered friction for authors while improving the quality of reporting. Instead of treating compliance as a manual, after-the-fact task, the checklist was embedded into the workflow: authors shifted from generating responses to validating them. To reach high throughput, Science will need to automate the process further with DataSeer. The goal is for well-designed automation to enhance the author experience and raise reporting standards at the same time, rather than forcing a trade-off between the two. This is especially important for levelling the playing field for authors with differing resources.

The Open Science Indicators (OSI) analysis moved beyond policy enforcement to genuine evaluation. By quantifying data, code, and protocol sharing across 2,680 articles, the pilot revealed both strengths (e.g., universal data availability statements, strong overall data-sharing rates) and specific gaps (e.g., limited sharing of plotting data and executable code). This kind of portfolio-level visibility lets publishers identify where policies are working and where they need to evolve.

Crucially, the pilot didn’t end with insights; it informed concrete policy changes. Science is strengthening its requirements around sharing the data underlying figures and aims to expand automated MDAR checks across its journals. This illustrates a broader takeaway: pilots should be designed not just to test tools, but to generate evidence that drives editorial policy. Iterative, data-informed policy development helps publishers stay credible with researchers while meaningfully advancing reproducibility.