On the road to Open Science

The release of the Open Data Portal is a significant milestone on the road to Open Science, but there is work ahead

20 November, 2014

This piece was written by Tim Smith and Sünje Dallmeier-Tiessen

The release of the Open Data Portal is a significant milestone, but the end of the road remains to be reached. Open Science represents much more than just the sum of “open” actions; it is an ideal, and for us here at CERN, a return to our roots.

CERN is the epitome of openness, which goes hand in hand with the collaborative nature of our frontier research. That openness is enshrined in our convention is not taken as an obligation; instead, it is an expression of the strength of our convictions. We helped build the Open Internet, were early adopters of Open Source, helped usher in the preprint culture, and we pioneer initiatives in Open Access to publications.

Science is predicated on the concept that the hypotheses that we propose to explain the phenomena that we observe can be tested through repeatable experiment. We should share sufficient details of our observations and conclusions for independent scrutiny, reproduction and verification. In this data-intensive age we have somewhat fallen short of this ideal since we have continued to “share” through publication processes which had no place for data, certainly not large volumes of it, nor the code that was needed to interpret it. Hence Open Science is striving to rebalance the processes and reintroduce data and code as first-class research objects to be shared, scrutinized and reused.

As we release the Open Data Portal today, we take a new step on the steady evolutionary path towards Open Science that we have been following these past years. This particular step, however, is new and evokes in many a feeling akin to that of a first parachute jump: thrill, fear or a mixture of both!

Recently in our field, a wide spectrum of initiatives has been opening up data and analysis code to a variety of audiences. Examples include HEPDATA, Rivet, Recast, the Master Classes and the recent Higgs Kaggle challenge, as well as numerous others. In launching the CERN Open Data Portal we are reinforcing these initiatives by providing a platform to expose, publish and archive data that come out of the CERN experimental programme, and to open them to ALL. To achieve this, the Open Data Portal assigns digital object identifiers (DOIs) to the data sets and code, making them citable objects in normal scientific communication, and offers the data openly for anyone to download, since they are published under a Creative Commons CC0 waiver. Thus the portal provides us with a building block for data management plans and a focal point for preservation actions.

Building the Open Data Portal has also been a prime example of the collaborative spirit that powers our discipline. The Open Data Portal is the culmination of a very close collaboration between digital library experts, data curators and metadata experts from IT and GS, together with data experts, researchers and outreach teams from the four LHC experiments. It also brings together two distinct threads we have been pioneering over the past years, namely digital libraries and (big) data management. It thus builds on years of investment in the Invenio digital library software, which powers CDS, INSPIRE, Zenodo and many more services worldwide.

Note, however, that these are data from our real collision events, so one should not underplay their complexity nor understate the time and effort that newcomers to our collaborations invest in learning the tools and techniques to interpret them. Along with the lower-level analysis object data in the portal, we are publishing high-level and reduced data sets and tools which, while easier to manipulate and appreciate, are still not entirely straightforward to interpret! So the “parachute” trepidation mentioned earlier concerns not so much our launch as the time ahead, when friends afar will access and tackle the data. We share openly, and we are interested to hear how and where these data are used, not only because we are curious but also because we need to understand how best to present our open data assets in forms that are usable now and in the future. The Open Data Portal launch is just the starting point – we hope many will take the opportunity to try. And in the months to come we will be working with experts in the experiments to add more tools and data to make the task easier and easier.
