What are research data?
Research data, as defined by OECD in its 2017 report, are "factual records (numbers, texts, images and sounds), which are used as main sources for scientific research and are generally recognized by the scientific community as necessary to validate research results. A research dataset is a systematic and partial representation of the subject being researched.
The term does not apply to: laboratory notebooks, preliminary analyses, drafts of scientific documents, future work programs, peer review, personal communication with colleagues and physical objects."
There are several types of research data depending on how the data is produced and its assumed value (INIST, 2014):
- Observational data: captured in real time; usually unique and therefore impossible to replicate (e.g. survey data);
- Experimental data: obtained from laboratory equipment; often reproducible, but sometimes expensive (e.g. chromatograms);
- Computational data from models or simulations (e.g. seismic simulation model);
- Derived or compiled data (e.g. text mining);
- Reference data (e.g. crystallography database).
Why manage and share your data?
Management of research data meets the following objectives (CoopIST, 2015):
- it increases research efficiency by facilitating data access and analysis, whether by the original researcher or by any new researcher;
- it ensures research continuity through data reuse and prevents duplicating efforts;
- it promotes wider dissemination of research and increases its impact: research data, when properly formatted, described and identified, will keep long-term value;
- it ensures research integrity and result validation. Accurate and comprehensive research data also make it possible to reconstruct the events and processes that led to the results;
- it reduces the risk of loss and strengthens data security through the use of robust and responsive storage devices;
- it follows today’s publishing developments: scientific journals increasingly ask that the data underlying a published article be shared and deposited in an available data repository. As a result, research data management makes it easier to submit articles based on documented datasets to scientific journals;
- it meets funders’ requirements for project funding: funders pay more attention to what researchers do with the data relating to a project, and they often make their funding conditional on such data being freely and openly accessible;
- it testifies to your commitment: by managing your research data and making it available, you demonstrate your responsible use of public research funding.
To go further, get informed and train on DoRANum: Issues and Benefits
Data Management Plan (DMP)
A Data Management Plan (DMP) is a management tool. It takes the form of a structured document with headings. It summarizes the description and evolution of your research project's datasets, and it provides a basis for sharing, reusing and sustaining the data.
Since July 2016, all H2020 projects have had to provide a DMP. Program requirements are detailed here; some of them are listed below:
- At proposal stage, a brief description of the DMP must be provided;
- A first version of the DMP (deliverable) is due within the first 6 months of the project;
- H2020 encourages publishing an updated DMP halfway through the project (mid-term review DMP);
- H2020 requires at least one new version of the DMP, with the necessary updates, at the end of the project (final review DMP);
- The minimum requirements (first version of the DMP) are: a description of the data that are being generated or collected; the standards and metadata that will be used; data sharing; archiving and preservation, in line with the FAIR principles.
DMPs are increasingly encouraged or required by supervisory authorities and funding agencies. On the Sherpa/Juliet website, you can learn about funders' requirements regarding the management and openness of scientific outputs.
FAIR principles (Findability, Accessibility, Interoperability, Reusability):
- Findability: descriptive metadata; permanent identifiers;
- Accessibility: appropriate authorization; well-defined protocol;
- Interoperability: open formats; common standards; consistent vocabulary;
- Reusability: clear rights; appropriate license
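The checklist above can be turned into a simple self-check on a dataset's metadata record. The sketch below is only an illustration: the field names (`identifier`, `access_protocol`, `license`, and so on) and the example record are assumptions made for this example, not a standard schema or the requirements of any particular repository.

```python
# Hypothetical mapping from each FAIR principle to metadata fields that
# would evidence it. The field names are illustrative assumptions.
FAIR_FIELDS = {
    "Findability": ["title", "description", "identifier"],
    "Accessibility": ["access_rights", "access_protocol"],
    "Interoperability": ["file_format", "metadata_standard", "vocabulary"],
    "Reusability": ["rights_holder", "license"],
}


def missing_fair_fields(record):
    """Return, per principle, the fields that are absent or empty in record."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [field for field in fields if not record.get(field)]
        if missing:
            gaps[principle] = missing
    return gaps


# Made-up record used only to exercise the check.
example_record = {
    "title": "Soil moisture measurements, plot A",
    "description": "Daily soil moisture readings.",
    "identifier": "",  # no permanent identifier assigned yet
    "access_rights": "open",
    "access_protocol": "HTTPS",
    "file_format": "CSV",
    "metadata_standard": "Dublin Core",
    "vocabulary": "AGROVOC",
    "rights_holder": "Example Institute",
    # "license" is deliberately left out
}

for principle, fields in missing_fair_fields(example_record).items():
    print(f"{principle}: missing {', '.join(fields)}")
```

A check like this only confirms that the fields are filled in; whether an identifier is truly permanent or a license truly appropriate still needs human judgment.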
A DMP should answer the following questions (INIST, 2014):
- Which data will be collected or generated during the project?
- Who will be responsible for each step of management?
- Which policy will apply to the data: that of the funding agencies or that of the institution ...?
- How will data and files be organized?
- How will data be described (documentation and metadata standards)?
- How and where will data be stored, backed up and secured?
- How will data be shared? Intellectual property? Reuse license?
- How will data be sustained in the long run?
- What costs and resources will data management and sharing require?
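Since a DMP is a structured document with headings, the questions above can serve as its skeleton. The following is a minimal sketch under stated assumptions: the section keys and heading wording are hypothetical paraphrases of the questions, and the output is not the template of any funder or of DMP OPIDoR.

```python
# Headings paraphrase the questions listed above; the short keys are
# hypothetical names chosen for this sketch.
DMP_SECTIONS = [
    ("data_collected", "Data collected or generated"),
    ("responsibilities", "Responsibilities for each management step"),
    ("policy", "Applicable data policy"),
    ("organization", "Organization of data and files"),
    ("description", "Documentation and metadata standards"),
    ("storage", "Storage, backup and security"),
    ("sharing", "Sharing, intellectual property and reuse license"),
    ("preservation", "Long-term preservation"),
    ("costs", "Costs and resources"),
]

PLACEHOLDER = "[to be completed]"


def render_dmp(answers):
    """Build a plain-text DMP outline; unanswered sections get a placeholder."""
    lines = []
    for number, (key, heading) in enumerate(DMP_SECTIONS, start=1):
        lines.append(f"{number}. {heading}")
        lines.append(f"   {answers.get(key) or PLACEHOLDER}")
    return "\n".join(lines)


def unanswered(answers):
    """List the headings that still have no answer."""
    return [heading for key, heading in DMP_SECTIONS if not answers.get(key)]


# Made-up draft answers used only to exercise the functions.
draft = {
    "data_collected": "Survey responses and interview transcripts.",
    "storage": "Institutional server with daily backups.",
}

print(render_dmp(draft))
print("Still open:", len(unanswered(draft)), "sections")
```

Keeping the answers in a dictionary makes it easy to produce updated versions of the plan as the project evolves, since only the changed sections need to be edited.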
There are online tools for creating a DMP, such as DMP OPIDoR (OPIDoR = Optimization of Sharing and Interoperability of Research Data).
DMP OPIDoR lets you and your partners draw up a data management plan from various templates recommended by institutions and funders (European Commission ...), and it provides guides and customized examples.
To go further, get informed and train on DoRANum: Data Management Plan
Where to publish? The choice of data repository
A repository makes it possible to store research data, access it and reuse it. There are thousands of repositories divided into several types: disciplinary, multidisciplinary, specific to a publisher, institutional, specific to a research project ...
There are directories (or registries) that can help you narrow your repository search: re3data, OAD, OpenDOAR, etc.
Some examples of data repositories:
- Thematic: GenBank (DNA sequences), UniProt (proteins);
- Disciplinary: PANGAEA (earth and environmental sciences), Réseau Quetelet (social sciences);
- Multidisciplinary: Figshare, Zenodo (Europe; created by CERN as part of the European OpenAIRE project), DRYAD;
- Institutional: Edinburgh DataShare (United Kingdom), Open Data LMU (Germany), Merritt (USA);
- Specific research project: Scientific Drilling Database (International Continental Scientific Drilling Program, ICDP).
To go further, get informed and train on DoRANum: Deposit – Repository
How to publish a Data Paper
There are several ways to publish your data. Data papers are published in much the same way as conventional papers. Here is how a data paper can be published.
The deposit of data in 5 questions
Why deposit your scientific data? Which data to deposit? When should it be done? In which repository? How to proceed? Some answers in this video.
What is a data repository?
One way to share research data is to deposit it in a data repository. What exactly is a data repository? What are its features? How do you find the right repository?
What is a Data Management Plan?
The data management plan is a management tool. It takes the form of a structured document with headings. It aims to summarize the description and evolution of the datasets of your research project. It prepares the ground for data sharing, reuse and sustainability.