Introduction to Research Data Management for students

Research data refers to all material that is collected, produced, and utilised in research.

RESEARCH DATA

Research data is a central part of the research process. Your research design, objectives, and methods determine what kind of data you collect and the type of data you produce (quantitative or qualitative). You analyse research data in order to answer your research questions. However, you may also stumble upon new perspectives and findings when analysing the data, in which case you might end up modifying your original research questions, hypotheses, or theories.

  • Research data includes, for example, interviews and surveys.
  • Research findings are based on the data, but the generation of the data itself can also be a significant outcome of the research!

PLANNING DATA MANAGEMENT

Planning data management is an important part of the research process and design.

By carefully planning how you will manage the research data for your thesis in advance, you ensure that:

  • You address all ethical, practical, and legal issues related to your data, such as the protection and privacy of personal data.
  • You follow good scientific practices and guidelines for research integrity in collecting and handling research data.
  • The data does not become altered, distorted, or compromised at any stage of the research process.
  • Data collection and processing are well-organised and thoroughly documented.
  • The data can be reused in future research projects, if needed.
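
One practical way to notice whether data files have been altered is to record a checksum for each file and compare the records later. The following is a minimal sketch in Python, not an official tool or a required method; the function names and the idea of keeping a "manifest" are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): file_checksum(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified between two manifests."""
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

You could build a manifest right after data collection, store it alongside your documentation, and rebuild it before analysis; any file name returned by `changed_files` deserves a closer look.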

Data Management Plan (DMP)

A data management plan supports your data management and complements your overall research plan.

Make sure to allocate enough time to create a data management plan before you start data collection.
The data management plan is a tool that helps you systematically address questions related to the collection, processing, storage, and potential reuse of research data.

Work on the plan together with your supervisor as part of your broader research planning process.

Even if a data management plan is not mandatory for you, it is still highly recommended to create one!

Core questions of a Data Management Plan

  • Description of the data and ensuring its quality
    • What kind of data is your research based on?
    • What type of data do you collect, produce, or reuse?
    • How will you ensure the quality, consistency, and integrity of your data?
  • Rights related to the data
    • Do you need to make agreements regarding the use of the data, for example, if you are using previously collected data?
    • Are there copyrights associated with the data?
    • Are you collecting data from an online platform whose terms of service may restrict its use in research?
  • Personal Data
    • What personal data will you be processing, and what kind of considerations are necessary when handling it?
    • How will you protect the research participants?
  • Research Ethics
    • What ethical challenges are associated with your research data?
  • Documentation and Metadata
    • How will you document and describe your data so that it is understandable to others and the data handling is systematic and smooth?
  • Data Security
    • Where can the data be stored, and how will you back it up during your thesis process?
    • What software and devices are safe to use for handling the data?
  • Reuse or Disposal of Data
    • Can parts of your data be archived or published for future use, and what do you need to take into account for this already in the planning phase?
    • If you dispose of the data, how will you do so securely?
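
The documentation and metadata question above can be made concrete by keeping a small, machine-readable description next to each dataset. The sketch below is one hypothetical approach in Python: the field names in `REQUIRED_FIELDS` are assumptions chosen for illustration, not a standard, and your discipline or data repository may require different ones.

```python
import json

# Hypothetical minimal field set; adapt it to your field or repository.
REQUIRED_FIELDS = (
    "title",
    "creator",
    "description",
    "collected",
    "file_format",
    "terms_of_use",
)


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or blank in the record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


def to_sidecar(record: dict) -> str:
    """Serialise the record as readable JSON, refusing incomplete records."""
    missing = missing_fields(record)
    if missing:
        raise ValueError(f"Missing metadata fields: {', '.join(missing)}")
    return json.dumps(record, indent=2, ensure_ascii=False, sort_keys=True)
```

Writing the result of `to_sidecar` to a file such as `metadata.json` (an assumed name) in the same folder as the data keeps the description and the data together.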

RESEARCH INTEGRITY

Responsible and reliable scientific research is ethically sustainable, as open as possible, and it adheres to the Finnish Code of Conduct for Research Integrity.

Openness and transparency are key principles of research integrity. For readers to assess your findings, it must be clear to them how the research was conducted. Therefore, explain clearly in your thesis what research methods you used, how you collected your data, and how you analysed it.

To report your research results openly and transparently, you must document your entire research process. Keep track of what you are doing and what you have agreed upon with research participants or collaborators. Store this information in an organised manner, so you can refer back to it later if you need to.

Source: Responsible Research

CHOOSING THE TOPIC OF YOUR THESIS

When doing your Bachelor's or Master's thesis, do not work with research data whose collection, processing, or analysis could put you or your research participants in danger or involve unreasonable risks.

Also consider carefully whether the data is so ethically demanding that the related ethical issues cannot be adequately addressed within the scope of a thesis. Discuss these matters with your supervisor.

REUSING ARCHIVED OR PUBLISHED DATA

You don't necessarily have to collect the research data yourself; you can also use previously collected datasets.

  • Pre-existing datasets are data collected for earlier research and made available for reuse by others.
    These datasets are often suitable for conducting multiple studies from different perspectives.
  • Such data can give you access to material that would be impossible to collect within the scope of a single thesis, such as data gathered for longitudinal studies or international comparative research.
  • Reusing data is also resource-efficient. Producing datasets requires a lot of work, and if a research project has collected a large dataset, it makes sense for it to be used in multiple studies.

Find archived data through a service called Etsin

Students may also have access to:

  • Data from an ongoing research project
  • Data produced by the student themselves, collected as part of a research project
  • Parts of data previously collected and produced at one's own department

Many sections of the data management plan primarily concern the collection of new data.

Reusing existing data is often easier, as most data management issues have already been addressed during the original data collection and storage.

Nevertheless, even if you are working with existing data, planning your data management is still important. Familiarise yourself with the decisions that have already been made and with the terms of use for the data, and take both into account in your own data handling. You can read more about terms of use in the learning material section: Rights and Copyright.

Instructions for citing pre-existing data:

How to cite data

FAIR PRINCIPLES

For research data, you put the principles of responsible science into practice by following the FAIR principles as closely as possible. FAIR stands for:

  • Findable
  • Accessible (available)
  • Interoperable
  • Reusable

The FAIR principles make data easier to publish, but they were not created solely for that purpose. Following the FAIR principles and responsible data management practices improves the quality of research: it becomes more reproducible, verifiable, and structured, and easier to report.

Follow the principles as far as your circumstances allow.

  • It is worth striving to follow the FAIR principles, but it's also important to recognise that they represent an ideal. A dataset may meet only some of the principles and still be valuable.
  • Some principles may be hard to meet depending on the type of data. For example, the interoperability principle emphasises machine-readability, which may not be feasible for all types of data.
  • It is the responsibility of the researcher (and student) to be aware of the principles, but the primary focus should be on collecting data in a systematic and well-documented manner.

If you plan on publishing your data in JYX Datasets, the Open Science Centre will support you in addressing the FAIR principles.

CHECKLIST

  • When choosing a thesis topic, consider risks, ethical challenges, and the scope of the topic.
    • Your thesis topic should be appropriately focused and feasible.
  • Select your research data based on your research questions, methods, and objectives.
  • Include data management planning as part of your overall research design.
  • Create a data management plan to complement your research plan, regardless of whether you are collecting the data yourself or using existing data. With the help of the plan, ensure that:
    • Your research follows good scientific practice, the ethical principles binding the research community (the guideline on research integrity), the University of Jyväskylä’s data management guidelines, and relevant legislation.
    • Your data management appropriately considers the FAIR principles.
    • Your research process, including data management, is well-organised and thoroughly documented, allowing you to report it openly and transparently.