Social media data
Social media data often raises data privacy, contract law, and copyright issues, and its use is subject to related restrictions.
RESEARCH ETHICS
Utilising social media as research material inevitably involves research ethics questions, challenges, and risks.
As a researcher, your task is to identify the ethical issues and challenges related to social media data and to minimise and manage risks.
Consider the following questions:
- Social media content is generally not intended for research purposes.
- What risks or harm could arise for individuals from analysing such content?
- What risks or harm could result from processing personal data contained in social media material?
- Could harm occur to you or the individuals if their social media posts are analysed and highlighted in a public thesis, thereby exposing the posts to a new audience?
- Can you include direct quotes from the material to support your analysis, knowing that the original post and its author may be easily found via search engines?
- Do the posts contain special categories of personal data or other highly sensitive information?
- Examples of highly sensitive information include: criminal convictions, drug use, financial problems, mental health issues, controversial political opinions and activism.
- If such information is present, what harm or damage could occur to individuals from being subjects of your research?
- Avoiding social media material that contains sensitive topics or data is recommended, as such research may involve significant ethical challenges that cannot be resolved within the scope of a thesis.
These questions, challenges, and risks exist regardless of whether the material is publicly available online or not.
- Please note that researching closed groups likely involves such extensive ethical challenges that they are not suitable as thesis material for Bachelor's/Master's students.
- Even in the case of open groups, it is important to verify whether the platform allows the use of its content for research purposes.
- You must also address other questions related to ethics, personal data, and informing research subjects.
Using social media data often requires preparing a Data Protection Impact Assessment (DPIA).
A DPIA is required especially when the research involves:
- Large-scale processing of data
- Combining or linking datasets
- Collecting special categories of personal data or highly personal information
- Situations where research subjects cannot be informed
More details on the DPIA can be found in the learning material section: Personal data.
- Ethical issues related to social media data are discussed, for example, in the EU guideline Ethics in Social Sciences and Humanities.
SOCIAL MEDIA CONTENT AND PERSONAL DATA
Social media content almost always includes (direct and/or indirect) identifiers. In other words, you will be collecting and processing personal data.
Additionally, you must remember that individuals have the right to know that they are the subject of your research!
Informing and consent
According to the current guidelines of the University of Jyväskylä, you must inform the individuals whose personal data you process and request their consent to participate, even when the data is collected from social media.
- The guidelines are currently being updated, and the instructions on this page will be revised as soon as updates are available.
Inform
- For example, if you are studying an influencer’s social media channel, send them a privacy notice, research notification, and consent form.
- If it is impossible or unreasonably burdensome to contact the individual and inform them personally, make the documents available by other means:
- Publish the privacy notice.
- Every JYU student has access to a personal JYU webpage and to the university’s SharePoint, either of which can be used for this purpose.
- If you are studying comments on a public social media channel or messages on a discussion forum, link the published privacy notice and research notification in the comments section or message thread.
- State that you are analysing the comments/messages for your thesis.
- Indicate when you will start and stop monitoring the discussion.
- Do not start monitoring immediately - give the participants time to delete their messages if they wish.
- Include your contact details so that participants can reach you later on and request removal of their messages from the dataset.
Ensure consent
- For example, if you are studying a specific influencer’s social media channel, also send them the consent form together with the privacy notice and research notification. You need their response confirming consent.
- In some cases, obtaining consent from all participants is impossible: for instance, when studying the comment section of a social media channel where not all commenters can be reached.
- In such cases, the principle of informed consent for participation can be waived. However, if informed consent is waived, an ethical review is required.
- You can find more information about ethical review in the learning material section: Personal data.
- Discuss with your supervisor whether an ethical review is necessary.
- Familiarise yourself with TENK’s guidelines on the six criteria for ethical review in human sciences.
- If your research meets even one of these criteria, initiate the University of Jyväskylä’s ethical review process with your thesis supervisor.
- Before starting the process, contact the university’s Ethics Committee for Human Sciences: ethics-committee@jyu.fi.
These guidelines also apply to social media accounts of public figures, such as politicians, or company accounts.
Researchers also have the right to avoid harm directed at themselves. While it is natural and acceptable to receive feedback on research, social media can involve various forms of harassment, and feedback may be inappropriate. Researchers can even become targets of online harassment.
Therefore, it is worth considering whether your research topic is in any way sensitive or controversial. For example, if you were studying a Facebook discussion related to immigration or gender equality, you should reflect on whether informing participants could lead to harassment directed at you.
If you decide not to inform participants and/or not to obtain their consent, your research must undergo an ethical review.
Legal basis for personal data processing
A privacy notice must always state the legal basis for processing personal data.
- Information about legal basis can be found in the learning material section: Personal data.
- You can also read about legal basis in the University of Jyväskylä’s Instructions for students, under the section Legal basis for personal data processing and consent.
Legal basis
- If your thesis is a scientific work, the legal basis is public interest.
- If it is a non-scientific work, the legal basis is usually consent.
- Remember that consent to participate in research and consent as a legal basis for processing personal data are two different things.
- If you collect data from social media and it is impossible to obtain consent from participants to take part in the research, then it is equally impossible to ensure their active consent for processing personal data.
- If neither public interest nor consent can be used as a legal basis, the remaining option is legitimate interest.
- Using legitimate interest as a legal basis requires a balancing test.
- However, legitimate interest cannot be used as a legal basis if social media data is collected directly from individuals, for example, by asking them questions in a discussion forum.
Tip!
Are you interested in studying online discussions but want to avoid handling personal data from a GDPR perspective? Here are some Finnish-language options:
- Posts from the Suomi24 discussion forum are available via Kielipankki (Language Bank of Finland).
- Kielipankki is an archive that stores various language datasets.
- Some datasets are publicly available, while others require academic login.
- Certain dataset versions may contain personal data - this is clearly indicated in the license.
- The license may include data protection conditions, and you must provide Kielipankki with a link to your project’s published privacy notice.
- Comments on newspaper and magazine articles are classified as opinion "letters to the editor", meaning they are considered part of the published article and do not constitute personal data processing.
- Thus, comments published on the official online platforms or apps of newspapers and magazines are treated as part of the article.
TERMS OF SERVICE AND COPYRIGHT
Review the social media platform's Terms of Service / Terms of Use.
- The terms may change, so always check the latest version.
- Find out what the terms say about, for example:
- Downloading and storing content
- Publicity or public availability of content (not all material found online is public!)
- Content sharing
- Copyright
- Automated data collection (scraping)
Exception for Text and Data Mining
- Section 13b of the Finnish Copyright Act allows for the reproduction and storage of copies of a work for the purpose of text and data mining. Under this exception, it may in certain cases be possible to deviate from the terms of use of a social media platform.
- “Text and data mining refers [...] to an automated analysis technique aimed at analysing text and data in digital form to generate information.” (Translated from: Tekijänoikeuden tiedotus- ja valvontakeskus ry)
- Mining may be carried out unless the authors have reserved this right. If data mining is conducted for scientific research within a research or cultural heritage institution, rights holders cannot prohibit the mining.
- A prerequisite for data mining is lawful access to the work.
- If you are considering not complying with the platform’s terms of use based on this exception, consult your supervisor to ensure that your research methods meet the criteria of the exception.
Texts written on social media may exceed the threshold of originality. In addition, copyright protects images uploaded to social media.
- According to Section 25 of the Finnish Copyright Act, it is permissible to cite an image that has been published, i.e., lawfully made available to the public. Thus, from a copyright perspective, image citation allows the use of a legally published work in a thesis.
- For example, images from companies’ public social media accounts can generally be considered lawfully published.
- Images from public social media accounts of public figures fall into a more ambiguous area. So far, there is no legal precedent addressing the right to cite such images in theses or scientific research.
- Images from private individuals’ private social media accounts cannot be cited. In such cases, copyright issues intersect with ethical and data protection considerations. In principle, when you request consent to participate in research, or when consent serves as the legal basis for processing personal data, you may request permission to use images in the thesis at the same time.
You can find more information about terms of use and copyright in the learning material section: Rights and copyright.
I am researching videos on a YouTube channel. Videos are protected by copyright. What does this mean for my data?
You can cite videos in the same way as you would cite a text or an image. Screenshots: in general, you can take screenshots because they can be considered image quotations. If you use an image in your thesis, it must be substantively related to the text and form part of the analysis.
YouTube's current terms of use prohibit, for example, downloading videos to one's own computer. What does this mean for my data?
You cannot save the data for yourself. Your data only exists on YouTube, and if the creator of the videos were to delete the videos for any reason, you would lose your data. If a single video were removed, the data would no longer be intact. This in itself does not prevent research, but it is good to be aware of this risk.
Reproducibility is part of the essence of scientific research: research must be reproducible, and it cannot be reproduced if the data no longer exists in its original form.
CHECKLIST
- Identify research ethics questions, challenges, and risks related to social media material.
- Create a plan for how you will minimise and manage these risks.
- Consider how the chosen legal basis for processing personal data affects the collection of social media data.
- Inform participants and ensure their consent to participate in the research.
- Determine whether your research requires a Data Protection Impact Assessment (DPIA).
- Determine whether your research requires an ethical review.
- Check the terms of service and usage policies of the social media platforms.
- Reflect on how these terms affect:
- Data collection
- Data quality
- Research methods