  • velislava777

Education data collection - is there a hard stop?

Is there a hard stop to data collection in education?

Today, within the contexts of Western societies, data collection in education varies widely. However, one common thread is that data collection continues to grow in granularity and in possibilities for exploitation (e.g., layering AI for inferencing and prediction; building data pipelines for cradle-to-career loops between industry labour demands and educational outputs). While much debate surrounds the data privacy and cybersecurity risks arising from the digitalisation of education, the question that escapes these debates is where education data collection and use should stop (and whether there should be a hard stop).

Data collection in education has increased tremendously in the past ten years, driven by advancements in technology and data analysis techniques, as well as a growing focus on using data to drive decision-making and improve student outcomes. In the US and the UK, traditionally, education data collection has mainly focused on basic student information (e.g. name, demographics), enrolment, attendance, and test scores. More recently, however, the granularity of data collection has increased to include detailed information on student demographics and backgrounds, learning and behaviour data (e.g. classroom performance, disciplinary incidents), information on teacher performance and training, parental involvement and engagement data, and communications, personal thoughts and feelings gathered from regularly administered student surveys.

In the UK, similarly, data collection from students has expanded to include attainment and progress data with breakdowns by subject, ability, and socioeconomic status; attendance data, including patterns and reasons for absence; data on the impact of interventions, such as special educational needs support; and information on school resources and funding allocation, parental income, educational background and special needs. In the classroom, the collection of in-app performance data and other such metadata has also been normalised.

EdTech products today offer intuitive digital dashboards which provide teachers with detailed analyses of student behaviour, attitude, and even self-reported socio-emotional state (see screenshots from Net Support, a platform claiming to be used by millions of students in the UK and internationally).

On one hand, this increased data collection allows for even more granular and continuous assessment, analytics and informed decision-making. On the other hand, it has created a panopticon in which children become increasingly hyper-visible in everything they do, say, or think.


There are two issues with this hyper-visibility. First, when something becomes known, it is also easier to control. The level of precision that granular data extraction offers for analysis, inferencing, profiling and prediction becomes the holy grail of persuasion. The use of data for mind control and moulding societies is a real issue and a risk to think about. Second, when everything is converted into data - be that a child's 'feelings' or attendance, grade, and socio-economic status (i.e., receiving free lunches at school) - a culture of labelling and performativity forms. According to research, labelling (and specifically diagnostic labelling) can have a further negative effect on the evaluation of students. Moreover, labelling can have a lasting impact on how children are treated and how they see themselves. This is not to disregard the cases where help and dedicated programs are needed. But the hyper-visibility that comes with granular data collection can lead to misreading the data and mistreating the situation and, crucially, to negative long-term impacts on children.

Race against the machine

A school becomes a lab where not only literacy is tackled but also problems well beyond the academic, because data is now collected beyond academic records and attendance. As one head teacher once told me: whenever there is a societal problem, the focus veers towards schools and new programs are developed so that we can tackle the newly-emerged problem. That has also attracted a tremendous number of technological ideas. Companies have come forward offering further data collection for analytics and to dig deeper for causes and problems.

And so, we witness a constant avalanche of programs and provisions on anything - from coding to cyber security to tackling radicalisation. If there are issues with cyber insecurities, programs for cyber awareness and cyber hygiene are developed. Data about it is collected and monitoring mechanisms are installed. More recently, the question has been whether to incorporate ChatGPT-like technologies in schools or to ban them. The debate for and against such tools has led, again, to more planning on how schools should tackle this and what educational programs should be put in place for students to learn about AI, natural language processing, reinforcement learning, machine learning and so on. Within that, more data collection is requested, such as data to detect whether students cheat by using ChatGPT-like tools for their homework. Having such dynamism in education certainly has its positives. On the other hand, this looks like a knee-jerk reaction and a tiring race to catch up with fast-changing technologies and propositions. Such racing leaves little chance for children to dive deep into any one domain. This further poses the question of what education is about, and whether the objectives of what children should learn have become a constantly moving target and therefore a giant distraction from more substantial and fundamental issues, such as the unscrutinised EdTech sector, the constant data collection that has normalised surveillance, and the risks relating to the loss of data privacy.

Risks for children

Making children hyper-visible through extensive data collection in schools can pose several risks, including:

  1. Privacy concerns: Children's sensitive information may be disclosed or used without their consent, leading to violations of their privacy rights.

  2. Bias and discrimination: The use of data analytics and algorithms in education can perpetuate existing biases and lead to discrimination against certain students based on their race, gender, socioeconomic status, or other factors.

  3. Stereotyping and labelling: Data collected in schools can be used to make assumptions or generalisations about students, which can result in them being labelled or stigmatised based on their academic or behavioural records.

  4. Lack of agency: Children have limited control over the data that is collected about them, which can have long-term consequences for their futures, such as their educational opportunities, employment prospects, and personal privacy.

  5. Unintended consequences (residual data harm): The use of data in education can have unintended consequences, such as increased standardisation, reduced creativity, and decreased student autonomy and agency.

It is important that data collection in schools be done in a responsible and ethical manner, with proper safeguards in place to protect children's privacy and ensure fair and unbiased treatment.

Taming (Ed) Tech and data collection

Data privacy of children can be viewed from a variety of approaches, including:

  1. Law: Children's data privacy is protected by laws and regulations, such as the Children's Online Privacy Protection Act (COPPA) in the United States, the General Data Protection Regulation (GDPR) in the European Union, and similar legislation in other countries.

  2. Ethics: Data privacy for children can also be viewed from an ethical perspective, considering the rights and well-being of children and the responsibilities of those who collect and process their data.

  3. Human rights: Children's data privacy can also be viewed as a human rights issue, with a focus on the protection of children's fundamental rights to privacy, dignity, and autonomy.

  4. Technology: Technical measures, such as encryption, anonymisation, and data minimisation, can be used to protect children's data privacy and security.

  5. Evidence: External assessment can be conducted to verify that the necessary legal, socio-ethical, privacy and human rights requirements, as well as appropriate design principles, are met.

Ultimately, however, it is the EdTech sector's responsibility to demonstrate that it meets the necessary requirements when it comes to data responsibility, ethical practices, and evidence of impact. Providing guidance and leaving the sector to regulate itself and do the right thing will not be enough.

An evaluation and certification system should be in place.



Mar 29

One of the primary concerns with extensive data collection in education, especially when leveraging tools like HubSpot data enrichment, is the privacy and security of students' personal information. There is a need to establish clear boundaries to protect sensitive data from unauthorized access, breaches, or misuse. Implementing robust security measures and adhering to strict privacy protocols is essential to ensure that platforms such as HubSpot data enrichment are used responsibly and ethically in educational settings.

Tony Stark
Mar 29

Not only protect it from a privacy point of view, but also from the risk that data manipulation will lead to poor decisions, inferencing and prediction that affect children negatively. Data, in other words, may still be protected (what much of industry says today!), but there is no protection from the processing and how it influences decisions.
