Session Submission Summary

Optimizing prediction to strengthen education systems

Thu, April 29, 1:45 to 3:15pm PDT, Zoom Room, 125

Group Submission Type: Formal Panel Session

Proposal

Educational researchers are increasingly combining innovative models, data processing techniques, and applications of artificial intelligence with human judgement to understand how to improve learning and ensure that children progress through school. While this approach has become more pervasive in high-income systems, it is only just appearing on the radar of policy makers in low-income systems.

For many, the key goal of these models is to improve the use of data in order to make decisions, allocate scarce resources, predict trends, improve targeting, and distribute funding more equitably. These models are expected to improve on the descriptive statistics, retrospective analysis, and other types of evidence that currently underpin policy makers’ decisions.

Predictive analytics have gained popularity in a number of promising areas. These include improving the efficiency of teacher allocation by incorporating teacher preferences into deployment strategies and matching expert teachers with schools; predicting which teachers will have the most impact on student test scores; introducing early warning systems for students at risk of dropping out of school or of exhibiting violent behaviours, so that interventions can be made; and tailoring personalised learning.

Changing the way data is used, by incorporating more sophisticated predictive techniques, is the main force behind these creative solutions to education problems. This panel will present cutting-edge techniques (machine learning and Markov chains) for predicting enrolment, non-enrolment and dropout in three different contexts, outlined as follows:

- Using a Markov chain model to estimate the impact of COVID-19 on the Rwandan education system
- Using machine learning at the community level to improve education intervention targeting by predicting non-enrolment of girls in India
- Using machine learning at the child level to predict the children most at risk of not entering or dropping out of education in Sierra Leone

Markov chains are mathematical models that describe a sequence of events in which the probability of each future event depends only on the state reached in the preceding one. Such models have been used in fields as varied as economics, finance, and even music. Drawing on existing data, they can be used to build evidence-based models that support planning and decision-making for large systems.
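To make the idea concrete, the following is a minimal sketch of a Markov chain applied to a cohort moving through school. It is not the model used by any of the panel's presentations: the states, the three-grade structure, and every transition probability below are hypothetical values chosen purely for illustration.

```python
# Hypothetical states for a simplified three-grade system (illustration only).
STATES = ["grade1", "grade2", "grade3", "completed", "dropped_out"]

# TRANSITIONS[s][t] is the assumed probability that a child in state s this
# year is in state t next year. All figures are invented, not estimates.
TRANSITIONS = {
    "grade1": {"grade1": 0.10, "grade2": 0.85, "dropped_out": 0.05},
    "grade2": {"grade2": 0.10, "grade3": 0.83, "dropped_out": 0.07},
    "grade3": {"grade3": 0.08, "completed": 0.82, "dropped_out": 0.10},
    "completed": {"completed": 1.0},      # absorbing state
    "dropped_out": {"dropped_out": 1.0},  # absorbing state
}


def step(counts):
    """Advance the expected headcount in each state by one school year."""
    nxt = {state: 0.0 for state in STATES}
    for state, n in counts.items():
        for target, p in TRANSITIONS[state].items():
            nxt[target] += n * p
    return nxt


def project(counts, years):
    """Apply the one-year transition repeatedly to project a cohort forward."""
    for _ in range(years):
        counts = step(counts)
    return counts


# A hypothetical cohort of 1,000 children entering grade 1.
start = {state: 0.0 for state in STATES}
start["grade1"] = 1000.0

for state, n in project(start, 5).items():
    print(f"{state}: {n:.1f}")
```

Under these assumptions, a system-level shock could be explored by rerunning the projection with an altered transition table (for example, a higher dropout probability in a given year) and comparing the two sets of results.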

Machine learning (ML) is a branch of artificial intelligence that gives computer systems the ability to learn iteratively from data and to identify patterns in it without those patterns being pre-specified by a programmer. Machine learning algorithms are increasingly being deployed to help implementers decide where to target their programs. In certain situations, these tools can dramatically increase the cost-effectiveness of programs by helping implementers reach more eligible recipients than they otherwise would.
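As a rough illustration of how a learned model can improve targeting, the sketch below trains a small logistic regression on entirely synthetic community data and uses it to choose which communities to visit. The two features (a standardized distance-to-school measure and a standardized poverty index), the data-generating rule, and the budget of 50 communities are all assumptions made for this example and do not come from any of the studies presented.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp never overflows.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def make_communities(n, rng):
    """Generate synthetic communities: two features and a 0/1 non-enrolment label."""
    rows = []
    for _ in range(n):
        distance = rng.gauss(0, 1)  # hypothetical standardized distance to school
        poverty = rng.gauss(0, 1)   # hypothetical standardized poverty index
        p = sigmoid(1.5 * distance + 1.0 * poverty)  # assumed "true" relationship
        label = 1 if rng.random() < p else 0
        rows.append(((distance, poverty), label))
    return rows


def train(rows, epochs=300, lr=0.5):
    """Fit logistic regression weights by full-batch gradient descent."""
    w, b, n = [0.0, 0.0], 0.0, len(rows)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in rows:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def score(model, x):
    """Predicted probability of non-enrolment for one community."""
    w, b = model
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b)


rng = random.Random(0)
train_rows = make_communities(400, rng)  # communities with known outcomes
new_rows = make_communities(400, rng)    # communities to be targeted

model = train(train_rows)
k = 50  # hypothetical budget: only 50 communities can be visited

ranked = sorted(new_rows, key=lambda r: score(model, r[0]), reverse=True)
reached_model = sum(label for _, label in ranked[:k])
base_rate = sum(label for _, label in new_rows) / len(new_rows)
expected_random = k * base_rate  # expected reach if communities were picked at random

print(f"reached with model: {reached_model}, expected at random: {expected_random:.1f}")
```

The weights are learned from the labelled communities rather than set by hand, and the comparison against random selection is one simple way to express the cost-effectiveness gain described above. Real data would be far noisier than this synthetic example.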

Panellists will present these topics, highlight existing limitations, and discuss what needs to change in the way we collect and utilize data in the education sector so that these techniques can be leveraged to improve education. We believe that there is a gap between the educational researchers developing ground-breaking models and the policymakers expected to deploy them. Our panel aims to help connect these two communities.

This panel conceptualizes social responsibility within three broad areas of inquiry:

- How do we make better use of existing data and how do we work collaboratively to improve the quality of the data collected? We will address this by recognising who is involved in the use of data, what barriers to data utilisation exist for a variety of stakeholders, and how we can leverage data to formulate localised solutions.

- We need more, better, more diverse, and more comprehensive data. But can less be more when it comes to data? Collecting larger, more frequent, and more complex datasets, including from new sources, increases the likelihood of optimizing prediction and making better decisions.

In line with these needs, the field of education increasingly draws on data from multiple sources: not only student demographics, but also teacher demographics, performance data, and data obtained from schools and communities, which are often mapped using geographic information systems (GIS).

As new data becomes available, the scope for improving ML-based tools increases as well. The performance of an ML-based tool varies over time as predictions are made for new populations and as people adapt their behaviour to the model. Algorithms must therefore be continuously assessed and recalibrated to maintain and improve performance.

However, more data of poor quality can lead to poor decision-making, and in some instances streamlining data collection is the better option.

- How can researchers reduce bias in training data and maintain transparency in education?

Besides standard checks on the performance and portability of algorithms to new populations, researchers must assess whether algorithms systematically exclude vulnerable groups. Questions such as ‘Who is represented in the data?’, ‘Who is excluded?’, ‘How are we capturing social constructs?’ should be pursued and addressed. It is often necessary to look under the hood of these algorithms and make adjustments to ensure inclusion of priority groups.
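One simple way to look for systematic exclusion is to compare, group by group, the share of genuinely at-risk children that a model actually flags. The sketch below illustrates this check; the group labels, the example records, and the 10-percentage-point tolerance are hypothetical and are not drawn from the panel's studies.

```python
from collections import defaultdict


def recall_by_group(records):
    """Share of truly at-risk children that the model flagged, per group.

    records: iterable of (group, actually_at_risk, flagged_by_model) tuples.
    """
    at_risk = defaultdict(int)
    caught = defaultdict(int)
    for group, actual, flagged in records:
        if actual:
            at_risk[group] += 1
            if flagged:
                caught[group] += 1
    return {group: caught[group] / at_risk[group] for group in at_risk}


def underserved_groups(records, max_gap=0.10):
    """Groups whose recall trails the best-served group by more than max_gap."""
    recalls = recall_by_group(records)
    best = max(recalls.values())
    return sorted(g for g, r in recalls.items() if best - r > max_gap)


# Invented records for illustration: (group, actually at risk, flagged by model).
records = (
    [("urban", True, True)] * 8
    + [("urban", True, False)] * 2
    + [("rural", True, True)] * 5
    + [("rural", True, False)] * 5
    + [("urban", False, False)] * 10
)

print(recall_by_group(records))
print(underserved_groups(records))
```

A group that appears in the second list is one where at-risk children are being missed more often than elsewhere, which is a prompt to revisit the training data or adjust the model. The check can only cover groups that are recorded in the data in the first place, so it complements rather than replaces the question of who is missing altogether.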

While programming and scale-up decisions could likely be improved with increased use of ML methods, researchers and practitioners should avoid over-reliance on an algorithm. Any ML-based tool should include a mechanism for contesting specific predictions and for allowing human judgment to override the algorithm where it makes a clearly bad prediction.

Last, we will touch on the increasing need for open source solutions, and on the role that intellectual property rights play in the use of both data and learning algorithms. We will also discuss the need to address the privacy concerns of the individuals whose data we use, whether students, families, or teachers, since their data is crucial to building effective algorithms and accurate predictions.
