All Posts

Practical learnings for development, engineering, and data science

In The Open Society and Its Enemies (1945), Karl Popper introduces “piecemeal social engineering,” his framework for building up social institutions incrementally, informed by experimentation and evidence. This stands in contrast to the more prevalent “utopian social engineering” of his time, which he criticized for lofty, abstract ideals that largely ignored practicality; indeed, today we might regard such methods as colonial and paternalistic. For Popper, the “piecemeal engineer knows, like Socrates, how little he knows. He knows that we can learn only from our mistakes.”[1] In that spirit, I want to begin with lessons we have learned in trying to apply engineering principles and methods to help our partners increase their social impact. I hope that our learnings can be helpful for others in the sector.

The reality behind a machine learning dataset

As data practitioners, we are separated by vast distances from the ground truth. There is, in one sense, the literal physical distance between our laptop screens and the sites of data collection, which can cost us context and empathy. There is also a representative distance (in some cases, an asymmetry of power) between the reality of researching and practicing machine learning, with its published papers, open-source repositories, and commercial applications, and the labor that goes into each row of data, the families represented by vectors, and the interactions distilled into potential flags for data quality. In this post, I hope to illustrate those minutiae and bring these two worlds together.

Using AI to improve maternal health chatbots in South Africa

Amahle is pregnant and will soon welcome a new addition to her family. For the past seven months, she has been seeking maternal care through South Africa’s national WhatsApp helpline, where she frequently consults a help-desk team about pregnancy challenges she has been facing. At one point, she became very worried that she couldn’t feel her baby move, and it took a while for the help desk to get back to her. Soon, she will be able to get instant recommendations with the help of a technology developed by IDinsight that will automatically answer her questions.

A Julia/JuMP model for optimal assignment of teachers to schools

We were recently approached by a government to optimally allocate teachers to schools. The state has a few hundred secondary schools and is meant to offer a set of subjects in all of them, but a number of classrooms have no teacher allocated. To close these “gaps”, a new cohort of teachers was trained to join the existing set of teachers. The government asked us how these new teachers should be assigned while taking operational and logistical constraints into account. These kinds of problems are notoriously hard to solve optimally.[1] Operational Research offers clever techniques to search the solution space for optimal solutions.
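
To give a feel for what “searching the solution space” means, here is a minimal Python sketch that brute-forces a toy version of the problem. It is not the Julia/JuMP model from the post: the teacher names, vacancies, and cost numbers are all made up for illustration, and the only constraint assumed is that each teacher fills exactly one vacancy.

```python
from itertools import permutations

# Hypothetical toy data: cost (e.g. travel distance) of placing each new
# teacher in each vacant classroom. All names and numbers are invented.
teachers = ["T1", "T2", "T3"]
vacancies = ["School A / Maths", "School B / Physics", "School C / Maths"]
cost = [
    [4, 9, 7],  # T1 to each vacancy
    [6, 3, 8],  # T2 to each vacancy
    [5, 8, 2],  # T3 to each vacancy
]


def best_assignment(cost):
    """Try every one-to-one assignment; return (total cost, assignment)."""
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        # perm[t] is the index of the vacancy given to teacher t
        total = sum(cost[t][perm[t]] for t in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, best_perm


total, perm = best_assignment(cost)
for t, v in enumerate(perm):
    print(f"{teachers[t]} -> {vacancies[v]}")
print("total cost:", total)
```

Exhaustive search like this checks n! assignments, so ten teachers already means 3,628,800 candidates. That growth is one reason a solver-based formulation is the more practical route once real operational and logistical constraints are added.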

Data quality check for hierarchical linear data with outliers

In this post, I discuss how IDinsight’s DSEM team implemented a data quality check for pairs of variables that display linear relationships. I explain why we chose a Bayesian model and the tweaks we made to address the hierarchy and outliers in the data, then show how to implement the model using PyMC3.