All Posts

Clustering algorithms for grid-based sampling

TL;DR: In this blog post, we describe a custom clustering algorithm we designed to efficiently cluster grids into enumeration areas for grid-based sampling. The DSEM team at IDinsight is the technical workhorse for project teams, and nearly every piece of technical work we do involves grouping things by some measure of similarity. Let me explain. For our work with Educate Girls (EG), we predicted the number of out-of-school girls (OOSGs) in villages across northern India.

Making satellite imagery easy-to-use: speeding up computations

In our previous post, we examined how satellite imagery can be used in the social sector and how the MOSAIKS algorithm enables us to draw out “features” from these images without needing complex image-processing models. But the story doesn’t end with the algorithm. Satellite images are large files. This means that retrieving them from storage, processing them through the MOSAIKS pipeline, and storing the resulting features are all quite slow (though still computationally efficient relative to other options).

Making satellite imagery easy-to-use: the MOSAIKS algorithm

Satellite imagery has become a valuable tool in global development: from environmental monitoring and disaster response to urban planning and agriculture. With more and more high-resolution satellite imagery available as open-source datasets, information about land usage and populations has become widely accessible. But this data also needs advanced analytical techniques to make sense of it. Machine learning is one of these tools: we can use deep learning models to extract information from large numbers of satellite images quickly.

Practical learnings for development, engineering, and data science

In The Open Society and Its Enemies (1945), Karl Popper introduces “piecemeal social engineering,” his framework for building up social institutions incrementally, informed by experimentation and evidence. This is in contrast to the more prevalent “utopian social engineering” of his time, which he criticized for overly lofty, abstract ideals that largely ignored practicality; indeed, today we might regard such methods as colonial and paternalistic. For Popper, the “piecemeal engineer knows, like Socrates, how little he knows.

The reality behind a machine learning dataset

As data practitioners, we are separated by vast distances from the ground truth. There is, in one sense, the literal physical distance between our laptop screens and the places and sites of data collection, which can cause fidelity losses in context and empathy. There is also a representative distance (in some cases, an asymmetry of power) between the reality of researching and practicing machine learning, of publishing papers, of open-source repositories, of commercial applications, and the labor that goes into each row of data; the families represented by vectors; each interaction distilled into a potential flag for data quality.

Using AI to improve maternal health chatbots in South Africa

Amahle is pregnant and soon expecting a new addition to her family. She has been seeking maternal care through South Africa’s national WhatsApp helpline for the past seven months, frequently consulting a help-desk team about pregnancy challenges she’s been facing. At one point she got really worried that she couldn’t feel her baby move, and it took a while for the help desk to get back to her. Soon, she will be able to get instant recommendations with the help of a technology developed by IDinsight that will automatically answer her questions.