Welcome to the IDinsight Tech Blog

Check in regularly or sign up for the IDinsight newsletter to find out about IDinsight's latest data science and engineering work.

All Posts

ElectionGPT: How AI-powered tools supported India’s elections––and what may be next for AI in the public sector

In 2024, elections took place in some 70 countries, home to half the world’s population. In India alone, 642 million people cast ballots in the general election, making it the largest democratic exercise in history. Ensuring the smooth conduct of Indian elections for hundreds of millions of people was no small feat and required the dedication of millions of support staff. Our team worked with the government of Uttar Pradesh, where millions of registered voters participated in the elections. To support this monumental task, IDinsight developed ElectionGPT, an AI-powered tool designed to help election officers in India. The tool served as an indispensable resource, answering thousands of questions with speed and accuracy, while saving officials countless hours.

Let’s Make Sure the Right-Hand Rule is Left-Behind

Household surveys are a critical source of data for understanding the conditions, experiences and aspirations of families. Governments and social sector organizations use data from household surveys to inform program design, targeting, service delivery, budget allocations and more. Household surveys give families a voice – through data – in the policies and programs that affect their lives. But it is impossible to reach all households, and not all households are alike. So how do surveyors choose who to visit to ensure that their data are representative of the wide variety of families in a given place? That is a question about sampling.

Three Stage Sampling

An earlier version of this blog post appeared on my personal blog. Household surveys often involve more than one “stage” of sampling – e.g. in the first stage, we might randomly sample villages and in the second stage we might randomly sample households within these villages. Most often, we use two stages when sampling. Accounting for two sampling stages is pretty straightforward. In some cases, we might want to consider using three stages. Unfortunately, to my knowledge, there aren’t a lot of good resources on how to account for more than two stages when sampling. In this post, I’ll try to answer four questions:

My board says do AI. Halp plz

Over the last year or so, the data science team at IDinsight has been busy building AI products like Ask-A-Question and Ask-A-Metric. A few months back, I was on a panel on GenAI for Social Impact and was asked whether organizations should be investing in AI or not. I talked about how AI is a tool and how we want to be problem-driven: think about your use case and find the tool that fits it best, instead of starting with the AI hammer and looking for a nail. With all the hoopla around AI of late, these are questions on the minds of almost all social sector organizations. Below is how I see it. I’d love to hear your thoughts.

Why "Ask A Question"?

We’ve spent the last 8 months or so building Ask-a-Question, an AI question-answering service for direct-to-citizen helplines. We decided not to create yet-another-RAG solution. Ethan Mollick summarizes our concerns well. In short, the tech for a cheap, scalable, and guaranteed error-free AI is not there - the kind that you’d want to roll out to citizens in high-risk, high-trust use cases like health. Instead, we wanted to lean on our experience building a question-answering service for MomConnect to build something that (a) is trustworthy; and (b) provides actionable insights to support continuous learning and improvement.

Enhancing Maternal Healthcare: Training Language Models to Identify Urgent Messages in Real-Time

We have fine-tuned the Gemma-2 2-billion parameter instruction model on a custom dataset in order to detect whether user messages pertain to urgent or non-urgent maternal healthcare issues. Our model demonstrates superior performance compared to GPT-3.5-Turbo in accurately distinguishing between urgent and non-urgent messages. Both the dataset and the model have been made publicly available to support further research and development in this critical area.