In 2024, elections took place in some 70 countries, home to half the world’s population. In India alone, 642 million people cast ballots1 in the general election, making it the largest democratic exercise in history. Ensuring the smooth conduct of Indian elections for hundreds of millions of people was no small feat and required the dedication of millions of support staff. ] in the general election, making it the largest democratic exercise in history. Ensuring the smooth conduct of Indian elections for hundreds of millions of people was no small feat and required the dedication of millions of support staff.
Our team worked with the government of Uttar Pradesh where millions of registered voters participated in the elections. To support this monumental task, IDinsight developed ElectionGPT–– an AI-powered tool designed to help election officers in India. The tool served as an indispensable resource, answering thousands of questions with speed and accuracy, while saving officials countless hours.
Household surveys are a critical source of data for understanding the conditions, experiences and aspirations of families. Governments and social sector organizations use data from household surveys to inform program design, targeting, service delivery, budget allocations and more. Household surveys give families a voice – through data – in the policies and programs that affect their lives. But it is impossible to reach all households, and not all households are alike. So how do surveyors choose who to visit to ensure that their data are representative of the wide variety of families in a given place? That is a question about sampling.
An earlier version of this blog post appeared on my person blog.
Household surveys often involve more than one “stage” of sampling – e.g. in the first stage, we might randomly sample villages and in the second stage we might randomly sample households within these villages. Most often, we use two stages when sampling. Accounting for two sampling stages is pretty straightforward. In some cases, we might want to consider using three stages. Unfortunately, to my knowledge, there aren’t a lot of good resources on how to account for more than two stages when sampling. In this post, I’ll try to answer four questions:
Over the last year or so, the data science team at IDinsight has been busy building AI products like Ask-A-Question and Ask-A-Metric. A few months back, I was on a panel on GenAI for Social Impact and was asked if they should be investing in AI or not. I talked about how AI is a tool and we want to be problem driven. I talked about thinking about your use case and finding the tool that fits it best instead of starting with the AI hammer and looking for a nail.
With all the hoopla around AI of late, these are questions on the minds of almost all social sector organizations. Below is how I see it. I’d love to hear your thoughts.
We’ve spent the last 8 months or so building Ask-a-Question, an AI question-answering service for direct-to-citizen helplines. We decided not to create yet-another-RAG solution. Ethan Mollick summarizes our concerns well. In short, the tech for a cheap, scalable, and guaranteed error-free AI is not there - the kind that you’d want to roll out to citizens in high risk and high trust use-cases like health. Instead, we wanted to lean on our experience building a question-answering for MomConnect to build something that is (a) trustworthy; and (b) provides actionable insights to support continuous learning and improvement.
We have fine-tuned the Gemma-2 2-billion parameter instruction model on a custom dataset in order to detect whether user messages pertain to urgent or non-urgent maternal healthcare issues. Our model demonstrates superior performance compared to GPT-3.5-Turbo in accurately distinguishing between urgent and non-urgent messages. Both the dataset and the model have been made publicly available to support further research and development in this critical area.1