Research for Global Development

Data Volunteering with International Organizations at the World Bank


During the weekend of March 16-17, 130 eager data volunteers and dozens of World Bank, U.N. Development Programme, and U.N. Global Pulse staffers convened at the World Bank in Washington, D.C., to take part in the DataKind #Data4Good DataDive. The organizations volunteered their data and staff, and the volunteers donated their time, to tackle two main challenges:

1) Find new and innovative ways to measure poverty; and 2) sift through World Bank procurement and program data to identify possible techniques to detect fraud and corruption. Simple, right?

In the lead-up to the event at the bank, DataKind and its volunteer data ambassadors (myself among them) helped prepare data for eight primary projects, listed in this hackpad.

I had the opportunity to lead a team that used Web scraping (a software technique for extracting data from websites) to collect food price and consumption data from around the world.  We quickly learned that groups like U.N. Global Pulse have been looking at the same idea.  #TeamNdizi (ndizi means banana in Swahili), as we called ourselves, started with a Web Scraping 101 tutorial for six team members, in which we learned to scrape using crowdsourced banana prices on ScraperWiki.  Team Ndizi then split into two subunits: one looking for food price and consumption data in Africa, particularly Kenya, and the other looking at rice prices in Indonesia.  For a few more visuals and links to the scrapers and data, see our GitHub page: https://github.com/mjrich/ndizi.
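For readers new to the technique, here is a minimal sketch of what scraping a price table can look like, using only Python's standard library. It is not the code our team wrote. The HTML snippet, the `item` and `price` class names, and the `scrape_prices` helper are all made up for illustration, and a real site would need its own parsing rules plus a step to download the page.

```python
from html.parser import HTMLParser

# Hypothetical markup; a real price page will use different tags and classes.
SAMPLE_HTML = """
<table id="prices">
  <tr><td class="item">Banana</td><td class="price">12.50</td></tr>
  <tr><td class="item">Rice</td><td class="price">80.00</td></tr>
</table>
"""


class PriceTableParser(HTMLParser):
    """Collect (item, price) pairs from <td class="item"> and <td class="price"> cells."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # class of the <td> currently open, if any
        self._item = None   # item name waiting for its price

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._field = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return  # skip whitespace between tags
        if self._field == "item":
            self._item = text
        elif self._field == "price" and self._item is not None:
            self.rows.append((self._item, float(text)))
            self._item = None


def scrape_prices(html):
    """Return a list of (item, price) tuples found in the given HTML string."""
    parser = PriceTableParser()
    parser.feed(html)
    return parser.rows


if __name__ == "__main__":
    for item, price in scrape_prices(SAMPLE_HTML):
        print(item, price)
```

The same pattern (find the cells that hold a name and a number, then pair them up) carries over to most simple price listings, though messier pages usually call for a dedicated parsing library.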

What would success look like for our team?  The senior World Bank economist we worked with hoped for a “definitive chart” that showed this approach to gathering food prices might bear fruit (pun intended).

To do so, we plotted the Indonesia rice data over time.  The Indonesia team first grabbed price data from the Carrefour Indonesia supermarket website and Twitter account, then used the internet archives to scrape historical Carrefour Indonesia webpages and combined these data with the U.N. Food and Agriculture Organization (FAO) rice prices.  In doing so, we were able to produce the chart below showing the price increase in two brands of rice up to the present day.  Certainly more research and data preparation are necessary, but in 12 hours we were able to demonstrate the possibility of creating a commodity price leading indicator from open data:

The next step will be to work with the 100,000 data points of daily commodities prices covering 1,095 days that were scraped from mfarm and compare them to FAO Kenya and World Bank global prices.  These datasets are loaded here for future analysis.
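As a rough sketch of how that comparison might begin, the daily scraped prices could be averaged by month and set against a monthly reference series. Everything below is an assumption for illustration: the function names, the data layout, and the numbers are invented, and the sketch takes for granted that both series are already in the same currency and unit, which real data would not guarantee.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_average(daily):
    """Average (date, price) observations into a {(year, month): mean price} dict."""
    buckets = defaultdict(list)
    for day, price in daily:
        buckets[(day.year, day.month)].append(price)
    return {month: mean(prices) for month, prices in buckets.items()}


def compare_to_reference(daily, reference):
    """Percent gap of scraped vs. reference price, for months present in both series."""
    scraped = monthly_average(daily)
    gaps = {}
    for month in sorted(scraped.keys() & reference.keys()):
        gaps[month] = 100.0 * (scraped[month] - reference[month]) / reference[month]
    return gaps


# Made-up numbers for illustration only; not real scraped or FAO prices.
DAILY = [
    (date(2013, 1, 5), 40.0),
    (date(2013, 1, 20), 44.0),
    (date(2013, 2, 3), 50.0),
]
REFERENCE = {(2013, 1): 40.0, (2013, 2): 50.0, (2013, 3): 55.0}

if __name__ == "__main__":
    for month, gap in compare_to_reference(DAILY, REFERENCE).items():
        print(month, round(gap, 1))
```

Months that appear in only one series are skipped rather than filled in, so gaps in the scraped data stay visible instead of being smoothed over.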

By the end of the event, both volunteers and international organization representatives seemed equally shocked at the amount of ground covered and the potential for next steps.  Here’s a great blog post from the World Bank recapping the event and a series of two more posts about other projects that took place during the event.

Beyond the utility of these data and analyses for participating organizations, what do events like this mean for the future of research and evaluation? The work of eager and smart volunteers won’t replace the need for highly skilled and specialized researchers at institutions like the World Bank. But events like this show that when organizations open up their data to volunteers with technical skills but limited domain knowledge, the result can be unexpected and valuable new approaches and data sources.

InterMedia

