Data collection and analysis are of great importance for humanitarian aid decisions. They are also challenging, because obtaining a general yet clear picture of humanitarian needs in crisis regions is labor-intensive and costly. The team explored the idea of using a generative model to decrease survey time. They showed that, using a Bayesian network, one could pick questions during the interview to maximize the information gain. This approach could save time in first-hand data collection or, alternatively, allow more data points to be collected.
Students: Martin Buttenschön, Natallie Baikevich, Luca Pedrelli, Georgios Papadimitriou
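The idea of picking the next interview question by information gain can be illustrated with a minimal sketch. It is not the team's model: it assumes a naive Bayes structure (one of the simplest Bayesian networks) with a single hidden "need" variable, and every question name and probability below is hypothetical.

```python
import math

# Hypothetical toy model: one hidden variable ("need") and three yes/no questions.
# All names and probabilities are invented for illustration.
PRIOR = {"low": 0.6, "high": 0.4}

# P(answer == "yes" | need) for each question.
P_YES = {
    "has_water_access": {"low": 0.9, "high": 0.3},
    "owns_livestock": {"low": 0.5, "high": 0.45},
    "skipped_meals": {"low": 0.1, "high": 0.8},
}


def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def likelihood(question, answer, need):
    """P(answer | need) for a yes/no question."""
    p_yes = P_YES[question][need]
    return p_yes if answer == "yes" else 1.0 - p_yes


def update(belief, question, answer):
    """Posterior over "need" after observing one answer (Bayes' rule)."""
    unnorm = {n: p * likelihood(question, answer, n) for n, p in belief.items()}
    total = sum(unnorm.values())
    return {n: p / total for n, p in unnorm.items()}


def expected_information_gain(belief, question):
    """Current entropy minus the expected entropy after asking the question."""
    gain = entropy(belief)
    for answer in ("yes", "no"):
        p_answer = sum(p * likelihood(question, answer, n) for n, p in belief.items())
        if p_answer > 0:
            gain -= p_answer * entropy(update(belief, question, answer))
    return gain


def next_question(belief, asked):
    """Greedily pick the unasked question with the highest expected gain."""
    remaining = [q for q in P_YES if q not in asked]
    return max(remaining, key=lambda q: expected_information_gain(belief, q))
```

In an interview loop, one would call `next_question`, record the answer, call `update`, and stop once the entropy of the belief falls below a chosen threshold. A full Bayesian network would also model dependencies between questions, which this sketch ignores.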
An important objective of humanitarian aid is to identify households in need. This is usually done by asking people to answer a questionnaire detailing their living situation. Based on their answers, sectoral Index Indicators (PiN) are calculated. This is a time-consuming and labor-intensive process. The team analyzed a different approach that aims to predict the PiN from demographic variables. Furthermore, they provided a prototype of a visualization app to help identify people in need.
Students: Stephan Artmann, Viktoria de La Rochefoucauld, Nico Messikommer, Francesco Saltarelli
To assist decision-making in the humanitarian crisis in Nigeria, the team tried to identify undiscovered patterns in the Multi-Sectorial Needs Assessment dataset collected by REACH. First, they developed a random forest model that was able, to a low degree, to predict the overall level of need of a household. Second, they identified sets of co-occurring sectorial needs. Third, they showed that the current methodology does not allow the reported needs of households to be predicted accurately.
Students: Marco Mancini, Yilmazcan Ozyurt, Ylli Muhadri, Maria R. Cervera
The traditional Multi-Sector Need Assessment approach organizes questionnaires for assessing people in humanitarian crises into predefined sectors. The team found that about 80% of the information captured by a set of preselected questions can be recovered from only four latent factors. They showed that these factors exhibited a high linear dependence with some of the sectors, stressing the importance of these particular sectors. The exact semantics of these latent variables, however, may go well beyond the traditional sectors and remain an open topic for future research. The team therefore suggests that rethinking the traditional sector approach can lead to more concise data acquisition and analysis.
Students: Shirzart Enwer, Belinda Müller, Swaneet Sahoo
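The claim that a few latent factors recover most of the information in a questionnaire can be checked with a cumulative explained-variance curve. The sketch below assumes principal component analysis as the factor method, which the summary does not confirm, and it runs on synthetic data rather than the survey data; the sizes, seed, and noise level are arbitrary choices.

```python
import numpy as np


def cumulative_explained_variance(X):
    """Cumulative share of total variance captured by the leading principal components."""
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return np.cumsum(variances / variances.sum())


def components_needed(X, threshold=0.8):
    """Smallest number of components whose cumulative share reaches the threshold."""
    cumulative = cumulative_explained_variance(X)
    return int(np.searchsorted(cumulative, threshold) + 1)


# Synthetic stand-in for survey answers: 500 respondents, 20 questions,
# generated from 4 hidden factors plus noise (all values are made up).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 4))
loadings = rng.normal(size=(4, 20))
X = latent @ loadings + 0.5 * rng.normal(size=(500, 20))

cumulative = cumulative_explained_variance(X)
print("share captured by 4 components:", round(float(cumulative[3]), 3))
print("components needed for 80%:", components_needed(X))
```

Real questionnaire data is usually categorical, so in practice the answers would need encoding first, and a method designed for categorical variables may be more appropriate than plain PCA.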