Chapter 4 Sampling

The project studies not only individual respondents but also communities: villages in rural areas and neighborhoods in urban areas. In total, 1,300 communities with 20 respondents each will be selected, for a total of 26,000 interviews. Because there is insufficient information about where villages are located and how many people live in them, we take a geographic approach. A unit is defined as a 1 kilometer square. Within each selected square, we pick one or more 100 meter units in a later step, depending on whether twenty or more people live in the first 100 meter unit. All units are selected by probability proportional to size sampling, with predicted population as the size measure.
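The selection principle can be illustrated with a minimal sketch in base R. The population counts are invented and `pps_draw` is a hypothetical helper, not the function used in this project; the sketch only shows how the chance of selection scales with predicted population.

```r
# Toy illustration of probability proportional to size selection.
# All numbers are invented; `pps_draw` is a hypothetical helper.
set.seed(680)

# Predicted population for ten 1 km units (made-up values)
pop_1km <- c(0, 5, 40, 120, 300, 15, 800, 60, 0, 250)

pps_draw <- function(population, n) {
  # Units with zero predicted population can never be drawn
  eligible <- which(population > 0)
  # Successive weighted draws without replacement; this only approximates
  # strict probability proportional to size inclusion probabilities
  picked <- sample.int(length(eligible), size = n, replace = FALSE,
                       prob = population[eligible])
  eligible[picked]
}

# Indices of the three selected 1 km units
selected_units <- pps_draw(pop_1km, n = 3)
selected_units

# Planned number of interviews: 1,300 communities with 20 respondents each
planned_interviews <- 1300 * 20
planned_interviews
```

The same weighted draw could in principle be repeated inside each selected square to pick 100 meter units, but how the project function does this is not shown here.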

The risk of sampling “empty” units should be small, because probability proportional to size sampling favors areas with a higher population, while areas with a low population have a lower chance of being drawn. Furthermore, the WorldPop population estimates that underpin the sampling approach are produced by a sophisticated prediction algorithm. However, units with a low population can still be selected, and the WorldPop predictions are not perfect. Hence, we oversample so that “empty” units can be replaced at a later stage.
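The replacement logic can be sketched as follows in base R. This is an illustration under assumptions rather than the project's procedure: the population values, the names `n_target` and `oversample_factor`, and the "found empty" field check are all invented.

```r
# Toy illustration of oversampling so that "empty" units can be replaced.
# All values and names are invented for illustration.
set.seed(680)

# Invented predicted population for 40 units (always at least 1)
predicted_pop <- round(rexp(40, rate = 1 / 100)) + 1

n_target <- 10            # units needed in the final sample
oversample_factor <- 1.5  # hypothetical factor: draw 50% more than needed
n_draw <- ceiling(n_target * oversample_factor)

# Ordered weighted draw without replacement: the first n_target positions
# form the main sample, the remaining positions act as reserves
draw_order <- sample.int(length(predicted_pop), size = n_draw,
                         prob = predicted_pop)

# Pretend that fieldwork finds the 2nd and 7th drawn units to be empty
found_empty <- rep(FALSE, length(predicted_pop))
found_empty[draw_order[c(2, 7)]] <- TRUE

# Drop the empty units and fill up from the reserves, keeping the draw order
usable <- draw_order[!found_empty[draw_order]]
final_sample <- head(usable, n_target)
final_sample
```

Keeping the reserves in draw order means replacements come from the same weighted draw as the main sample, so no second sampling round is needed in this sketch.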

4.1 Nairobi, Lusaka, and Lilongwe

# The three city samples are drawn with identical settings;
# only the city name, bins, and population rasters differ.
nairobi_sample_150 <- oversample_wrapper_non_pair(name = "Nairobi",
                            bins = nairobi_bins,
                            raster_1 = nairobi_population,
                            raster_2 = nairobi_population_1k,
                            random_number = 680,
                            original_min_unit2 = 2,
                            original_sample_unit2 = 150,
                            by_factor = 1.1,
                            verbose = 2)

lusaka_sample_150 <- oversample_wrapper_non_pair(name = "Lusaka",
                            bins = lusaka_bins,
                            raster_1 = lusaka_population,
                            raster_2 = lusaka_population_1k,
                            random_number = 680,
                            original_min_unit2 = 2,
                            original_sample_unit2 = 150,
                            by_factor = 1.1,
                            verbose = 2)

lilongwe_sample_150 <- oversample_wrapper_non_pair(name = "Lilongwe",
                            bins = lilongwe_bins,
                            raster_1 = lilongwe_population,
                            raster_2 = lilongwe_population_1k,
                            random_number = 680,
                            original_min_unit2 = 2,
                            original_sample_unit2 = 150,
                            by_factor = 1.1,
                            verbose = 2)