Kenya Population Map Metadata Report

Prediction Weighting Layer Used in Population Redistribution

The data presented below are the predicted number of people per ~100 m pixel, estimated with the random forest (RF) model described in Stevens, et al. (In Press). The following pages describe the RF model and its covariates, their sources, and any metadata collected for each covariate. The prediction weighting layer is used to dasymetrically redistribute the census counts and to project those counts to match UN-based population estimates for the final population maps provided by AfriPop, AsiaPop and AmeriPop.
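
As an illustration of dasymetric redistribution, the sketch below spreads each census unit's count over its pixels in proportion to the weighting layer, then rescales the result to a target total. It is a minimal sketch under stated assumptions, not the code used to produce these maps: the function names, the toy weights, zone IDs, counts, and the target of 165 are all hypothetical, and the single national scaling factor is a simplification of the adjustment to UN-based estimates.

```python
from collections import defaultdict


def dasymetric_redistribute(weights, zones, census_counts):
    """Spread each census unit's count over its pixels in proportion to the weights.

    weights       -- predicted weight per pixel (flat list)
    zones         -- census unit ID per pixel (same length as weights)
    census_counts -- dict mapping census unit ID to its population count
    """
    # Sum the weights inside each census unit.
    zone_totals = defaultdict(float)
    for w, z in zip(weights, zones):
        zone_totals[z] += w

    # Each pixel receives its share of the unit's count.
    # A unit whose weights sum to zero gets zeros (its count is not placed).
    result = []
    for w, z in zip(weights, zones):
        total = zone_totals[z]
        result.append(census_counts[z] * w / total if total > 0 else 0.0)
    return result


def adjust_to_total(pixel_pop, target_total):
    """Rescale every pixel by one factor so the map sums to target_total."""
    current = sum(pixel_pop)
    factor = target_total / current if current > 0 else 0.0
    return [p * factor for p in pixel_pop]


# Hypothetical toy inputs: four pixels in two census units.
weights = [1.0, 3.0, 2.0, 2.0]
zones = ["A", "A", "B", "B"]
counts = {"A": 100, "B": 50}

pixel_pop = dasymetric_redistribute(weights, zones, counts)
adjusted = adjust_to_total(pixel_pop, 165.0)  # hypothetical target total
```

Within each unit the pixel values sum back to the census count, so the weighting layer only controls where people are placed inside a unit, not how many people the unit contains.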

[Figure: predicted population density weighting layer (chunk predict_density)]

Kenya Census Data and Observed Population Density

These data are the population density values used to fit the RF model that produces the prediction weighting layer shown above. Values are population densities in people per hectare, calculated from the population count and area of each census unit. They serve as the dependent variable during model estimation.
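
The density calculation can be sketched as follows. This is an illustrative example only: the function name and the sample count and area are hypothetical, and it assumes unit areas are available in square metres.

```python
SQ_M_PER_HECTARE = 10_000.0  # 1 hectare = 100 m x 100 m


def density_per_hectare(population, area_sq_m):
    """Return people per hectare for one census unit."""
    if area_sq_m <= 0:
        raise ValueError("census unit area must be positive")
    return population / (area_sq_m / SQ_M_PER_HECTARE)


# Hypothetical census unit: 1200 people on 3 square kilometres (300 hectares).
density = density_per_hectare(1200, 3_000_000.0)
```

Because a 100 m by 100 m pixel covers exactly one hectare, a density in people per hectare is numerically the same as people per pixel at that nominal pixel size.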

Kenya Census Data, 1999, Admin-level 5

Folder: Census
File Name: KEN_census_1999_sublocations_topo.shp
Source: Kenya National Bureau of Statistics, acquired by Tatem, et al. for use in AfriPop data products.
Description: These census data were acquired as a disaggregation layer for more recent census data for AfriPop. They are used here on their own to produce a disaggregated population map for 1999 because they are the finest-level census data available. Required fields for map production are ADMINID and ADMINPOP.
Class: polygon
Derived Covariates:
area, buff, zones,

class       : SpatialPolygonsDataFrame 
nfeatures   : 6624 
extent      : -66764, 823501, -517009, 605783  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
nvariables  : 50

[Figure: Kenya census data (chunk census_data)]

Random Forest Model and Diagnostics

The output and figures below outline the estimated RF model used to predict the population density weighting layer. The model is fitted to the population density values from the preceding census data, using covariates aggregated from the ancillary data sources summarized after the model diagnostics.

 randomForest(x = x_data, y = y_data, ntree = popfit$ntree, mtry = popfit$mtry, nodesize = length(y_data)/1000)
               Type of random forest: regression
                     Number of trees: 500
No. of variables tried at each split: 8

          Mean of squared residuals: 0.67
                    % Var explained: 83
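
To help read the two summary figures above, the sketch below shows how a mean of squared residuals and a percent-variance-explained value relate to each other. It assumes the usual definition for regression forests, 100 * (1 - MSE / Var(y)), with the variance computed using n in the denominator; the function name and toy data are hypothetical. The last lines back out rough implied values from the rounded figures in this report, and the `nodesize` estimate assumes one response value per census unit (6624), which the report does not state explicitly.

```python
def pct_var_explained(observed, predicted):
    """Return (mean squared residual, percent of variance explained)."""
    n = len(observed)
    mean_y = sum(observed) / n
    mse = sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n
    var_y = sum((y - mean_y) ** 2 for y in observed) / n  # n, not n - 1
    return mse, 100.0 * (1.0 - mse / var_y)


# Hypothetical toy data, not values from this report.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
mse, pct = pct_var_explained(observed, predicted)

# Rough values implied by the rounded figures printed above (0.67 and 83).
implied_variance = 0.67 / (1.0 - 83 / 100.0)   # roughly 3.9
assumed_nodesize = 6624 / 1000                 # assumes length(y_data) == 6624
```

Since the printed figures are rounded, the implied variance is only an approximation of the variance of the response used in model fitting.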

[Figures: three plots from chunk random_forest]

Covariate Metadata

Kenya Classified Land Cover

Folder: Landcover
File Name: KEN_gc_reclass_0.0008333_rurb_8bit.img
Source: GlobCover, 300m
Description: Land cover from the GlobCover product, reclassified to match AfriPop coding and then broken down into binary classifications by aggregated land cover type (see Linard, et al., 2010 and Gaughan, et al., 2013 for category information).
Class: raster
Derived Covariates:
prp011, cls011, dst011, prp040, cls040, dst040, prp130, cls130, dst130, prp140, cls140, dst140, prp150, cls150, dst150, prp160, cls160, dst160, prp190, cls190, dst190, prp200, cls200, dst200, prp210, cls210, dst210, prp230, cls230, dst230, prp240, cls240, dst240, prp250, cls250, dst250, prpBLT, clsBLT, dstBLT,
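
The breakdown of a classified land cover grid into binary layers, as mentioned in the description above, can be sketched as follows. This is a minimal illustration with hypothetical function names and a toy grid; the class codes 11, 40 and 130 are taken from the covariate names listed above, but how the `prp`, `cls` and `dst` covariates are actually computed is not stated in this report, so the proportion shown here is only one plausible summary.

```python
def binary_layer(landcover, class_code):
    """Return a 0/1 grid marking pixels whose value equals class_code."""
    return [[1 if v == class_code else 0 for v in row] for row in landcover]


def class_proportion(landcover, class_code):
    """Return the share of all pixels that belong to class_code."""
    cells = [v for row in landcover for v in row]
    return sum(1 for v in cells if v == class_code) / len(cells)


# Hypothetical 2 x 3 classified grid.
landcover = [
    [11, 11, 40],
    [11, 130, 40],
]

mask_11 = binary_layer(landcover, 11)
share_11 = class_proportion(landcover, 11)
```

One binary layer per class keeps each land cover type as a separate predictor, so the model can weight the classes independently.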

class       : RasterLayer 
dimensions  : 11263, 8911, 100364593  (nrow, ncol, ncell)
resolution  : 100, 100  (x, y)
extent      : -66765, 824335, -519218, 607082  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=37 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : D:\Documents\Graduate School\Research\Population\Data\RF\data\KEN\Landcover\Derived\landcover.tif 
names       : landcover 
values      : 0, 240  (min, max)
attributes  :
       ID OID Value    Count
 from:  0   0    11 19254933
 to  :  9   9   240     8562

[Figure: covariate plot (chunk covariate_reports)]

MODIS 17A3 2010 Estimated Net Primary Productivity, 1km

Folder: NPP
File Name: KEN_gc_reclass_0.0008333_rurb_8bit.img
Source: United States Geological Survey (USGS)
Description: MODIS 17A3 version-55 estimates of net primary productivity for 2010, produced at 1 km pixel size, then subset and resampled to match the available land cover and the final population map output requirements.
Class: raster
Derived Covariates:

class       : RasterLayer 
dimensions  : 11263, 8911, 100364593  (nrow, ncol, ncell)
resolution  : 100, 100  (x, y)
extent      : -66765, 824335, -519218, 607082  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=utm +zone=37 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : D:\Documents\Graduate School\Research\Population\Data\RF\data\KEN\NPP\Derived\npp.tif 
names       : npp 
values      : 0, 22341  (min, max)
attributes  :
          ID Rowid    COUNT
 from:     0     0 11089043
 to  : 22341 19433       90