Zambia Population Map Metadata Report

Prediction Weighting Layer Used in Population Redistribution

The data presented below represent the predicted number of people per ~100 m pixel, estimated using the random forest (RF) model described in Stevens et al. (2015). The following pages describe the RF model and its covariates, their sources and any metadata collected for each covariate. The prediction weighting layer is used to dasymetrically redistribute the census counts, and to project those counts to match UN-based population estimates, for the final population maps provided by WorldPop.

Stevens, F. R., Gaughan, A. E., Linard, C., & Tatem, A. J. (2015). Disaggregating Census Data for Population Mapping Using Random Forests with Remotely-Sensed and Ancillary Data. PLOS ONE, 10(2), e0107042. doi:10.1371/journal.pone.0107042
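
As a minimal sketch of the dasymetric redistribution described above, the following Python snippet spreads one census unit's count across its pixels in proportion to a weighting layer, then rescales the result to an externally supplied total. The function names, weights and totals are hypothetical illustrations, not WorldPop code or Zambia values.

```python
def redistribute(unit_count, weights):
    """Spread one census unit's count across its pixels in proportion to the weights."""
    total = sum(weights)
    if total == 0:
        # No information in the weights: fall back to an even split (an assumption).
        return [unit_count / len(weights)] * len(weights)
    return [unit_count * w / total for w in weights]


def rescale(pixel_counts, target_total):
    """Scale pixel counts so that they sum to an externally supplied total."""
    current = sum(pixel_counts)
    return [p * target_total / current for p in pixel_counts]


# Hypothetical census unit: 1,000 people over four pixels with predicted weights.
weights = [0.5, 1.5, 3.0, 5.0]
pixels = redistribute(1000, weights)
adjusted = rescale(pixels, 1100)  # hypothetical adjusted total for the same unit
print(pixels)
print(adjusted)
```

Because each unit's pixels always sum back to the unit's count, the census totals are preserved regardless of the absolute scale of the weights; only their relative values within a unit matter.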

plot of chunk predict_density

Zambia Census Data and Observed Population Density

These data are the population density values used to fit the RF model that produces the prediction weighting layer shown above. Values are population densities in people per hectare, calculated from the population counts within each census unit. They serve as the dependent variable during model estimation.
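
The density calculation described above can be sketched as follows. The population and area figures are made up for illustration, and the function name is hypothetical.

```python
def density_per_hectare(population, area_m2):
    """People per hectare for one census unit (1 ha = 10,000 square metres)."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return population / (area_m2 / 10_000)


# Hypothetical unit: 25,000 people over 500 square kilometres (5e8 square metres).
print(density_per_hectare(25_000, 5e8))
```

A 100 m by 100 m pixel covers 10,000 square metres, or one hectare, so a density in people per hectare is numerically close to a count of people per ~100 m pixel.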

Zambia Census, 2010

Folder: Census
File Name: ZMB_adm2_2010.shp
Source: Zambia Census, 2010, provided by Catherine Linard
Description: These high spatial resolution census block data were obtained through in-country partners for 2010.
Class: polygon
Derived Covariates:
area, buff, zones

class       : SpatialPolygonsDataFrame 
nfeatures   : 72 
extent      : -309032, 897801, -2121582, -976014  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
nvariables  : 22

plot of chunk census_data


Random Forest Model and Diagnostics

The output and figures below describe the fitted RF model used to predict the population density weighting layer. The model is fitted to the population density values of the preceding census data, using covariates aggregated from the ancillary data sources that are summarized after the model diagnostics.

[1] "Random Forest model is a merged RF model using models from:"
ZMB_2010_2000, ZMB_2000,   

Call:
 randomForest(x = x_data, y = y_data, ntree = popfit$ntree, mtry = popfit$mtry,      nodesize = length(y_data)/1000, importance = TRUE, proximity = TRUE) 
               Type of random forest: regression
                     Number of trees: 1000
No. of variables tried at each split: 13
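
The output above states that the model is merged from two separately fitted models. How the merge was performed is not described in this report; one common approach, sketched below in Python, is to pool the trees of both forests into a single ensemble and average over all of them (R's randomForest package provides combine() for this, though its use here is an assumption). The stand-in "trees" are plain functions invented for illustration.

```python
from statistics import mean


def merge_forests(*forests):
    """Concatenate the tree lists of several forests into one ensemble."""
    merged = []
    for forest in forests:
        merged.extend(forest)
    return merged


def predict(forest, x):
    """Regression forest prediction: the mean of the individual tree predictions."""
    return mean(tree(x) for tree in forest)


# Hypothetical stand-in trees: each maps a covariate vector to a prediction.
forest_a = [lambda x: 2.0 * x[0], lambda x: 2.0 * x[0] + 1.0]
forest_b = [lambda x: x[0] + 3.0, lambda x: x[0] + 5.0]

merged = merge_forests(forest_a, forest_b)
print(len(merged), predict(merged, [1.0]))
```

With this kind of pooling, each source model influences the merged prediction in proportion to the number of trees it contributes.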

plot of chunk random_forest

plot of chunk random_forest

Covariate Metadata

Remotely-sensed, Classified Landcover, ESA 2010

Folder: Landcover
File Name: ESA_2010_ZMB.tif
Source: http://www.esa-landcover-cci.org/
Description: Land cover information from a GlobCover 2010 coverage was fused with a Landsat-derived urban/rural built area classification to construct a single land cover dataset.
Class: raster
Derived Covariates:
cls011, dst011, cls040, dst040, cls130, dst130, cls140, dst140, cls150, dst150, cls160, dst160, cls190, dst190, cls200, dst200, cls210, dst210, cls230, dst230, cls240, dst240, cls250, dst250, clsBLT, dstBLT

class       : RasterBrick 
dimensions  : 11483, 12092, 138852436, 1  (nrow, ncol, ncell, nlayers)
resolution  : 100, 100  (x, y)
extent      : -310025, 899175, -2123329, -975029  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=20 +lat_2=-23 +lat_0=0 +lon_0=25 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : D:\WorldPop\data\ZMB_2010_2000\Landcover\Derived\landcover.tif 
names       : landcover 
min values  :        11 
max values  :       210 
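
As a quick consistency check on the raster summary above, the grid dimensions can be recomputed from the extent and resolution. This is a minimal Python sketch with a hypothetical helper name; the numbers are copied from the summary.

```python
def grid_shape(xmin, xmax, ymin, ymax, res):
    """Return (nrow, ncol, ncell) implied by an extent and a square cell size."""
    ncol = round((xmax - xmin) / res)
    nrow = round((ymax - ymin) / res)
    return nrow, ncol, nrow * ncol


# Extent and resolution copied from the land cover summary above.
print(grid_shape(-310025, 899175, -2123329, -975029, 100))
# → (11483, 12092, 138852436)
```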

plot of chunk covariate_reports


Suomi NPP VIIRS-Derived 2012 Lights at Night, 15 arc-second

Folder: Lights
File Name: DEFAULT: VIIRS 2012
Source: http://ngdc.noaa.gov/eog/viirs/download_viirs_ntl.html
Description: These 'Lights at Night' data were derived from imagery collected by the Suomi National Polar-orbiting Partnership (NPP) Visible Infrared Imaging Radiometer Suite (VIIRS) sensor. Data were collected in 2012 on moonless nights. Although background noise associated with fires, gas flares, volcanoes or aurora has not been removed, they represent the best available data on night-time light production.
Class: raster
Derived Covariates: (none listed)

class       : RasterBrick 
dimensions  : 11483, 12092, 138852436, 1  (nrow, ncol, ncell, nlayers)
resolution  : 100, 100  (x, y)
extent      : -310025, 899175, -2123329, -975029  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=20 +lat_2=-23 +lat_0=0 +lon_0=25 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : D:\WorldPop\data\ZMB_2010_2000\Lights\Derived\lights.tif 
names       : lights 
min values  : -0.052 
max values  :   1377 

plot of chunk covariate_reports


WorldClim/BioClim Mean Annual Temperature 1950-2000, 30 arc-second

Folder: Temp
File Name: DEFAULT: BIO1
Source: http://www.worldclim.org/current
Description: WorldClim/BioClim 1950-2000 mean annual precipitation (BIO12) and mean annual temperature (BIO1) estimates (Hijmans et al., 2005) were downloaded, mosaicked and subset to match the extent of our land cover data for the mapping of this region.
Class: raster
Derived Covariates: (none listed)

class       : RasterBrick 
dimensions  : 11707, 12340, 144464380, 1  (nrow, ncol, ncell, nlayers)
resolution  : 100, 100  (x, y)
extent      : -321025, 912975, -2134229, -963529  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=20 +lat_2=-23 +lat_0=0 +lon_0=25 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : D:\WorldPop\data\ZMB_2010_2000\Temp\Derived\temp.tif 
names       : temp 
min values  :  118 
max values  :  269 
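
The report does not state the units of the temperature raster. Assuming it keeps the WorldClim version 1.4 convention of storing temperature as degrees Celsius multiplied by 10, the value range above can be converted as in this small Python sketch (the function name is hypothetical).

```python
def bio1_to_celsius(stored_value, scale=10):
    """Convert a stored BIO1 value to degrees Celsius, assuming a x10 scale factor."""
    return stored_value / scale


# Minimum and maximum values copied from the raster summary above.
print(bio1_to_celsius(118), bio1_to_celsius(269))
```

Under that assumption, the stored range of 118 to 269 corresponds to mean annual temperatures of roughly 11.8 to 26.9 degrees Celsius.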

plot of chunk covariate_reports


WorldClim/BioClim Mean Annual Precipitation 1950-2000, 30 arc-second

Folder: Precip
File Name: DEFAULT: BIO12
Source: http://www.worldclim.org/current
Description: WorldClim/BioClim 1950-2000 mean annual precipitation (BIO12) and mean annual temperature (BIO1) estimates (Hijmans et al., 2005) were downloaded, mosaicked and subset to match the extent of our land cover data for the mapping of this region.
Class: raster
Derived Covariates: (none listed)

class       : RasterBrick 
dimensions  : 11707, 12340, 144464380, 1  (nrow, ncol, ncell, nlayers)
resolution  : 100, 100  (x, y)
extent      : -321025, 912975, -2134229, -963529  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=20 +lat_2=-23 +lat_0=0 +lon_0=25 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : D:\WorldPop\data\ZMB_2010_2000\Precip\Derived\precip.tif 
names       : precip 
min values  :    536 
max values  :   2587 
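
The temperature and precipitation rasters cover a slightly larger extent than the land cover and lights rasters, so they would need to be cropped before being stacked cell for cell. The Python sketch below computes the row and column window of the smaller grid inside the larger one; the helper name is hypothetical and the extents are copied from the summaries in this report.

```python
def crop_window(outer, inner, res):
    """Window of `inner` inside `outer`; extents are (xmin, xmax, ymin, ymax).

    Returns (row_offset, col_offset, nrow, ncol), with rows counted from the top.
    """
    oxmin, oxmax, oymin, oymax = outer
    ixmin, ixmax, iymin, iymax = inner
    col_off = (ixmin - oxmin) / res
    row_off = (oymax - iymax) / res  # rows count down from the top edge
    if col_off != int(col_off) or row_off != int(row_off):
        raise ValueError("grids are not aligned to the same cell boundaries")
    ncol = round((ixmax - ixmin) / res)
    nrow = round((iymax - iymin) / res)
    return int(row_off), int(col_off), nrow, ncol


climate = (-321025, 912975, -2134229, -963529)    # temp and precip extent
landcover = (-310025, 899175, -2123329, -975029)  # land cover and lights extent
print(crop_window(climate, landcover, 100))
# → (115, 110, 11483, 12092)
```

The offsets are whole numbers of cells, which indicates that the two grids share the same 100 m cell boundaries and differ only in how far they extend.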

plot of chunk covariate_reports


Open Street Map (OSM) Road Network, 2016

Folder: Roads
File Name: gisosm_roads_free_1.shp
Source: http://www.openstreetmap.org/
Description: These data were downloaded as part of a per-country package of OSM data layers made available as shapefiles through the http://www.BBBike.org/community.html website.
Class: linear
Derived Covariates:
cls, dst

class       : SpatialLinesDataFrame 
nfeatures   : 53065 
extent      : -318776, 898264, -2129340, -974824  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
nvariables  : 10
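
This report does not define the derived covariates `cls` and `dst`. Assuming `cls` marks the pixels that contain the feature and `dst` holds the distance from each pixel to the nearest such pixel, a brute-force version of the distance calculation could look like the Python sketch below. The 3 x 4 mask is invented for illustration, and a production workflow would use an efficient distance transform rather than this pairwise search.

```python
import math


def distance_to_feature(mask, res=100.0):
    """Distance (in map units) from each cell centre to the nearest cell where mask is 1."""
    targets = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not targets:
        raise ValueError("mask contains no feature cells")
    return [
        [min(math.hypot(r - tr, c - tc) for tr, tc in targets) * res
         for c in range(len(row))]
        for r, row in enumerate(mask)
    ]


# Hypothetical 3 x 4 `cls` mask: 1 where a road crosses the pixel.
cls = [
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
dst = distance_to_feature(cls)
for row in dst:
    print([round(d, 1) for d in row])
```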

plot of chunk covariate_reports


Open Street Map (OSM) River Lines, 2016

Folder: Rivers
File Name: gisosm_waterways_free_1.shp
Source: http://www.openstreetmap.org/
Description: These data were downloaded as part of a per-country package of OSM data layers made available as shapefiles through the http://www.BBBike.org/community.html website.
Class: linear
Derived Covariates:
cls, dst

class       : SpatialLinesDataFrame 
nfeatures   : 2041 
extent      : -318581, 892867, -2130674, -966553  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
nvariables  : 5