China Population Map Metadata Report

Prediction Weighting Layer Used in Population Redistribution

The data presented below represent the predicted number of people per ~100 m pixel, estimated using the random forest (RF) model described in Stevens, et al. (In Press). The following pages describe the RF model and its covariates, their sources, and any metadata collected for each covariate. The prediction weighting layer is used to dasymetrically redistribute the census counts, as well as counts projected to match UN-based population estimates, in the final population maps provided by AfriPop, AsiaPop and AmeriPop.
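The redistribution step can be illustrated with a minimal sketch. It assumes the usual dasymetric rule, in which each pixel receives its census unit's count in proportion to its share of the unit's summed weights, followed by an optional rescaling to an external total. The function name, the inputs and the `un_total` value are hypothetical and are not taken from the map production code.

```python
from collections import defaultdict


def redistribute(unit_counts, pixel_units, pixel_weights, un_total=None):
    """Dasymetrically spread census counts over pixels.

    unit_counts   : dict mapping census unit id -> population count
    pixel_units   : census unit id of each pixel
    pixel_weights : predicted weight (e.g. RF density) of each pixel
    un_total      : optional external total to rescale to (hypothetical)
    """
    # Sum of the weights that fall inside each census unit.
    weight_sums = defaultdict(float)
    for unit, weight in zip(pixel_units, pixel_weights):
        weight_sums[unit] += weight

    # Each pixel gets its unit's count in proportion to its weight.
    pixels = []
    for unit, weight in zip(pixel_units, pixel_weights):
        total = weight_sums[unit]
        pixels.append(unit_counts[unit] * weight / total if total > 0 else 0.0)

    # Optionally rescale so the whole map sums to the external estimate.
    if un_total is not None:
        current = sum(pixels)
        if current > 0:
            pixels = [p * un_total / current for p in pixels]
    return pixels


# Toy example: two census units covering five pixels.
unit_counts = {"A": 100.0, "B": 50.0}
pixel_units = ["A", "A", "A", "B", "B"]
pixel_weights = [1.0, 3.0, 6.0, 2.0, 2.0]

census_map = redistribute(unit_counts, pixel_units, pixel_weights)
adjusted_map = redistribute(unit_counts, pixel_units, pixel_weights, un_total=165.0)
```

In this toy case the unadjusted map preserves each unit's census count, while the adjusted map keeps the same spatial pattern and sums to the external total.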

[Figure: predicted number of people per ~100 m pixel (plot chunk predict_density)]

China Census Data and Observed Population Density

These data are the population density values used to fit the RF model that produces the prediction weighting layer shown above. Values represent population density in people per hectare, calculated from the population count within each census unit. They serve as the dependent variable during model estimation.
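As a small worked example of the density calculation, the sketch below converts a unit's population count and area into people per hectare. The unit records are invented for illustration, and the assumption that areas are available in square metres is ours, not the report's.

```python
SQ_M_PER_HECTARE = 10_000.0


def density_per_hectare(population, area_sq_m):
    """Return people per hectare for one census unit."""
    if area_sq_m <= 0:
        raise ValueError("census unit area must be positive")
    return population / (area_sq_m / SQ_M_PER_HECTARE)


# Hypothetical census units: (ADMINID, ADMINPOP, area in square metres).
units = [
    (1, 250_000, 50_000_000.0),     # 5,000 ha
    (2, 12_000, 1_200_000_000.0),   # 120,000 ha
]

densities = {
    admin_id: density_per_hectare(pop, area) for admin_id, pop, area in units
}
```

Because a 100 m by 100 m pixel covers one hectare, a density in people per hectare can be read directly as people per pixel on a 100 m grid.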

China Census Data, 2000

Folder: Census
File Name: CHN_2000_matched_to_2922units_wgs84.shp
Source: China CDC, acquired by Gaughan, et al. for use in AsiaPop data products.
Description: These census data are 2000 China Country Population Census Data. Required fields for map production are ADMINID and ADMINPOP.
Class: polygon
Derived Covariates:
area, buff, zones

class       : SpatialPolygonsDataFrame 
nfeatures   : 2922 
extent      : -2577308, 2096165, 2366784, 6387872  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
nvariables  : 7

[Figure: China census data, 2000 (plot chunk census_data)]

Random Forest Model and Diagnostics

This output and these figures outline the estimated RF model that is used to predict the population density weighting layer. The model is fitted to the population density values from the preceding census data, using covariates aggregated from the ancillary data sources summarized after the model diagnostics.

 randomForest(x = x_data, y = y_data, ntree = popfit$ntree, mtry = popfit$mtry, nodesize = length(y_data)/1000, importance = TRUE, proximity = TRUE)
               Type of random forest: regression
                     Number of trees: 500
No. of variables tried at each split: 2

          Mean of squared residuals: 0.3
                    % Var explained: 88
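The two summary statistics above are related. The sketch below assumes the common definition in which the percentage of variance explained is 100 * (1 - MSE / Var(y)), with Var(y) computed using an n denominator; in a random forest summary the residuals would normally come from out-of-bag predictions. The observed and predicted values are made up for illustration.

```python
from statistics import fmean


def pct_var_explained(observed, predicted):
    """Return (mean squared residual, % variance explained)."""
    residuals_sq = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    mse = fmean(residuals_sq)
    mean_obs = fmean(observed)
    # Variance with an n denominator (population variance).
    variance = fmean([(o - mean_obs) ** 2 for o in observed])
    return mse, 100.0 * (1.0 - mse / variance)


# Made-up observed densities and model predictions.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.0, 2.5, 4.0, 5.5]
mse, pct = pct_var_explained(observed, predicted)

# Under the same definition, the rounded figures reported above
# (MSE 0.3, 88 % explained) imply a response variance of roughly 2.5.
implied_variance = 0.3 / (1.0 - 0.88)
```

Since both reported figures are rounded, the implied variance is only a rough consistency check.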

[Figures: three random forest diagnostic plots (plot chunk random_forest)]

Covariate Metadata

Urban Extents for 2000

Folder: Landcover
File Name: urban2000_rc_w210_final_gaul3.tif
Source: Settlement Extents, Citation: Wang L, Li C C, Ying Q, et al. China's urban expansion from 1990 to 2010 determined with satellite remote sensing. Chin Sci Bull, 2012, 57:2802–2812
Description: These data were derived from Landsat TM/ETM+ imagery for the base years 1990, 2000, and 2010 to map all urban built-up areas in China, based on 663 cities.
Class: raster
Derived Covariates:
cls190, dst190, cls210, dst210

class       : RasterBrick 
dimensions  : 40412, 46936, 1896777632, 1  (nrow, ncol, ncell, nlayers)
resolution  : 100, 100  (x, y)
extent      : -2587372, 2106228, 2356759, 6397959  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=lcc +lat_1=30 +lat_2=62 +lat_0=0 +lon_0=105 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : D:\Research\Population\Data\RF\data\CHN_2000_gaul3\Landcover\Derived\landcover.tif 
names       : landcover 
min values  :         0 
max values  :       210 
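The raster summary above is internally consistent: the row, column and cell counts follow from the extent and the resolution. A minimal check, using only the values printed in the summary (the helper name `grid_shape` is ours):

```python
def grid_shape(xmin, xmax, ymin, ymax, xres, yres):
    """Derive (nrow, ncol, ncell) from a raster extent and cell size."""
    ncol = round((xmax - xmin) / xres)
    nrow = round((ymax - ymin) / yres)
    return nrow, ncol, nrow * ncol


# Extent and resolution as reported for the land cover raster.
nrow, ncol, ncell = grid_shape(-2587372, 2106228, 2356759, 6397959, 100, 100)
print(nrow, ncol, ncell)  # prints: 40412 46936 1896777632
```

The same check can be applied to the other covariate rasters to confirm whether they share one grid.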

[Figure: urban extents covariate (plot chunk covariate_reports)]

Version 4 DMSP-OLS Nighttime Lights Time Series

Folder: Lights
File Name: CHN_F14F152000.v4b_web.stable_lights.avg_vis_INT.tif
Description: The files are cloud-free composites made using all the available archived DMSP-OLS smooth resolution data for calendar years. The products are 30 arc second grids, spanning -180 to 180 degrees longitude and -65 to 75 degrees latitude.
Class: raster
Derived Covariates:

class       : RasterBrick 
dimensions  : 40413, 46937, 1896864981, 1  (nrow, ncol, ncell, nlayers)
resolution  : 100, 100  (x, y)
extent      : -2587472, 2106228, 2356659, 6397959  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=lcc +lat_1=30 +lat_2=62 +lat_0=0 +lon_0=105 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : D:\Research\Population\Data\RF\data\CHN_2000_gaul3\Lights\Derived\lights.tif 
names       : lights 
min values  :      0 
max values  :     63 

[Figure: nighttime lights covariate (plot chunk covariate_reports)]

Elevation and Derived Slope, 3 arc-second

Folder: Elevation
File Name: DEFAULT: Void-Filled DEM.gdb
Source: HydroSHEDS Void-Filled DEM (Lehner, et al., 2006)
Description: The HydroSHEDS data are the result of an effort to provide a globally consistent dataset based on NASA's Shuttle Radar Topography Mission (SRTM) data, which have been processed, void-filled and corrected for use at large scales.
Class: raster
Derived Covariates:
, slope,

class       : RasterBrick 
dimensions  : 40413, 46937, 1896864981, 1  (nrow, ncol, ncell, nlayers)
resolution  : 100, 100  (x, y)
extent      : -2587472, 2106228, 2356659, 6397959  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=lcc +lat_1=30 +lat_2=62 +lat_0=0 +lon_0=105 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : D:\Research\Population\Data\RF\data\CHN_2000_gaul3\Elevation\Derived\elevation.tif 
names       : elevation 
min values  :      -272 
max values  :      8618 
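The report does not state how slope was derived from the elevation raster. As an illustration only, the sketch below computes slope in degrees with central differences on a small grid; this is a simplified stand-in rather than the method used for the covariate, and the function name and the toy grid are hypothetical.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope in degrees for the interior cells of a 2-D elevation grid.

    dem is a list of equal-length rows; edge cells are left as None
    because central differences need a neighbour on both sides.
    """
    nrow, ncol = len(dem), len(dem[0])
    out = [[None] * ncol for _ in range(nrow)]
    for r in range(1, nrow - 1):
        for c in range(1, ncol - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            gradient = math.hypot(dz_dx, dz_dy)
            out[r][c] = math.degrees(math.atan(gradient))
    return out


# Toy 3 x 3 grid rising 100 m per 100 m cell from west to east.
dem = [[100.0 * c for c in range(3)] for _ in range(3)]
slope = slope_degrees(dem, cell_size=100.0)
```

On this toy grid the only interior cell has a rise equal to its run, which corresponds to a slope of 45 degrees.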

[Figure: elevation covariate (plot chunk covariate_reports)]

Rivers and Streams (OSM), 2014

Folder: Rivers
File Name: rivers.shp
Source: OpenStreetMap, downloaded 2013-09-16
Description: These data were downloaded as part of a per-country package of data layers made available as shapefiles through the website, extracted from the OpenStreetMap (OSM) database.
Class: linear
Derived Covariates:
cls, dst

class       : SpatialLinesDataFrame 
nfeatures   : 33070 
extent      : -2586582, 2104529, 2379287, 6394013  (xmin, xmax, ymin, ymax)
coord. ref. : NA 
nvariables  : 4