Is Predicted Data a Viable Alternative to Real Data?
Main Authors: | , |
---|---|
Language: | English |
Published: | World Bank, Washington, DC, 2016 |
Subjects: | |
Online Access: | http://documents.worldbank.org/curated/en/2016/09/26822026/predicted-data-viable-alternative-real-data http://hdl.handle.net/10986/25156 |
Summary: | It is costly to collect the household- and individual-level data that underlie official estimates of poverty and health. For this reason, developing countries often do not have the budget to update their estimates of poverty and health regularly, even though these estimates are most needed there. One way to reduce the financial burden is to substitute some of the real data with predicted data. An approach referred to as double sampling collects the expensive outcome variable for a sub-sample only, while collecting the covariates used for prediction for the full sample. The objective of this study is to determine whether this would indeed yield meaningful reductions in financial costs while preserving statistical precision. The study does this using analytical calculations that cover a wide range of parameter values plausible in real applications. The benefits of using double sampling are found to be modest. There are circumstances in which the gains can be more substantial, but the study conjectures that these are the exceptions rather than the rule. The recommendation is to rely on real data whenever there is a need for new data, and to use the prediction estimator to leverage existing data. |
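
The double sampling idea described in the summary can be illustrated with a small simulation. The sketch below is not the study's analytical calculation. It assumes a textbook regression estimator with a single covariate, and the function names, the linear data-generating model, and all sample sizes are hypothetical choices made only for illustration. It compares the spread of the double-sampling estimate of a mean with that of the plain sub-sample mean.

```python
import random
import statistics


def double_sampling_mean(x_full, x_sub, y_sub):
    """Regression estimator of the mean of y under double sampling.

    x_full: covariate observed for the full sample (cheap to collect)
    x_sub, y_sub: covariate and outcome for the sub-sample (y is expensive)
    """
    x_bar_sub = statistics.fmean(x_sub)
    y_bar_sub = statistics.fmean(y_sub)
    # OLS slope of y on x, fitted on the sub-sample only.
    sxy = sum((x - x_bar_sub) * (y - y_bar_sub) for x, y in zip(x_sub, y_sub))
    sxx = sum((x - x_bar_sub) ** 2 for x in x_sub)
    slope = sxy / sxx
    # Adjust the sub-sample mean by the gap between the covariate means.
    return y_bar_sub + slope * (statistics.fmean(x_full) - x_bar_sub)


def simulate(n_full=1000, n_sub=200, reps=500, seed=0):
    """Return (sd of double-sampling estimate, sd of sub-sample mean).

    Assumed model (illustrative only): y = 2 + 1.5 * x + noise.
    """
    rng = random.Random(seed)
    ds_estimates, simple_estimates = [], []
    for _ in range(reps):
        x_full = [rng.gauss(0, 1) for _ in range(n_full)]
        y_full = [2.0 + 1.5 * x + rng.gauss(0, 1) for x in x_full]
        # The first n_sub units of an i.i.d. draw act as a random sub-sample.
        x_sub, y_sub = x_full[:n_sub], y_full[:n_sub]
        ds_estimates.append(double_sampling_mean(x_full, x_sub, y_sub))
        simple_estimates.append(statistics.fmean(y_sub))
    return statistics.pstdev(ds_estimates), statistics.pstdev(simple_estimates)


if __name__ == "__main__":
    sd_double, sd_simple = simulate()
    print(f"sd, double sampling: {sd_double:.4f}")
    print(f"sd, sub-sample mean: {sd_simple:.4f}")
```

How much precision is gained depends on how well the covariates predict the outcome and on the relative cost of collecting each, which is the trade-off the study evaluates analytically.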