Applying Machine Learning and Geolocation Techniques to Social Media Data (Twitter) to Develop a Resource for Urban Planning
Main Authors: | , , , , , |
---|---|
Language: | English |
Published: | World Bank, Washington, DC, 2020 |
Subjects: | |
Online Access: | http://documents.worldbank.org/curated/en/407261607111342557/Applying-Machine-Learning-and-Geolocation-Techniques-to-Social-Media-Data-Twitter-to-Develop-a-Resource-for-Urban-Planning http://hdl.handle.net/10986/34910 |
Summary: | With all the recent attention focused on big data, it is easy to overlook that basic vital statistics remain difficult to obtain in most of the world. This project set out to test whether an openly available dataset (Twitter) could be transformed into a resource for urban planning and development. The hypothesis is tested by creating road traffic crash location data, which are scarce in most resource-poor environments but essential for addressing the number one cause of mortality for children over age five and young adults. The research project scraped 874,588 traffic-related tweets in Nairobi, Kenya, applied a machine learning model to capture the occurrence of a crash, and developed an improved geoparsing algorithm to identify its location. The project geolocated 32,991 crash reports on Twitter for 2012-2020 and clustered them into 22,872 unique crashes to produce one of the first crash maps for Nairobi. A motorcycle delivery service was dispatched in real time to verify a subset of crashes, showing 92 percent accuracy. Using a spatial clustering algorithm, the project identified portions of the road network (less than 1 percent) where 50 percent of the geolocated crashes occurred. Even with limitations in the representativeness of the data, the results can provide urban planners with useful information to target road safety improvements where resources are limited. |
---|
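
The summary mentions clustering geolocated crash reports into unique crashes but does not describe the method. As a rough illustration only, the sketch below groups reports that fall close together in both space and time, using a simple greedy pass. The `Report` record layout, the function names, and the 500 m and 4 hour thresholds are all assumptions made for this example; they are not taken from the publication and may differ substantially from the authors' algorithm.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Report:
    """One geolocated crash report (hypothetical record layout)."""
    timestamp: float  # seconds since the Unix epoch
    lat: float        # latitude in decimal degrees
    lon: float        # longitude in decimal degrees


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    earth_radius_m = 6_371_000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(a))


def cluster_reports(reports, max_dist_m=500.0, max_gap_s=4 * 3600):
    """Greedily group reports into clusters, one cluster per presumed crash.

    A report joins the first existing cluster whose earliest report is
    within max_dist_m metres and max_gap_s seconds; otherwise it starts
    a new cluster. Both thresholds are illustrative assumptions.
    """
    clusters = []
    for rep in sorted(reports, key=lambda r: r.timestamp):
        for cluster in clusters:
            first = cluster[0]  # earliest report in this cluster
            close_in_time = rep.timestamp - first.timestamp <= max_gap_s
            close_in_space = haversine_m(
                first.lat, first.lon, rep.lat, rep.lon) <= max_dist_m
            if close_in_time and close_in_space:
                cluster.append(rep)
                break
        else:
            # No existing cluster matched, so treat this as a new crash.
            clusters.append([rep])
    return clusters


# Made-up sample points around Nairobi, for illustration only.
sample = [
    Report(0, -1.2921, 36.8219),
    Report(1800, -1.2925, 36.8222),   # same area, 30 minutes later
    Report(3600, -1.3200, 36.9000),   # several kilometres away
    Report(86400, -1.2921, 36.8219),  # same spot, one day later
]
unique_crashes = cluster_reports(sample)
print(len(unique_crashes), "unique crashes from", len(sample), "reports")
```

A greedy pass like this is order-dependent and compares each report only against the earliest report in a cluster, so it is best read as a starting point; a density-based method would be a natural alternative where reports are dense.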