Introduction
The rising prevalence of non-communicable diseases, especially in low- and middle-income countries (LMICs), highlights the need for research into their determinants. Limited data and methodological challenges hinder studies of neighbourhood health determinants in LMICs. Computer vision (CV), powered by deep learning, can detect and classify objects in images. CV therefore offers a scalable way to automate data extraction from street images. We systematically reviewed the literature on CV applications for extracting environmental characteristics from street images.
Methods
Following an adapted version of Arksey and O’Malley’s six-step review process, we searched eight databases and identified 11,221 studies. Eligible studies were published in English and used CV models to classify, detect or segment objects in street images. After title, abstract, and full-text screening, we included 112 studies (published 2020-2023) for data extraction. We conducted a narrative synthesis of the findings, supported by harvest plots.
Results
Most of the included studies (n=75) used Google or Baidu Street View images. Most studies were from East Asia (n=44) or the US and Canada (n=21). CV was used to extract data on environmental characteristics, including aspects of the built (e.g. sidewalks), transport (e.g. vehicles) and food (e.g. food stalls) environments, and neighbourhood vegetation. Segmentation was the most common CV method (n=57). Almost half of the studies reported overall CV accuracy, and fewer reported accuracy for individual classes. Most models (n=42) were pre-trained.
Conclusions
Our findings indicate that CV has extensive potential applications in geographical and related research. However, relatively few studies report class-level accuracy, which is a concern because overall accuracy can mask poor performance on individual classes.
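The gap between overall and class-level accuracy can be illustrated with a minimal sketch. The labels, counts, and function name below are hypothetical and are not drawn from any reviewed study. Per-class accuracy is taken here to mean the share of each true class that was correctly predicted.

```python
from collections import Counter


def overall_and_per_class_accuracy(y_true, y_pred):
    """Return overall accuracy and a dict of per-class accuracy.

    Per-class accuracy is the fraction of items of each true class
    that received the correct predicted label.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    overall = sum(correct.values()) / len(y_true)
    per_class = {label: correct[label] / n for label, n in totals.items()}
    return overall, per_class


# Hypothetical, imbalanced example: 90 "road" items and 10 "food_stall" items.
y_true = ["road"] * 90 + ["food_stall"] * 10
# The illustrative model labels every road correctly but finds only 2 of 10 stalls.
y_pred = ["road"] * 90 + ["road"] * 8 + ["food_stall"] * 2

overall, per_class = overall_and_per_class_accuracy(y_true, y_pred)
print(f"overall accuracy: {overall:.2f}")
for label, acc in per_class.items():
    print(f"{label}: {acc:.2f}")
```

In this invented case the overall figure is high even though the rare class is mostly missed, which is why class-level reporting matters for less frequent features such as food stalls.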
Keywords: neighbourhood, determinants, computer vision, street view, review