Having accurate building information is paramount for a plethora of applications, including humanitarian efforts, city planning, scientific studies, and navigation systems. While volunteered geographic information from sources such as OpenStreetMap (OSM) has good building geometry coverage, descriptive attributes such as the type of a building are sparse. To fill this gap, this study proposes a supervised learning-based approach to provide meaningful, semantic information for OSM data without manual intervention. We present a basic demonstration of our approach that classifies buildings into either residential or non-residential types for three study areas: Fairfax County in Virginia (VA), Mecklenburg County in North Carolina (NC), and the City of Boulder in Colorado (CO). The model leverages (i) available OSM tags capturing non-spatial attributes and (ii) geometric and topological properties of the building footprints, including adjacent types of roads, proximity to parking lots, and building size. The model is trained and tested using ground truth data available for the three study areas. The results show that our approach achieves high accuracy in predicting building types for the selected areas. Additionally, a trained model is transferable with high accuracy to other regions where ground truth data is unavailable. The OSM and data science community are invited to build upon our approach to further enrich the volunteered geographic information in an automated manner.

For many locations, OSM geometries delineating streets, natural features, and building footprints are highly complete and accurate, often matching or exceeding traditional data sources such as the Central Intelligence Agency (CIA) World Factbook and United States Census Topologically Integrated Geographic Encoding and Referencing (TIGER)/Line data [9, 10, 11, 12]. However, even in the data-rich locations, the semantic information that records the type and function of these features is very sparse, such that the vast majority of mapped features have few or no descriptive attributes. To illustrate this, Table 1 compares the number of residential and non-residential buildings in OSM (both the total number and the number that are correctly classified) with the ground truth data. OSM correctly labels 12.84% of residential and 19.26% of non-residential buildings for Fairfax County, 9.33% of residential and 10.48% of non-residential buildings for Mecklenburg County, and 67.75% of residential and 42.23% of non-residential buildings for the City of Boulder. Note that the total number of labeled buildings and the number of correctly labeled buildings for each type are almost the same. Thus, despite the lack of completeness in building type information, the number of misclassified buildings is less than 1%.

Figure 1 further illustrates this by mapping the building footprints in Fairfax County and color coding their accuracy when compared with ground truth data. We observe that in most of the cases where OSM building type information is available, it is correct. However, for the vast majority of buildings, building type information is unknown or unclear. The incomplete nature of the attribute data is a shortcoming that limits the usefulness of OSM data. Therefore, this study proposes a supervised learning approach to add meaningful, semantic information to OSM data without manual intervention. We present a basic demonstration of this approach to classify OSM building footprints by their type as either residential or non-residential. This particular semantic information is of limited availability at the building footprint level in both OSM and official datasets across the United States (US) and globally.

First, we use existing high-quality data available through OSM to derive geometric attributes for each building footprint (e.g., area, distance to roads, distance to parking lots, underlying land use) as well as available descriptive attributes in three study areas in the US: Fairfax County in Virginia, Mecklenburg County in North Carolina, and the City of Boulder in Colorado. Our choice of locations was dictated by the availability of ground truth data and by the aim of covering a mix of urban, suburban, and rural areas. Next, based on existing and newly derived building footprint attributes, a set of models is trained to classify the building type using ground truth data obtained from official sources for each study area. Upon comparing to ground truth testing data, we show that our learned models yield high accuracy in all three locations. The learned models from each study area are then transferred to classify building types in alternative study areas. As we will show, this approach again yields high accuracy.
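The workflow described in this section (derive geometric attributes per footprint, train on one study area with ground truth labels, then apply the model to another area) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the text here does not name the learning algorithm, so a simple nearest-centroid classifier stands in for it, only two attributes are used (footprint area and distance to the nearest parking lot), and the coordinates are made-up toy values in metres, not data from the study.

```python
import math

def rect(x, y, w, h):
    """Hypothetical helper: a rectangular footprint as a ring of (x, y) vertices."""
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]

def footprint_area(ring):
    """Planar polygon area via the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def vertex_centre(ring):
    """Average of the vertices; exact for rectangles, approximate otherwise."""
    xs, ys = zip(*ring)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def features(ring, parking_lots):
    """Two illustrative attributes: area and distance to the nearest parking lot."""
    centre = vertex_centre(ring)
    nearest = min(math.dist(centre, lot) for lot in parking_lots)
    return [footprint_area(ring), nearest]

class NearestCentroid:
    """Stand-in classifier (an assumption, not the model used in the study)."""

    def fit(self, X, y):
        n, d = len(X), len(X[0])
        # Standardise each feature with the training mean and standard deviation.
        self.mean = [sum(row[j] for row in X) / n for j in range(d)]
        self.std = [
            math.sqrt(sum((row[j] - self.mean[j]) ** 2 for row in X) / n) or 1.0
            for j in range(d)
        ]
        scaled = [self._scale(row) for row in X]
        # One centroid per class in the standardised feature space.
        self.centroids = {}
        for label in set(y):
            rows = [s for s, lab in zip(scaled, y) if lab == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def _scale(self, row):
        return [(v - m) / s for v, m, s in zip(row, self.mean, self.std)]

    def predict(self, X):
        return [
            min(self.centroids,
                key=lambda lab: math.dist(self._scale(row), self.centroids[lab]))
            for row in X
        ]

# Toy "study area A": labelled footprints and one parking lot (made-up values).
parking_a = [(0.0, 0.0)]
train_rings = [
    rect(200, 200, 10, 12), rect(300, 100, 8, 15), rect(150, 250, 12, 12),
    rect(10, 10, 40, 60), rect(20, 5, 50, 50), rect(5, 30, 30, 80),
]
train_labels = ["residential"] * 3 + ["non-residential"] * 3
model = NearestCentroid().fit(
    [features(r, parking_a) for r in train_rings], train_labels
)

# Toy "study area B": the model trained on A is applied without retraining.
parking_b = [(1000.0, 1000.0)]
test_rings = [rect(1200, 1250, 9, 11), rect(1010, 1020, 45, 55)]
predictions = model.predict([features(r, parking_b) for r in test_rings])
print(predictions)
```

Because both attributes are relative to each footprint rather than tied to absolute coordinates, the model fitted on one toy area can be applied to the other unchanged, which mirrors the transfer step described above. A real pipeline would use projected OSM geometries, more attributes (road types, land use, OSM tags), and a held-out test set.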