Untangling urban data signatures: unsupervised machine learning methods for the detection of urban archetypes at the pedestrian scale [article]

Gareth D. Simons
2022 arXiv pre-print
Urban morphological measures applied at a high resolution of spatial analysis can yield a wealth of data describing characteristics of the urban environment in a substantial degree of detail; however, such forms of high-dimensional numeric datasets are not immediately relatable to broader constructs rooted in conventional conceptions of urbanism. Data science and machine learning (ML) methods provide an opportunity to explore such forms of complex datasets by applying unsupervised ML methods to reduce the dimensionality of the data while recovering latent themes and characteristic patterns which may resonate with urbanist discourse more generally. Dimensionality reduction and clustering methods, including Principal Component Analysis (PCA), Variational Autoencoders, and an Autoencoder-based Gaussian Mixture Model, are discussed and demonstrated for purposes of 'untangling' urban datasets, revealing themes bridging quantitative and qualitative descriptions of urbanism. The methods are applied to a dataset for Greater London consisting of network centralities, land-use accessibilities, mixed-use measures, and density measures. The measures are computed at pedestrian walking tolerances at a 20m network resolution utilising a local windowing methodology with distances computed directly over the network and with aggregations performed dynamically and with respect to the direction of approach, thus preserving the relationships between the variables and retaining contextual precision. Whereas the demonstrated methods hold tremendous potential, their power is difficult to convey or fully exploit using conventional lower-dimensional visualisation methods, thus underscoring a need for subsequent research into how such methods may be coupled to interactive visualisation tools to further elucidate the richness of the data and its potential implications.
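As a rough illustration of the simplest of the dimensionality reduction methods named above, the following sketch recovers the leading principal axis of a toy table of measures using only the Python standard library. It is not the author's implementation: the function names, the power-iteration approach, and the two-column sample data (imagined here as a centrality measure and a density measure per street-network node) are all assumptions made for the example.

```python
import math


def center(rows):
    """Subtract each column's mean so the data is centred on the origin."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of rows that are already centred."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(rows, iters=100):
    """Approximate the leading principal axis by power iteration."""
    cov = covariance(center(rows))
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all-constant data: no direction of variance
            break
        v = [x / norm for x in w]
    return v


def project(rows, axis):
    """One-dimensional score of each centred row along the given axis."""
    return [sum(a * b for a, b in zip(r, axis)) for r in center(rows)]


# Hypothetical toy data: two perfectly correlated measures per node.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
axis = first_component(samples)
scores = project(samples, axis)
```

Here the two columns collapse onto a single axis without loss, which is the sense in which correlated measures can be summarised by fewer latent components. Power iteration finds only the first component and can stall if the starting vector happens to be orthogonal to it; a full PCA would normally use an eigendecomposition or SVD from a numerical library instead.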
arXiv:2106.15363v3 fatcat:ww637kegb5ekvhmpndsu2jkyqe