{
  "id": "https://gabrielmahler.org/walkability/compsci/2025/06/02/background",
  "title": "Walkability Chapter 2: Background",
  "link": "https://gabrielmahler.org/walkability/compsci/2025/06/02/background.html",
  "updated": "2025-06-02T10:40:11",
  "published": "2025-06-02T10:40:11",
  "summary": "Background",
  "content": "<h1>Background</h1>\n\n<p>In the context of this work, the issues of pedestrian path-finding\nintersect both theoretical urbanist ideas (particularly the concept of\nwalkability) and classical topics in computer science (particularly\ngraph search algorithms). However, to achieve our objectives, we also\ndraw on more recent computer science topics, specifically in\nnatural language processing and transformer-based encoders.</p>\n\n<h2>The Issue of “Walkability”</h2>\n\n<p>To construct a better set of requirements than those traditionally used\nin path-finding algorithms, we turn to the urbanist concept of\nwalkability. Unlike the exact yet simplistic measures of routing\nefficiency, walkability considers a wide range of factors more closely aligned\nwith realistic human preferences for walking, and can, therefore,\nprovide a valuable perspective on finding paths walkers would actually\nprefer to take.</p>\n\n<h3>Urbanist Overview</h3>\n\n<p>In urbanist literature, the concept of walkability frequently\nencompasses a range of physical and social characteristics that\ncollectively determine how conducive a neighborhood is to pedestrian\nactivity.</p>\n\n<p><em>Alfonzo</em> proposes a multi-level model to hierarchically structure the\nfactors that contribute to walkability (Alfonzo 2005). They use\nindividual-level characteristics (such as income and car ownership),\nregional-level attributes (that reflect broader geographic variation),\nand physical environment characteristics (including safety, traffic\nconditions, sidewalk availability, and the directness of pedestrian\nroutes). 
They further distill the factors of the individual-level\ncharacteristics and physical environments in an analysis of the human\nneeds. The resulting model is called “the five levels of walking needs”,\nand includes, in order: “feasibility” (reflecting, for instance, the\nmobility of individuals and environment), “accessibility” (referring to\nfactors such as the presence of pedestrian infrastructure or the\nproximity to points of interest), “safety” (determined by, for example,\nland use or the fear of crime), “comfort” (for instance, the\nrelationship between pedestrian and motorized traffic, or the presence\nof “street furniture”), and “pleasurability” (invoked by factors such as\naesthetic appeal or presence of public spaces).</p>\n\n<p>Nevertheless, a number of other publications propose more\nquantifiable approaches to measuring walkability. <em>Grasser et al.</em>\nsuggest using data on gross population, employment, and housing\ndensities alongside land-use diversity indicators (such as the entropy\nindex) and estimated “street connectivity” based on intersection\ndensity (Grasser et al. 2013). In a parallel effort, <em>Frank et al.</em>\nintroduce a composite index combining net residential density, street\nintersection density, and retail floor area ratio to capture both\ndestination presence and ease of access (Frank et al. 2006). Broadening\nthe scope, <em>Shields et al.</em> catalog objective factors (including\ndistance to key destinations, sidewalk continuity, road-network\ncharacteristics, intersection safety features, vehicular traffic volume\nand speed, pedestrian-support amenities, and various density measures),\nwhile also emphasizing subjective qualities such as aesthetics, comfort\n(such as lighting, shade, noise levels), personal security,\nattractiveness, and crowding (Shields et al. 2023). 
Finally, <em>Frank et\nal.</em> later propose calculating z-scores for net residential density,\nretail floor area ratio, intersection density, and a five-category\nland-use mix entropy score, summing these standardized values to produce\na regionally normalized composite index (Frank et al. 2010).</p>\n\n<h3>Summary and Criticisms of Walkability Literature</h3>\n\n<p>As such, the methods for the evaluation of walkability in urbanist\nliterature utilize a large variety of approaches and tools. Walkability\nis frequently calculated based on both highly granular metrics (such as\nintersection density) and small, local elements (such as street\nfurniture). Nevertheless, there are clear limitations to these\napproaches. For instance: while the approaches that aim to express\nwalkability in numeric values are only concerned with quantifiable\nfactors, it is only the more general, high-level frameworks (such as in\n<em>Alfonzo</em> (Alfonzo 2005)) that consider more subjective factors. Highly\nimportant influences of the physical environment, particularly in the\ndomains of “comfort” and “pleasurability”, are often omitted, or\nexpected to be correlated with the exact, quantifiable metrics.\nConsidering the spatial diversity of cities (particularly if we’re\ncomparing cities from different countries or regions), one may conclude\nthat these general-purpose approaches can easily lead to inaccurate (or\neven biased) conclusions about what can be considered well or poorly\nwalkable.</p>\n\n<h3>Walkability-focused Services</h3>\n\n<p>The potential shortcomings of the urbanist research seem to have been\nmirrored in both public and proprietary projects and services.\nHigh-visibility projects such as the National Walkability Index (Thomas,\nZeller, and Reyes 2021) or WalkScore (Walk Score 2025) (both of which\nare limited to the United States) have been criticized for their\npositive emphasis on car-centric areas and proximity to points of\ninterest, and for neglecting more 
realistic pedestrian preferences,\nultimately leading to inaccurate and misleading conclusions (Steuteville\n2019). The NWI, curated by the U.S. Environmental Protection Agency,\nfocuses on measures that can be consistently applied across the country,\nusing data from the Smart Location Database (Ramsey and Bell 2014).\nThese measures include intersection density, proximity to transit stops,\nand the diversity of land use. The underlying assumption is that each of\nthese factors is positively correlated with the likelihood of walking\ntrips, making them key indicators of walkability at the block group\nlevel. A notable alternative to NWI is WalkScore - originally an\nopen-source project aimed at promoting walkable neighborhoods. However,\nWalkScore was later privately acquired, and currently releases\nwalkability scores calculated only through proprietary\nmethodologies (Walk Score 2025; Steuteville 2019).</p>\n\n<h3>Alternative: <em>n</em>-Minute Cities</h3>\n\n<p>Reflecting the limited supply of reliable walkability assessment tools\nand the demanding nature of the problem (requiring plentiful data and\nintricate technological solutions), alternative approaches, such as the\nconcept of “<em>n</em>-minute cities”, have emerged. Instead of measuring\nwalkability on the basis of actual physical environments, <em>n</em>-minute\ncities infer their walkability indices based on the proximity to points\nof interest. Exclusively aimed at urban environments, these projects\ngenerally focus on determining how long it would take to walk from a\ncertain location to places essential for daily life (such as stores,\nschools, hospitals, etc.) - hence the <em>n</em>-minutes.</p>\n\n<p>There are several projects, built by geographers and urbanists, that\nrely on this concept. 
A notable recent example is the project Close (Henry\nSpatial Analysis, LLC 2025; Bliss 2024), which combines information from\npublic geospatial datasets (such as the Overture Maps (Overture Maps\nFoundation 2025)) and custom filtering logic. In Close, geospatial data\npoints undergo a vetting process to refine and categorize destinations\nmeaningfully. For instance, when identifying supermarkets, Close uses\nqualitative criteria (such as the variety of aisles) to\ndistinguish full-service grocery stores from smaller convenience stores\nor bodegas. This labor-intensive process is partly automated using\nbuilding size, business names, and other available metadata, but in the\nfifty largest U.S. cities, the authors of Close had to carry out extensive\nmanual review to improve accuracy. Furthermore, Close also\nattempts to alleviate issues induced by reliance on manual maintenance\nwith an iterative refinement implemented through a crowd-sourcing\nfeedback mechanism.</p>\n\n<h2>Path-Finding Algorithms</h2>\n\n<p>The aim of path-finding is to identify a sequence of steps that would\ndefine a route between two points with the aim of optimizing some\npredefined objective. Path-finding problems are typically represented\nwithin graphs, and their applications are widespread - from\ntransportation to robotics or video games. Nevertheless, the core of\nstate-of-the-art path-finding frameworks has been consistent for a very long\ntime, and very frequently revolves around the A* (Hart, Nilsson, and\nRaphael 1968) and the foundational Dijkstra (Dijkstra 2022) algorithms\n(see\n§<a href=\"https://gabrielmahler.org/walkability/compsci/2025/06/02/background.html#section:relatedwork-transportationrouting\">3.1</a>).</p>\n\n<h3>A* Search</h3>\n\n<p>A* optimizes its search efficiency by using a heuristically-guided\nsearch (Hart, Nilsson, and Raphael 1968). 
It combines Dijkstra-style cost-ordered expansion\nwith a heuristic estimate of the remaining cost to reach the\ntarget node. At each step, A* selects the node with the lowest cost\n$f(n) = g(n) + h(n)$, where $g(n)$ represents the exact cost from the\nstart node to the considered node, and $h(n)$ the heuristic estimate of\nthe cost from the currently iterated node to the target destination. In\norder for A* to find an optimal path, $h(n)$ must be admissible, which\nmeans that it must never overestimate the actual cost to reach the\ntarget node. If the heuristic is additionally consistent, that is, if the inequality\n$h(n)\\leq c(n,n') + h(n')$ (where $c(n,n')$ is the transition cost from\nnode $n$ to node $n'$) holds for every edge, then the search guarantees not only\noptimality but also efficiency by never having to re-expand a node.\nFurthermore, while frequently $h(n)$ is defined as a simple Euclidean or\nManhattan distance, specific applications often benefit from more\nsophisticated strategies.</p>\n\n<h3>Search Optimizations</h3>\n\n<p>Search algorithms are frequently optimized with bidirectional search,\nwhich performs two simultaneous searches from both the start and the\ntarget until they meet. This reduces the number of visited nodes but\ngenerally requires more complex logic and balanced\nheuristics (Sturtevant and Felner 2018). Another approach, applicable in\nstatic graphs, is contraction hierarchies. This involves gradually\nremoving less important nodes and replacing them with shortcut edges\nthat preserve shortest paths. The resulting hierarchy allows for fast\nbidirectional search by restricting movement to higher-level nodes,\ngreatly speeding up queries after preprocessing, which is typically\nworthwhile for large graphs (Geisberger et al. 
2008).</p>\n\n<h2>Sentence Transformers</h2>\n\n<p>Sentence embedders (such as the foundational Sentence-BERT (Reimers and\nGurevych 2019)) are neural networks based on the transformer\narchitecture, designed to capture the semantic contents of textual data\nof arbitrary length (but typically standalone sentences) into vectorized\nrepresentations of predetermined sizes. These models are frequently\nprepared by fine-tuning pretrained transformers on the objective of\nprojecting semantically similar sentences close together in the\nresulting embedding space. This property can then be used to easily\ncompare different data points, using measures like cosine similarity.\nUnlike some pre-existing approaches for measuring similarity (such as by\nrelying on the original BERT network), sentence transformers do not\ncompute pair-wise comparisons, but can encode inputs independently.\nTherefore, comparing similarities in large sets becomes much more\ncomputationally efficient. In order to output embeddings of fixed sizes,\nsentence transformers use various techniques, such as pooling over the\ntransformer’s final layer. Under both supervised and unsupervised\nbenchmarks in clustering, similarity, and retrieval tasks, sentence\ntransformers (such as Sentence-BERT) consistently outperformed existing\nstrategies (Reimers and Gurevych 2019).</p>\n\n<h2>Low-Rank Adaptation of Language Models</h2>\n\n<p>Low-Rank Adaptation (or LoRA) in language models is a technique that can\nbe leveraged to perform light-weight fine-tuning of pre-trained language\nmodels. LoRA-based fine-tuning works by freezing the original model’s\nweights and injecting small trainable low-rank decomposition matrices\ninto each of the transformer’s layers. Here, rank $r$ denotes the\ndimensionality of the low‑rank decomposition of a weight update. 
For a\nfrozen pre‑trained weight matrix $W_0\\in \\mathbb{R}^{d\\times k}$, the\nupdate is written as\n$\\Delta W = B A,\\; B \\in \\mathbb{R}^{d \\times r},\\; A \\in \\mathbb{R}^{r \\times k}$.\nThen, “low‑rank” implies choosing $r\\ll\\min(d,k)$ so that $\\Delta W$\nlies in a small $r$‑dimensional subspace, hence dramatically reducing\nthe number of trainable parameters. Before any training, the\ndecomposition matrices are initialized so that their product equals\nzero, and therefore, the model’s initial behavior matches the pretrained\nbaseline. During training, the optimization is balanced by a scaling\nfactor, making sure most hyperparameters do not require retuning with\nvarying ranks. The underlying rationale of this approach is based on the\nobservation that during task-specific adaptation of transformers, the\nchange in the model’s weights lies in a much lower-dimensional subspace\nthan the entire parameter space (Hu et al. 2022).</p>\n\n<p>Furthermore, based on analyses published by <em>Hu et al.</em> (Hu et al.\n2022), effective weight updates in transformers have very low intrinsic\nranks, and, in many cases, minimal ranks are sufficient to capture\nadaptations necessary for downstream tasks. Based on similarity\nmeasurements between adaptations trained from different random initializations and with different\nranks, they conclude that the most important parameter updates lie in a\nvery small subspace. Low-rank updates also tend to highlight features\nalready present in the pre-trained network, rather than introduce new\n“concepts” into the model. Therefore, LoRA can reduce the number of\ntrainable weights by orders of magnitude when applied to large semantic\nmodels and substantially lower the computational burden relative to full\nmodel fine-tuning. 
Furthermore, the injected low-rank matrices can be\nmerged with the transformer’s frozen weights before inference, and\nthus incur no additional latency compared to the original\nvanilla transformer.</p>\n\n<p>Therefore, considering LoRA’s demonstrated potential in the context of\ntext-based transformers to match or exceed fully fine-tuned networks,\nLoRA presents a viable strategy for customization and adaptation of\ntransformer-based models while alleviating the computational burden\nassociated with full-model fine-tuning.</p>\n\n<h2>Contrastive Learning</h2>\n\n<p>Contrastive learning is a machine learning technique applicable in both\nsupervised and unsupervised settings. Contrastive learning aims to\nleverage known relationships between training data points to learn how\nto project data into an embedding space such that points of the same or\nsimilar samples appear close together, whereas points from different\nsamples are spread apart (Weng 2021). This is frequently accomplished\nvia specialized contrastive loss functions, such as the contrastive\nloss (F. Wang and Liu 2021), the triplet loss (Tripathi and King 2024),\nor InfoNCE (Rusak et al. 2024). Contrastive learning has enjoyed much\npopularity due to (amongst other things) its ability to train under a\nself-supervised objective and its versatility across various domains,\nincluding multi-modal machine learning (Weng 2021).</p>\n\n<p>Relevant for the context of this work is the triplet loss. The triplet\nloss paradigm uses three examples at a time: an “anchor” example, a\n“positive” example of the same or similar sample as the anchor, and a\n“negative” example of a sample different from the anchor. The trained\nmodel is then taught to effectively pull the anchor closer to the\npositive example in the representation space and push it away from the\nnegative example. 
In this way, the model is prompted to represent\ncontrasting samples in different parts of the embedding\nspace (Weinberger and Saul 2009; Khosla et al. 2020; Tripathi and King\n2024).</p>\n\n<h2>References</h2>\n\n<ul>\n <li>Alfonzo, Mariela A. (2005). <em>To Walk or Not to Walk? The Hierarchy of Walking Needs</em>. <em>Environment and Behavior</em>, 37(6), 808–836.</li>\n <li>Bliss, Laura. (2024). <em>How Walkable Is Your Neighborhood? A New Map Tool Offers an Answer – Bloomberg</em>.\n<a href=\"https://www.bloomberg.com/news/newsletters/2024-09-11/how-walkable-is-your-neighborhood-a-new-map-tool-offers-an-answer\">https://www.bloomberg.com/news/newsletters/2024-09-11/how-walkable-is-your-neighborhood-a-new-map-tool-offers-an-answer</a></li>\n <li>Dijkstra, Edsger W. (2022). <em>A Note on Two Problems in Connexion with Graphs</em>. In <em>Edsger Wybe Dijkstra: His Life, Work, and Legacy</em> (pp. 287–290).</li>\n <li>Frank, Lawrence D., Sallis, J. F., Conway, T. L., Chapman, J. E., Saelens, B. E., &amp; Bachman, W. (2006). <em>Many Pathways from Land Use to Health: Associations Between Neighborhood Walkability and Active Transportation, Body Mass Index, and Air Quality</em>. <em>Journal of the American Planning Association</em>, 72(1), 75–87.</li>\n <li>Frank, Lawrence D., Sallis, J. F., Saelens, B. E., Leary, L., Cain, K., Conway, T. L., &amp; Hess, P. M. (2010). <em>The Development of a Walkability Index: Application to the Neighborhood Quality of Life Study</em>. <em>British Journal of Sports Medicine</em>, 44(13), 924–933.</li>\n <li>Geisberger, R., Sanders, P., Schultes, D., &amp; Delling, D. (2008). <em>Contraction Hierarchies: Faster and Simpler Hierarchical Routing in Road Networks</em>. In <em>Experimental Algorithms: 7th International Workshop, WEA 2008</em> (pp. 319–333). Springer.</li>\n <li>Grasser, G., Van Dyck, D., Titze, S., &amp; Stronegger, W. (2013). 
<em>Objectively Measured Walkability and Active Transport and Weight-Related Outcomes in Adults: A Systematic Review</em>. <em>International Journal of Public Health</em>, 58, 615–625.</li>\n <li>Hart, Peter E., Nilsson, N. J., &amp; Raphael, B. (1968). <em>A Formal Basis for the Heuristic Determination of Minimum Cost Paths</em>. <em>IEEE Transactions on Systems Science and Cybernetics</em>, 4(2), 100–107.</li>\n <li>Henry Spatial Analysis, LLC. (2025). <em>Close.city Project</em>.\n<a href=\"https://close.city\">https://close.city</a></li>\n <li>Hu, Edward J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., Chen, W., et al. (2022). <em>LoRA: Low-Rank Adaptation of Large Language Models</em>. <em>ICLR</em>, 1(2), 3.</li>\n <li>Khosla, P., Teterwak, P., Wang, C., Sarna, A., Tian, Y., Isola, P., Maschinot, A., Liu, C., &amp; Krishnan, D. (2020). <em>Supervised Contrastive Learning</em>. <em>NeurIPS</em>, 33, 18661–18673.</li>\n <li>Overture Maps Foundation. (2025). <em>Overture Maps Foundation</em>.\n<a href=\"https://overturemaps.org\">https://overturemaps.org</a></li>\n <li>Ramsey, K., &amp; Bell, A. (2014). <em>Smart Location Database</em>. <em>Washington, DC</em>.</li>\n <li>Reimers, N., &amp; Gurevych, I. (2019). <em>Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks</em>. In <em>EMNLP 2019</em>.\n<a href=\"https://arxiv.org/abs/1908.10084\">https://arxiv.org/abs/1908.10084</a></li>\n <li>Rusak, E., Reizinger, P., Juhos, A., Bringmann, O., Zimmermann, R. S., &amp; Brendel, W. (2024). <em>InfoNCE: Identifying the Gap Between Theory and Practice</em>. <em>arXiv Preprint arXiv:2407.00143</em>.</li>\n <li>Shields, R., Gomes da Silva, E. J., Lima e Lima, T., &amp; Osorio, N. (2023). <em>Walkability: A Review of Trends</em>. <em>Journal of Urbanism</em>, 16(1), 19–41.</li>\n <li>Steuteville, R. (2019). <em>Walkability Indexes Are Flawed. Let’s Find a Better Method</em>. 
<em>CNU</em>.\n<a href=\"https://www.cnu.org/publicsquare/2019/01/10/walkability-indexes-are-flawed-lets-find-better-method1\">https://www.cnu.org/publicsquare/2019/01/10/walkability-indexes-are-flawed-lets-find-better-method1</a></li>\n <li>Sturtevant, N., &amp; Felner, A. (2018). <em>A Brief History and Recent Achievements in Bidirectional Search</em>. In <em>AAAI Conference on Artificial Intelligence</em>, 32(1).</li>\n <li>Thomas, J., Zeller, L., &amp; Reyes, A. R. (2021). <em>National Walkability Index: Methodology and User Guide</em>. <em>United States Environmental Protection Agency (EPA)</em>.\n<a href=\"https://www.epa.gov/sites/default/files/2021-06/documents/national_walkability_index_methodology_and_user_guide_june2021.pdf\">https://www.epa.gov/sites/default/files/2021-06/documents/national_walkability_index_methodology_and_user_guide_june2021.pdf</a></li>\n <li>Tripathi, S., &amp; King, C. R. (2024). <em>Contrastive Learning: Big Data Foundations and Applications</em>. In <em>CODS-COMAD 2024</em>, 493–497.</li>\n <li>Walk Score. (2025). <em>Walk Score®: Walkability Index and Neighborhood Analytics</em>.\n<a href=\"https://www.walkscore.com\">https://www.walkscore.com</a></li>\n <li>Wang, F., &amp; Liu, H. (2021). <em>Understanding the Behaviour of Contrastive Loss</em>. In <em>CVPR</em>, 2495–2504.</li>\n <li>Weinberger, K. Q., &amp; Saul, L. K. (2009). <em>Distance Metric Learning for Large Margin Nearest Neighbor Classification</em>. <em>Journal of Machine Learning Research</em>, 10(2).</li>\n <li>Weng, L. (2021). <em>Contrastive Representation Learning</em>.\n<a href=\"https://lilianweng.github.io/posts/2021-05-31-contrastive/\">https://lilianweng.github.io/posts/2021-05-31-contrastive/</a></li>\n</ul>",
  "content_type": "html",
  "author": {
    "name": "",
    "email": null,
    "uri": null
  },
  "categories": [
    "walkability",
    "compsci"
  ],
  "source": "https://gabrielmahler.org/feed.xml"
}