| Source | Records | Features derived |
|---|---|---|
| Overture Maps + OSM | 174,711 places | 79 place composition + 114 per-place |
| Overture Buildings | 377,331 | 16 built environment |
| LTA Stations + Bus Stops | 231 MRT + 44 LRT + 5,177 bus | 18 transit + 8 GTFS + 19 place anchors |
| LTA Ridership (hourly) | 12.3M taps/day | 7 temporal transit (AM/PM/off/night) |
| Singapore GTFS 2026 | 230,914 trips | 8 frequency features (headway by window) |
| OSM Road Network | 550,991 segments | 26 walkability (network walk from 214K-node graph) |
| SingStat Population | 5,982,320 | 18 demographics + 12 dwelling type |
| HDB Resale Transactions | 227,207 | 2 property price features |
| URA Master Plan | 113,212 parcels | 12 land use |
| NASA VIIRS Nightlights | 2 epochs (2022+2024) | 7 nightlight features (growth, commercial indicator) |
| WorldPop + WorldCover | Grid rasters | 5 satellite (pop growth, land cover) |
| LTA DataMall API (live) | 3K taxis + 2.6K carparks + 50K speed bands | 10 dynamic (taxi, carpark, congestion, bus services) |
| LTA Bus Routes | 26,711 route-stops (789 services) | Bus network topology + connectivity |
| Government amenity datasets | ~3,000 (hawkers, clinics, parks, schools, hotels) | 16 amenity counts + distance features |
| OSM POIs (4 layers) | 52,317 | 4 supplementary counts |
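
The "headway by window" features derived from GTFS could be computed roughly as below. This is a minimal sketch, not the pipeline's actual code: the window boundaries, the name `mean_headway_minutes`, and the choice of "mean gap between consecutive departures" as the headway definition are all assumptions for illustration.

```python
from statistics import mean

# Hypothetical time windows as (start, end) minutes after midnight.
# The source does not give the real AM/PM/off-peak/night boundaries.
WINDOWS = {
    "am_peak": (6 * 60, 9 * 60),
    "off_peak": (9 * 60, 17 * 60),
    "pm_peak": (17 * 60, 20 * 60),
    "night": (20 * 60, 24 * 60),
}

def mean_headway_minutes(departures, window):
    """Mean gap in minutes between consecutive departures inside a window.

    Returns None when fewer than two departures fall in the window.
    """
    start, end = window
    inside = sorted(t for t in departures if start <= t < end)
    if len(inside) < 2:
        return None
    gaps = [b - a for a, b in zip(inside, inside[1:])]
    return mean(gaps)

# Toy departures at one stop (minutes after midnight), not real GTFS data.
departures = [420, 430, 440, 450, 600, 630, 660]
headways = {name: mean_headway_minutes(departures, w) for name, w in WINDOWS.items()}
```

With the toy data the AM peak gets a 10-minute headway and the off-peak a 30-minute one, while windows with fewer than two departures yield `None` rather than a misleading number.
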

| Pillar | Features | Source | What it captures |
|---|---|---|---|
| Demographics | 18 | SingStat | Population total (5.98M), elderly %, non-resident, daytime intensity |
| Dwelling types | 12 | SingStat | HDB 1-5rm, condo, landed, by floor area |
| Built environment | 16 | Overture | Buildings, HDB blocks, floors, commercial/industrial |
| Land use | 12 | URA | Zoning %, entropy, fragmentation, dominant use |
| Transit | 18 | LTA | MRT/bus counts, taps (total + AM/PM/off/night) |
| GTFS frequency | 8 | GTFS 2026 | Headway by time window, routes, departures |
| Walkability | 26 | OSM graph | Euclidean + network walk to 6 amenities, detour ratios |
| Amenities | 16 | Gov data | Hawkers, clinics, parks, supermarkets, hotels, schools |
| Place composition | 79 | Overture+OSM | 24 categories, tiers, entropy, HHI, brands |
| Demand pull | 12 | Computed | 6 pull scores (office/residential/transit/hotel/school/hawker) |
| Synergy | 20 | Computed | 10 co-location scores (cafe×office, grocery×residential...) |
| Saturation | 13 | Computed | 5 categories: supply/demand ratio + gap (pop_total denominator) |
| Satellite | 12 | VIIRS+WP+WC | Nightlight change, pop growth, land cover |
| Archetypes | 15 | K-means | 6 types + 8 indices (vitality, accessibility, demand...) + 4 proxies |
| Micrograph | 156 | Pipeline | 12 categories × 13 context vectors |
| Spatial context | 123 | H3 rings | Ring-1/2 max + pop-weighted aggregates |
| LTA dynamic | 10 | Live API | Taxi, carpark, speed, congestion, car dependency |
| Structure | 8 | Cross-scale | Interface, gradient, demand flow, ecosystem, self-containment |
| Property | 2 | HDB resale | Median PSF, transaction count |
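
The entropy and HHI measures named under land use and place composition are standard concentration statistics. The sketch below shows one common way to compute them from per-category counts. It assumes natural-log Shannon entropy normalized by the number of non-empty categories; the source does not say which normalization the pipeline uses, and the function names are illustrative.

```python
import math

def shannon_entropy(counts, normalize=True):
    """Shannon entropy of a count vector (natural log).

    With normalize=True the result is divided by log(k), where k is the
    number of non-empty categories, so it lies in [0, 1]. This
    normalization is an assumption, not taken from the source.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    shares = [c / total for c in counts if c > 0]
    if len(shares) <= 1:
        return 0.0
    h = -sum(p * math.log(p) for p in shares)
    return h / math.log(len(shares)) if normalize else h

def hhi(counts):
    """Herfindahl-Hirschman index: sum of squared category shares."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum((c / total) ** 2 for c in counts)

# Toy hex cells: place counts for four hypothetical categories.
mixed = [10, 10, 10, 10]
dominated = [37, 1, 1, 1]
```

An evenly mixed cell scores entropy 1.0 and HHI 0.25, while a cell dominated by one category has low entropy and an HHI close to 1, so the two measures move in opposite directions.
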

| Group | Features | Key features |
|---|---|---|
| Identity | 14 | Category (24 types), price tier, branded, h3 keys |
| Competition | 5 | competitors_200m/500m, substitution_risk, market_share |
| Complementary | 5 | Cross-category diversity 300m, score |
| Anchors | 19 | 14 types (MRT, bus, hawker, clinic, park, hotel, school, library, sports...) |
| Demand + synergy | 18 | 6 pulls + demand_context + 10 synergies (target-category-only) |
| Transit | 8 | Network walk MRT/bus, GTFS headway, transit_score |
| Catchment | 5 | pop, elderly, nonresident, daytime intensity |
| Context | 16+ | Building, neighborhood, archetype, indices, nightlight |
| Supply-demand | 5 | saturation, demand_match, survivability_index |
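
A radius count such as `competitors_200m` / `competitors_500m` can be illustrated as follows. This is a sketch under stated assumptions: "competitor" is taken to mean another place in the same category, distance is straight-line (haversine) rather than network walk, and the record layout and function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def count_competitors(target, places, radius_m):
    """Count other places in the target's category within radius_m."""
    return sum(
        1
        for p in places
        if p["id"] != target["id"]
        and p["category"] == target["category"]
        and haversine_m(target["lat"], target["lon"], p["lat"], p["lon"]) <= radius_m
    )

# Toy records, not real places.
places = [
    {"id": "a", "category": "cafe", "lat": 1.3000, "lon": 103.8000},
    {"id": "b", "category": "cafe", "lat": 1.3010, "lon": 103.8000},    # about 111 m north
    {"id": "c", "category": "cafe", "lat": 1.3030, "lon": 103.8000},    # about 334 m north
    {"id": "d", "category": "clinic", "lat": 1.3005, "lon": 103.8000},  # different category
]
competitors_200m = count_competitors(places[0], places, 200)
competitors_500m = count_competitors(places[0], places, 500)
```

The brute-force loop is quadratic in the number of places, so at the scale of 174,711 places a spatial index (for example H3 cells or a k-d tree) would be the practical choice.
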

| Level | Units | Features | Resolution | Role |
|---|---|---|---|---|
| Hex-9 | 7,318 | 612 | ~174m | Fine-grain context, walkability, micrograph |
| Hex-8 | 1,191 | 637 | ~461m | Primary: demand, gaps, archetypes, ecosystem |
| Subzone | 326 | ~449 | URA | Policy alignment |
| Places | 174,711 | 114 | Point | Competition, synergy, survivability |
| Plexis | ~200K nodes | 1.49M edges | 39 edge types | Structural reasoning, scenarios |

| Family | Edges | Key relations | What it captures |
|---|---|---|---|
| Commercial | 941,834 | COMPETES, SYNERGIZES, SUBSTITUTES, EXIT_FRONTAGE, VOID_DECK | How businesses interact |
| Hierarchy | 364,058 | LOCATED_IN, IS_A, PARENT_OF, PART_OF | Containment + classification |
| Anchor | 121,053 | ANCHORED_BY, WALK_CATCHMENT, SERVES | Demand generators |
| Spatial | 27,551 | ADJACENT_TO, N/S/E/W_OF, ROAD, COASTAL | Physical connectivity |
| Structure | 10,758 | SAME_CLUSTER, LU_TRANSITION, DEVELOPMENT_FRONT | Urban evolution |
| Gradient | 9,263 | COMMERCIAL/HEIGHT/DENSITY/PRICE gradients | Change across space |
| Transit | 5,770 | CONNECTS_TO, FEEDS_INTO, SAME_CORRIDOR, EXPRESSWAY | Movement network |
| Supply-demand | 5,260 | UNDERSUPPLIED, OVERSUPPLIED, DEMAND_LEAKS, COMPARABLE | Gaps & opportunities |
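
As a quick arithmetic check, the per-family edge counts above add up to the roughly 1.49M edges quoted for the graph. The snippet only restates numbers from the table; the variable names are illustrative.

```python
# Edge counts per family, copied from the table above.
edge_counts = {
    "Commercial": 941_834,
    "Hierarchy": 364_058,
    "Anchor": 121_053,
    "Spatial": 27_551,
    "Structure": 10_758,
    "Gradient": 9_263,
    "Transit": 5_770,
    "Supply-demand": 5_260,
}

total_edges = sum(edge_counts.values())

# Share of all edges contributed by each family.
shares = {family: n / total_edges for family, n in edge_counts.items()}
```

The total is 1,485,547 edges, and the Commercial family alone accounts for about 63% of them.
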

| Feature | R² | Quality |
|---|---|---|
| walkability_score | 0.898 | EXCELLENT |
| pull_residential | 0.894 | EXCELLENT |
| ecosystem_completeness | 0.827 | STRONG |
| population | 0.777 | GOOD |
| pull_office | 0.667 | MODERATE |
| pc_total | 0.661 | MODERATE |
| transit_daily_taps | 0.648 | MODERATE |

| Feature | R² | Quality |
|---|---|---|
| anchor_score | 0.908 | EXCELLENT |
| demand_context_score | 0.884 | EXCELLENT |
| competitors_200m | 0.774 | GOOD |
| complementary_diversity | 0.696 | GOOD |
| transit_score | 0.661 | MODERATE |
| survivability_index | 0.477 | MODERATE |
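
For reference, the R² values in the two tables above are coefficients of determination. The sketch below shows the standard definition on made-up numbers; how the predictions are produced from the embedding (which regressor, which train/test split) is not stated in the source and is not modelled here.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy held-out targets and predictions (made-up numbers).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 3.0, 3.5]
score = r_squared(y_true, y_pred)
```

Here the residual sum of squares is 0.5 against a total of 5.0, giving R² = 0.9. A value of 0 would mean the predictions do no better than always guessing the mean.
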

| Metric | Value | What it means |
|---|---|---|
| Category classification | 69.8% | Embedding alone predicts business type 70% of the time (24 categories) |
| Category separability | 310x | In the commercial head, same-category pairs are 310x more similar than different-category pairs |
| Retrieval P@5 | 0.100 | 10% of top-5 nearest neighbors are same category (vs 4% baseline) |
| Archetype NMI | 0.362 | Embedding clusters partially recover pre-computed neighborhood archetypes |
| Link prediction Hits@10 | 14.1% | True connected node appears in top-10 predictions 14% of the time |
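
The two ranking metrics in the table, P@5 and Hits@10, can be sketched per query as below. The reported figures would then be averages over all queries. Function names and toy labels are illustrative, and the sketch assumes the neighbour and candidate lists are already sorted by decreasing similarity or score.

```python
def precision_at_k(query_label, neighbor_labels, k=5):
    """Fraction of the top-k neighbours sharing the query's label."""
    top = neighbor_labels[:k]
    return sum(1 for lab in top if lab == query_label) / k

def hits_at_k(true_node, ranked_candidates, k=10):
    """1 if the true connected node appears in the top-k ranking, else 0."""
    return int(true_node in ranked_candidates[:k])

# Toy retrieval: neighbour categories, most similar first.
p5 = precision_at_k("cafe", ["cafe", "bar", "cafe", "clinic", "gym", "cafe"])

# Toy link prediction: candidate nodes, highest score first.
ranking = [f"n{i}" for i in range(20)]
hit_in = hits_at_k("n7", ranking)    # n7 is ranked 8th, inside the top 10
hit_out = hits_at_k("n15", ranking)  # n15 is ranked 16th, outside the top 10
```

In the toy retrieval case two of the five nearest neighbours match, so P@5 is 0.4; the sixth neighbour is ignored even though it matches.
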

| Target | Static features only | + LTA dynamic | Gain (percentage points) |
|---|---|---|---|
| ecosystem_completeness | 0.715 | 0.818 | +10.3 |
| saturation_fnb | 0.479 | 0.524 | +4.4 |
| idx_vitality | 0.935 | 0.953 | +1.8 |

| What embedding captures | R² range | Verdict |
|---|---|---|
| Structural / spatial (walkability, ecosystem, population) | 0.78 – 0.90 | EXCELLENT — these ARE graph properties |
| Demand (pull_residential, demand_context, anchor_score) | 0.88 – 0.91 | EXCELLENT — demand flows through edges |
| Competition (competitors, diversity) | 0.70 – 0.77 | GOOD — competition is graph density |
| Transit (taps, transit_score) | 0.65 – 0.66 | MODERATE — hub-dependent |
| Viability (survivability) | 0.48 | MODERATE — supply-side beyond graph |
| Category identity | 69.8% acc, 310x sep | STRONG — commercial head works |