What's strong: Coordinates 100% valid (every record inside SG bounding box, ≥4 decimal precision).
Brand mapping verified — 23 of 25 fact-checked brands match public counts within ±33%.
Top retail/F&B chains (7-Eleven, NTUC FairPrice, McDonald's, Starbucks, Watsons) within ±7% of authoritative counts.
Known limitations: Opening hours and price level were never extracted by the source scraper.
~33% of records lack a phone number. POPStation parcel lockers undersampled (only ~30% coverage on Google).
Bad-centroid records (1,567) dropped to ensure all locations are trustworthy.
2. Schema (12 flat fields)
# Identifier
id : 12-char alphanumeric (deterministic from name+coords)
# Naming
name : cleaned (emoji-free, "Pte. Ltd." canonicalized)
alt_name : alternate name (e.g., Chinese variant)
# Location (must-haves; all verified)
latitude : float, inside SG bbox (1.15–1.48), ≥4 dp
longitude : float, inside SG bbox (103.59–104.10), ≥4 dp
address : original address (87.5% have SG postal)
# Classification (must-haves)
brand : 1 of 220 SGP brands (or empty string)
primary_category : 1 of 166 derived buckets
# Engagement
rating : 1.0–5.0, null if reviews_count = 0
reviews_count : int
# Contact
website : empty string if missing
phone : raw scraped format
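The location and engagement constraints in the schema above can be checked mechanically. The following is a minimal sketch, not the pipeline's actual validator: the function names (`decimal_places`, `validate_record`) are hypothetical, and it assumes each record is a plain dict keyed by the schema's field names.

```python
from typing import List

# Bounds copied from the schema above.
LAT_MIN, LAT_MAX = 1.15, 1.48
LON_MIN, LON_MAX = 103.59, 104.10
MIN_DECIMALS = 4


def decimal_places(value: float) -> int:
    """Count digits after the decimal point in the float's shortest repr.

    Caveat: trailing zeros are lost once a value is a float (1.3000 -> 1.3),
    so a precision check is more reliable on the raw scraped string.
    """
    text = repr(float(value))
    if "e" in text or "." not in text:
        return 0
    return len(text.split(".", 1)[1])


def validate_record(rec: dict) -> List[str]:
    """Return a list of schema violations; an empty list means the record passes."""
    problems = []
    bounds = (("latitude", LAT_MIN, LAT_MAX), ("longitude", LON_MIN, LON_MAX))
    for field, low, high in bounds:
        value = rec.get(field)
        if not isinstance(value, float):
            problems.append(f"{field} is not a float")
        elif not low <= value <= high:
            problems.append(f"{field} outside SG bbox")
        elif decimal_places(value) < MIN_DECIMALS:
            problems.append(f"{field} has fewer than {MIN_DECIMALS} decimal places")

    rating = rec.get("rating")
    if rec.get("reviews_count") == 0 and rating is not None:
        problems.append("rating must be null when reviews_count is 0")
    elif rating is not None and not 1.0 <= rating <= 5.0:
        problems.append("rating outside 1.0-5.0")
    return problems
```

Returning a list of problems rather than raising on the first failure makes it easier to tally violations per field across the whole dataset.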
Marina Bay Sands(91) · Hotel 81(27) · Pan Pacific(19) · ibis(19) · Shangri-La(14)
Hospital | 267 | 64.4% | Singapore General Hospital(48) · Tan Tock Seng Hospital(30) · National University Hospital(27) · Sengkang General Hospital(9) · KK Women's and Children's Hospital(8)
Preschool | 2,638 | 44.7% | PCF Sparkletots Preschool(372) · My First Skool(172) · Star Tots Playgroup(151) · MindChamps PreSchool(83) · M.Y World(60)
Tuition Centre | 5,141 | 9.6% | Edufarm Learning Centre(171) · PlayFACTO School(68) · Kumon(49) · Zhengfei Cultural Education(47) · Cristofori Music School(36)
address : 80 Punggol Fld, B1-01 Punggol 21 Community Club, Singapore 828815
latitude : 1.3934754
longitude : 103.9135381
brand : SingPost
primary_category : Post Office
rating : 3.1
reviews_count : 111
phone : (empty)
website : https://www.singpost.com/
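Because the address field keeps the original string, consumers who need the postal code have to parse it out themselves. A minimal sketch follows, assuming the code appears as six digits after the word "Singapore", as in the sample record above; the function name `extract_postal_code` is hypothetical and other address styles would need a looser pattern.

```python
import re
from typing import Optional

# Singapore postal codes are six digits. Requiring the word "Singapore"
# in front is an assumption based on the sample address, chosen to avoid
# matching unrelated six-digit numbers elsewhere in the string.
_POSTAL_RE = re.compile(r"\bSingapore\s+(\d{6})\b", re.IGNORECASE)


def extract_postal_code(address: str) -> Optional[str]:
    """Return the six-digit postal code, or None if the address has none."""
    match = _POSTAL_RE.search(address or "")
    return match.group(1) if match else None
```

The code is returned as a string so that leading zeros are preserved.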
12. Notes & caveats
Brand-vs-name override fix: When a record has a brand, the brand-derived category overrides any name-pattern match. So "LHN Parking - Khatib Polyclinic" is tagged as a Parking Lot, not a Polyclinic.
Bad-centroid records dropped: 1,567 records pinned to a fake centroid coordinate were removed. They had a valid name and brand but no usable location.
Rating cleared when reviews_count = 0: Aggregate ratings without reviews were nulled to avoid misleading downstream consumers (26,448 records).
Hours and price_level not in dataset: The original scraper never captured these. Re-populating requires re-scraping.
POPStation undercount confirmed: 39 vs ~130 actual SingPost lockers. Lockers are systematically underlisted on Google Maps as POIs.
BlueSG matched stations not points: The initial 0.19x undercount was a unit mismatch. BlueSG has 253 stations but 1,000+ points, and Google Maps lists stations, so the dataset count should be compared against stations.
Long tail in "Other": 14% of records remain in "Other" — the source has 3,039 unique Google categories; mapping covers the top ~85%.
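The brand-vs-name precedence described in the first note can be sketched as follows. This is an illustration only: the lookup tables, the brand label "LHN Parking", and the function name `resolve_category` are hypothetical stand-ins, since the actual brand and name-pattern mappings are not reproduced in this document.

```python
# Hypothetical lookup tables; the real mappings are not given here.
BRAND_CATEGORY = {"LHN Parking": "Parking Lot"}
NAME_PATTERNS = [
    ("polyclinic", "Polyclinic"),
    ("parking", "Parking Lot"),
]


def resolve_category(name: str, brand: str) -> str:
    """Pick a primary_category, letting the brand win over name patterns."""
    # A known brand decides the category outright.
    if brand and brand in BRAND_CATEGORY:
        return BRAND_CATEGORY[brand]
    # Otherwise fall back to the first name pattern that matches.
    lowered = name.lower()
    for needle, category in NAME_PATTERNS:
        if needle in lowered:
            return category
    # Anything unmatched lands in the long-tail bucket.
    return "Other"
```

With the brand set, "LHN Parking - Khatib Polyclinic" resolves to Parking Lot. With an empty brand, the same name would fall through to the name patterns and match "polyclinic" first, which is the misclassification the override prevents.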