In LightBox Live, City Directory occupant records are systematically tagged to help surface historical uses that may be associated with elevated environmental risk. Each listing is classified across two dimensions: property type and business use.
Property type tags distinguish between commercial and residential entries. Listings associated with a business activity are categorized as commercial, while individual listings without a business reference are categorized as residential. If property type cannot be confidently determined, it is left blank. Where an address includes both commercial and residential occupants, it is designated as “mixed” within the default table’s “Property Type” column. These tags provide immediate context for understanding historical land use at a site.
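The property type rules above can be sketched in a few lines of Python. This is only an illustration of the logic as described, not LightBox's implementation: the function names and the `has_business_reference` input are hypothetical, and the sketch assumes each occupant listing has already been judged as having a business reference, not having one, or being undetermined.

```python
from typing import Iterable, Optional


def occupant_property_type(has_business_reference: Optional[bool]) -> Optional[str]:
    """Tag a single occupant listing (hypothetical helper).

    None means the property type could not be confidently determined,
    so the tag is left blank.
    """
    if has_business_reference is None:
        return None
    return "commercial" if has_business_reference else "residential"


def address_property_type(occupant_types: Iterable[Optional[str]]) -> Optional[str]:
    """Roll occupant-level tags up to one address-level value (hypothetical helper)."""
    # Ignore occupants whose type was left blank.
    known = {t for t in occupant_types if t is not None}
    if not known:
        return None  # nothing determined: leave the address blank
    if len(known) > 1:
        return "mixed"  # both commercial and residential occupants
    return next(iter(known))  # the single type shared by all tagged occupants


# Example: one business listing and one individual listing at the same address.
tags = [occupant_property_type(True), occupant_property_type(False)]
print(address_property_type(tags))
```

In this sketch an address is reported as "mixed" only when both types are actually present; undetermined occupants do not change the result.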

Business use tags, which currently cover dry cleaners and gas stations, identify the operations most commonly associated with environmental risk. The LightBox Data Team curated and manually labeled a large sample set of occupant records, which AI models use to evaluate keyword patterns and language context in occupant listings and flag relevant business types, even when terminology varies across directories or time periods. A high-confidence threshold is maintained throughout the process, so that a listing is flagged only when the models match it with high confidence.
To further support accuracy, all AI-generated business use tags are manually reviewed by City Directory researchers. This human-in-the-loop step helps validate model outputs, resolve ambiguous listings, and further reduce false positives.
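The two-stage flow described above, a confidence threshold followed by manual review, might look roughly like the following. This is a minimal sketch under stated assumptions: the `Prediction` class, the `final_tags` function, the reviewer callback, and the threshold value of 0.9 are all hypothetical, since the actual models, scores, and threshold are not documented here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

# Hypothetical value; the real threshold is not stated in this article.
CONFIDENCE_THRESHOLD = 0.9


@dataclass
class Prediction:
    listing: str        # occupant listing text from the directory
    business_use: str   # e.g. "dry cleaner" or "gas station"
    confidence: float   # model score between 0 and 1 (assumed)


def final_tags(
    predictions: Iterable[Prediction],
    reviewer_confirms: Callable[[Prediction], bool],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Dict[str, str]:
    """Keep a tag only if it clears the threshold and a reviewer confirms it."""
    tagged: Dict[str, str] = {}
    for p in predictions:
        if p.confidence < threshold:
            continue  # below the high-confidence bar: never flagged
        if reviewer_confirms(p):
            tagged[p.listing] = p.business_use  # validated by a researcher
    return tagged


# Example with made-up listings and scores.
sample = [
    Prediction("Acme Cleaners & Dyers", "dry cleaner", 0.97),
    Prediction("Smith John", "gas station", 0.95),    # false positive a reviewer rejects
    Prediction("Bobs Service", "gas station", 0.55),  # below threshold
]
print(final_tags(sample, lambda p: p.listing != "Smith John"))
```

Here the reviewer is modeled as a simple callback; in practice the review is performed by City Directory researchers, and the sketch only shows where that decision sits relative to the threshold.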
This approach brings structure and consistency to inherently variable historical data. Higher-risk uses are easier to surface, and environmental professionals can focus attention where it is most warranted within Phase I ESA workflows.
