Extracting Semantics from Census-based Reference Data

Daniel R Harris, Nima Seyedtalebi

Research output: Contribution to journal › Article › peer-review


We present preliminary findings in extracting semantics from reference data generated by the United States Census Bureau. US Census reference data is based upon surveys designed to collect demographics and other socioeconomic factors by geographical regions. These data sets contain thousands of variables; this complexity makes the reference data difficult to learn, query, and integrate into analyses. Researchers often avoid working directly with US Census reference data and instead work with census-derived extracts capturing a much smaller subset of records. We propose to use natural language processing to extract the semantics of census-based reference data and to map census variables to known ontologies. This semantic processing reduces the large volume of variables into more manageable sets of conceptual variables that can be organized by meaning and semantic type.
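The reduction described above, from many raw variables to a smaller set of conceptual variables, can be illustrated with a minimal sketch. It is not the authors' pipeline: a small hand-written lexicon (`CONCEPT_LEXICON`) stands in for a real ontology, simple token matching stands in for natural language processing, and the variable IDs and labels are made up for illustration. The labels only imitate the `!!`-delimited style of American Community Survey variable labels.

```python
from collections import defaultdict

# Hypothetical lexicon standing in for a real ontology lookup.
# Both the terms and the concept names are illustrative assumptions.
CONCEPT_LEXICON = {
    "male": "Sex",
    "female": "Sex",
    "years": "Age",
    "income": "Income",
    "poverty": "Poverty status",
}


def concepts_for(label):
    """Return the sorted tuple of concepts whose lexicon terms occur in a label."""
    # Treat the "!!" hierarchy separator and any colons as whitespace,
    # then split the label into lowercase tokens.
    tokens = label.lower().replace("!!", " ").replace(":", " ").split()
    return tuple(sorted({CONCEPT_LEXICON[t] for t in tokens if t in CONCEPT_LEXICON}))


def group_by_concepts(variables):
    """Group variable IDs by the set of concepts found in their labels."""
    groups = defaultdict(list)
    for var_id, label in variables.items():
        groups[concepts_for(label)].append(var_id)
    return dict(groups)


# Illustrative variable IDs and labels, not real census identifiers.
VARIABLES = {
    "V001": "Estimate!!Total!!Male!!Under 5 years",
    "V002": "Estimate!!Total!!Female!!Under 5 years",
    "V003": "Estimate!!Median household income",
    "V004": "Estimate!!Total!!Income below poverty level",
}

GROUPS = group_by_concepts(VARIABLES)
```

In this toy input, the two age-by-sex variables collapse into one conceptual group, while the income and poverty variables each form their own. A realistic system would replace the lexicon lookup with a proper concept-mapping step against an established ontology.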

Original language: English
Pages (from-to): 88-89
Number of pages: 2
Journal: Proceedings. IEEE International Conference on Semantic Computing
State: Published - Jan 2021

