This paper introduces the open-source Beast system for scalable exploratory data science on big spatio-temporal data. Beast is based on well-established research and has been released to assist the research community with analyzing big spatio-temporal data. Beast provides a set of extensible components that naturally integrate with Spark to build exploratory data science pipelines. Beast can install in less than a minute on an existing Spark cluster and provides a wide array of features including loading vector and raster data represented in standard file formats, synthetic data generation for benchmarking, load-balanced spatial partitioning, data summarization, interactive visualization, and more. Beast builds on several research projects; its goal is to make all this research widely available to researchers in one integrative and coherent system.
|Title of host publication||CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management|
|Number of pages||12|
|State||Published - Oct 26 2021|
|Event||30th ACM International Conference on Information and Knowledge Management, CIKM 2021 - Virtual, Online, Australia (Duration: Nov 1 2021 → Nov 5 2021)|
|Name||International Conference on Information and Knowledge Management, Proceedings|
|Conference||30th ACM International Conference on Information and Knowledge Management, CIKM 2021|
|Period||11/1/21 → 11/5/21|
Bibliographical note
Funding Information:
This work is supported in part by NSF under grants IIS-2046236, IIS-1954644, IIS-1838222 and CNS-1924694 and by Agriculture and Food Research Initiative Competitive Grant no. 2019-67022-29696 from NIFA.
© 2021 Owner/Author.
Keywords
- data science
- geospatial data
- spatio-temporal data
ASJC Scopus subject areas
- Business, Management and Accounting (all)
- Decision Sciences (all)