Abstract
This paper introduces the open-source Beast system for scalable exploratory data science on big spatio-temporal data. Beast is based on well-established research and has been released to assist the research community with analyzing big spatio-temporal data. Beast provides a set of extensible components that naturally integrate with Spark to build exploratory data science pipelines. Beast can be installed in less than a minute on an existing Spark cluster and provides a wide array of features, including loading vector and raster data represented in standard file formats, synthetic data generation for benchmarking, load-balanced spatial partitioning, data summarization, interactive visualization, and more. Beast builds on several research projects; its goal is to make all this research widely available to researchers in one integrative and coherent system.
Original language | English |
---|---|
Title of host publication | CIKM 2021 - Proceedings of the 30th ACM International Conference on Information and Knowledge Management |
Pages | 3796-3807 |
Number of pages | 12 |
ISBN (Electronic) | 9781450384469 |
State | Published - Oct 26 2021 |
Event | 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 - Virtual, Online, Australia. Duration: Nov 1 2021 → Nov 5 2021 |
Publication series
Name | International Conference on Information and Knowledge Management, Proceedings |
---|---|
Conference
Conference | 30th ACM International Conference on Information and Knowledge Management, CIKM 2021 |
---|---|
Country/Territory | Australia |
City | Virtual, Online |
Period | 11/1/21 → 11/5/21 |
Bibliographical note
Publisher Copyright: © 2021 Owner/Author.
Keywords
- data science
- exploration
- geospatial data
- spatio-temporal data
- visualization
ASJC Scopus subject areas
- General Business, Management and Accounting
- General Decision Sciences