Abstract
This study analyzes differences in the terms that commercial sellers and personal users employ when discussing cannabidiol (CBD) on Twitter. It demonstrates that public health and medical researchers can use social network data to compare the medical conditions targeted by sellers of loosely regulated substances such as CBD against the medical conditions that patients themselves commonly treat with CBD. We collected 567,850 tweets by searching Twitter with the Tweepy Python package using the terms CBD and cannabidiol, and annotated a sample of 5,496 tweets to distinguish personal-use CBD tweets from commercial/sales-related CBD tweets. We used this sample to train two binary text classifiers, which produced two corpora: 169,876 personal-use tweets and 148,866 commercial/sales tweets. Using medical, standard, and slang dictionaries, we then identified and compared the most frequently occurring medical conditions, symptoms, side effects, body parts, and other substances referenced in both corpora.
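The dictionary-based comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the term set `CONDITION_TERMS`, the helper names `count_terms` and `compare_corpora`, and the simple regex tokenizer are all assumptions made for illustration, and the study's actual dictionaries and preprocessing are not given in this record.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary of condition terms (an assumption for
# illustration; the study's medical, standard, and slang dictionaries
# are not reproduced here).
CONDITION_TERMS = {"anxiety", "pain", "insomnia", "arthritis"}


def count_terms(tweets, terms):
    """Count how many tweets mention each dictionary term at least once."""
    counts = Counter()
    for text in tweets:
        # Lowercase and split into word tokens; a set avoids counting
        # the same term twice within one tweet.
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        counts.update(tokens & terms)
    return counts


def compare_corpora(personal, commercial, terms):
    """Return, per term, the share of tweets mentioning it in each corpus."""
    def share(count, total):
        # Guard against an empty corpus.
        return count / total if total else 0.0

    p_counts = count_terms(personal, terms)
    c_counts = count_terms(commercial, terms)
    return {
        term: (
            share(p_counts[term], len(personal)),
            share(c_counts[term], len(commercial)),
        )
        for term in sorted(terms)
    }
```

Calling `compare_corpora(personal_tweets, commercial_tweets, CONDITION_TERMS)` on two lists of tweet texts yields a mapping from each term to a pair of shares (personal, commercial). Normalizing by corpus size matters here because the two corpora differ in size.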
Original language | English |
---|---|
Title of host publication | Mediterranean Forum – Data Science Conference - First International Conference, MeFDATA 2020, Revised Selected Papers |
Editors | Jasminka Hasic Telalovic, Mehmed Kantardzic |
Pages | 139-150 |
Number of pages | 12 |
DOIs | |
State | Published - 2021 |
Event | 1st Mediterranean Forum - Data Science Conference, MeFDATA 2020 - Virtual, Online. Duration: Oct 24 2020 → Oct 24 2020 |
Publication series
Name | Communications in Computer and Information Science |
---|---|
Volume | 1343 CCIS |
ISSN (Print) | 1865-0929 |
ISSN (Electronic) | 1865-0937 |
Conference
Conference | 1st Mediterranean Forum - Data Science Conference, MeFDATA 2020 |
---|---|
City | Virtual, Online |
Period | 10/24/20 → 10/24/20 |
Bibliographical note
Publisher Copyright: © 2021, Springer Nature Switzerland AG.
Keywords
- Cannabis
- Text classification
- Text mining
ASJC Scopus subject areas
- General Computer Science
- General Mathematics