Challenges in Information-Mining the Materials Literature: A Case Study and Perspective

Andrew Smith, Vinayak Bhat, Qianxiang Ai, Chad Risko

Research output: Contribution to journal, Review article, peer-review

6 Scopus citations

Abstract

The rapid development and application of machine learning (ML) techniques in materials science have led to new tools for machine-enabled and autonomous/high-throughput materials design and discovery. Alongside, efforts to extract data from traditional experiments in the published literature with natural language processing (NLP) algorithms provide opportunities to develop tremendous data troves for these in silico design and discovery endeavors. While NLP is used in all aspects of society, its application in materials science is still in the very early stages. This perspective provides a case study on the application of NLP to extract information related to the preparation of organic materials. We present the case study at a basic level with the aim to discuss these technologies and processes with researchers from diverse scientific backgrounds. We also discuss the challenges faced in the case study and provide an assessment to improve the accuracy of NLP techniques for materials science with the aid of community contributions.

Original language: English
Pages (from-to): 4821-4827
Number of pages: 7
Journal: Chemistry of Materials
Volume: 34
Issue number: 11
DOIs
State: Published - Jun 14 2022

Bibliographical note

Publisher Copyright:
© 2022 American Chemical Society.

Funding

This work was sponsored by the National Science Foundation in part through the Designing Materials to Revolutionize and Engineer our Future (NSF DMREF) program under Award Number DMR 1627428 and the Established Program to Stimulate Competitive Research (EPSCoR) Track 2 program under Cooperative Agreement Number 2019574. We acknowledge the University of Kentucky Center for Computational Sciences and Information Technology Services Research Computing for their fantastic support and collaboration and use of the Lipscomb Compute Cluster and associated research computing resources.

Funders | Funder number
University of Kentucky Medical Center |
National Science Foundation (NSF) | DMR 1627428
Office of Experimental Program to Stimulate Competitive Research | 2019574

    ASJC Scopus subject areas

    • General Chemistry
    • General Chemical Engineering
    • Materials Chemistry
