Accession Number:



Title:

A Machine Learning Approach to Zeolite Synthesis Enabled by Automatic Literature Data Extraction

Descriptive Note:

Journal Article - Open Access

Corporate Author:

Massachusetts Institute of Technology Cambridge United States

Report Date:


Pagination or Media Count:



Zeolites are porous aluminosilicate materials with many industrial and green applications. Despite their industrial relevance, many aspects of zeolite synthesis remain poorly understood, requiring costly trial-and-error synthesis. In this paper, we create natural language processing techniques and text markup parsing tools to automatically extract synthesis information and trends from zeolite journal articles. We further engineer a data set of germanium-containing zeolites to test the accuracy of the extracted data and to discover potential opportunities for zeolites containing germanium. We also create a regression model that predicts a zeolite's framework density from the synthesis conditions. This model has a cross-validated root mean squared error of 0.98 T/1000 Å³, and many of the model's decision boundaries correspond to known synthesis heuristics in germanium-containing zeolites. We propose that this automatic data extraction can be applied to many different problems in zeolite synthesis and can enable novel zeolite morphologies.
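
To make the reported metric concrete, the sketch below shows one way a cross-validated root mean squared error can be computed for a model with a single decision boundary. It is a minimal illustration, not the paper's method: the abstract does not name the model type or the features, so the one-split regression stump, the function names (`fit_stump`, `predict_stump`, `cross_validated_rmse`), the interleaved fold assignment, and the toy numbers are all assumptions made for this example.

```python
import math
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression stump (hypothetical stand-in model).

    Tries every midpoint between adjacent distinct x values and keeps the
    threshold with the lowest sum of squared errors on the training data.
    Returns (threshold, mean_left, mean_right).
    """
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:
        # All x values identical: no split possible, predict the overall mean.
        m = mean(ys)
        return (float("inf"), m, m)
    return best[1:]


def predict_stump(model, x):
    """Predict with a stump returned by fit_stump."""
    t, ml, mr = model
    return ml if x <= t else mr


def cross_validated_rmse(xs, ys, k=5):
    """k-fold cross-validated RMSE, pooling squared errors over all folds.

    Fold assignment is interleaved (sample i goes to fold i % k); this is an
    assumption of the sketch, chosen so the result is deterministic.
    """
    n = len(xs)
    sq_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        model = fit_stump(train_x, train_y)
        for i in test_idx:
            sq_errors.append((predict_stump(model, xs[i]) - ys[i]) ** 2)
    return math.sqrt(mean(sq_errors))


# Toy, made-up data (NOT from the paper): one hypothetical synthesis
# variable and a framework density in T atoms per 1000 cubic angstroms.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
toy_y = [12.0, 12.0, 12.0, 12.0, 12.0, 16.0, 16.0, 16.0, 16.0, 16.0]

if __name__ == "__main__":
    print("decision boundary:", fit_stump(toy_x, toy_y)[0])
    print("cross-validated RMSE:", cross_validated_rmse(toy_x, toy_y, k=5))
```

Because every sample is predicted by a model that never saw it during fitting, the pooled error is an estimate of out-of-sample accuracy, and it carries the same units as the target (here T/1000 Å³).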

Subject Categories:

  • Cybernetics
  • Information Science
  • Physical Chemistry

Distribution Statement: