Talk #D1.08

21.05.2024, 16:30 – 16:45





Large Language Models are a Strong Baseline for Inorganic Synthesizability and Precursor Selection Prediction

Joshua Schrier



A goal of chemical space exploration is the discovery (synthesis/characterization) of novel compositions of matter.[1] Knowledge grounded in explanations of physical causes is most desirable,[2] but any method of obtaining the correct answer (e.g., actually synthesizing a compound) is valuable in practice. For this purpose, any process that consistently produces true beliefs over false ones counts as knowledge,[3] and so even a process that merely exploits statistical relationships in text is admissible. Recently, pre-trained and fine-tuned large language models (LLMs) have been shown to be useful for organic molecule property regression and classification,[4,5] even if their chemical space representations are unclear. In this talk, I will describe new results on predicting the synthesizability of inorganic compounds (can it be made?) and selecting precursors (how can it be made?), which correspond to positive/unlabelled and multiclass (set) learning problems, respectively. We benchmarked pre-trained and fine-tuned LLMs against recent (traditional) machine-learning approaches.[6] Surprisingly, the LLMs solve these problems at levels comparable to the best traditional approaches. The relative ease, speed, and quality of this LLM-based approach suggest both its broader adoption in chemical discovery and the use of methods like these as a general baseline when reporting the performance of more traditional chemical space prediction methods.


  1. Schrier, Norquist, Buonassisi, Brgoch, J. Am. Chem. Soc. 2023, 145, 21699-21716. doi:10.1021/jacs.3c04783.
  2. Virgil, Georgics c.37 BCE Book II, line 490.
  3. Goldman, Reliabilism and Contemporary Epistemology (Oxford Univ. Press 2015) 336pp.
  4. Jablonka, Schwaller, Ortega-Guerrero, Nat. Mach. Intell. 2024, doi:10.1038/s42256-023-00788-1.
  5. Xie, Evangelopoulos, Omar, Troisi, Cooper, Chen, Chem. Sci. 2024, 15, 500-510. doi:10.1039/D3SC04610A.
  6. Kim, Noh, Ho Gu, Chen, Jung, Chem. Sci. 2024, 15, 1039-1045. doi:10.1039/D3SC03538G.
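
The two tasks named in the abstract call for different scoring. As a minimal sketch, and not the evaluation protocol used in the talk, the Python below shows one plausible way to score each: recall on held-out known positives for the positive/unlabelled synthesizability task (no confirmed negatives exist), and order-independent exact match for predicted precursor sets. The function names and the toy data are illustrative assumptions.

```python
from typing import Iterable, Sequence


def recall_on_known_positives(predictions: Sequence[bool]) -> float:
    """Fraction of known-synthesized compounds predicted as synthesizable.

    In a positive/unlabelled setting there are no confirmed negatives,
    so recall on held-out positives is one directly measurable score.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return sum(predictions) / len(predictions)


def precursor_exact_match(
    predicted: Iterable[Iterable[str]],
    reference: Iterable[Iterable[str]],
) -> float:
    """Fraction of targets whose predicted precursor set equals the reference.

    Precursors are compared as sets, so ordering and duplicates are ignored.
    """
    predicted_sets = [frozenset(p) for p in predicted]
    reference_sets = [frozenset(r) for r in reference]
    if not reference_sets or len(predicted_sets) != len(reference_sets):
        raise ValueError("need equally sized, non-empty inputs")
    hits = sum(p == r for p, r in zip(predicted_sets, reference_sets))
    return hits / len(reference_sets)


# Toy, hand-written data for illustration only (not results from the talk).
synth_predictions = [True, True, False, True]  # model output on known positives
predicted_precursors = [["TiO2", "BaCO3"], ["Li2CO3", "CoO"]]
reference_precursors = [["BaCO3", "TiO2"], ["Li2CO3", "Co3O4"]]

print(recall_on_known_positives(synth_predictions))
print(precursor_exact_match(predicted_precursors, reference_precursors))
```

Comparing precursors as sets reflects the "(set)" framing in the abstract: a prediction counts as correct only when every precursor matches, regardless of the order in which they are listed.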





Joshua Schrier



  •   Fordham University, New York