In this talk, I will give a brief overview of the research done in my group on the field of machine learning (ML) applied to transition metal complexes (TMCs). I will start with our initial work on predicting the energy barrier of the elementary steps involved in homogeneous catalysis[1], followed by our interest on generalizing ML approaches with tmQM[2], an 86K dataset of TMCs built from CSD data and semi-empirical quantum calculations, which we thereafter expanded with graph representations based on DFT and NBO[3]. After this short summary, I will introduce our recent work on evolutionary learning, including the tmQMg-L ligand library and the PL-MOGA algorithm[4]. The tmQMg-L library contains 30K diverse and synthesizable ligands with defined charges and metal coordination modes. The PL-MOGA is a genetic algorithm allowing for the multiobjective optimization of TMCs within selected regions of the Pareto front, with fine control over both aim and scope; [Fig. 1]. Used together, these tools allowed for the exploration and exploitation of vast chemical spaces containing billions of TMCs. If time allows, I will present our current efforts in coupling the PL-MOGA to other generative models based on deep graph learning.
Figure 1. Pareto front multiobjective optimization with fine control over aim and scope using the PL-MOGA algorithm with the tmQMg-L ligand library[4].
 David Balcells