We pioneered[1] the use of Gaussian Process Regression (which, with hindsight, we regrettably called “kriging”) in the design of atomic potentials. Unlike other research groups, we started with the machine learning (ML) of accurate electrostatics (e.g. for all 20 amino acids[2]). Multipole moments are the natural choice if only nuclear sites are used. Accurate ML predictions of the atomic energies followed. At the heart of this method are quantum topological atoms[3], such that a single partitioning method provides all atomic properties. Note that, unlike in alternative approaches, the ML itself does not partition the total system into atomic quantities. This approach[4] is now called FFLUX. Sustained in-house software development led to the ML training program FEREBUS[5], supported by ICHOR[6], and to the molecular dynamics program DL_FFLUX, which is an offspring of DL_POLY. Adaptive sampling (which the wider community calls active learning), combined with an AIMD-based sample set, produces models that need fewer training points than neural networks do for small molecules[7]. Improved training now tackles molecules of up to ~30 atoms (e.g. paracetamol)[8]. The recent parallelisation[9] of DL_FFLUX enables the simulation of condensed matter, whether molecular crystals[10] or liquid water[11], all with high-rank polarisable electrostatics. The next major step will be to link models in order to describe oligopeptides and, later, even proteins. Meanwhile, innovative ML techniques are being introduced[12] to reduce the prediction errors.
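To make the combination of Gaussian Process Regression and adaptive sampling concrete, here is a minimal sketch in Python using only NumPy. It is an illustration under stated assumptions, not the FEREBUS or ICHOR implementation: the function names (`rbf_kernel`, `gp_predict`, `adaptive_sampling`) are hypothetical, the one-dimensional `sin` function merely stands in for an atomic property as a function of geometry, and the selection rule (add the pool point with the largest predictive variance) is one generic active-learning criterion that may differ from the one used in FFLUX.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_query, length_scale=1.0, noise=1e-6):
    """Return the GP posterior mean and variance at x_query (zero prior mean)."""
    k_tt = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train, length_scale)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_qt @ alpha
    v = np.linalg.solve(chol, k_qt.T)
    var = 1.0 - np.sum(v * v, axis=0)  # prior variance of the RBF kernel is 1
    return mean, np.maximum(var, 0.0)

def adaptive_sampling(pool_x, pool_y, n_initial=3, n_added=7, length_scale=0.5):
    """Grow a training set by repeatedly adding the most uncertain pool point."""
    # Start from a few evenly spaced pool points.
    train_idx = [int(i) for i in np.linspace(0, len(pool_x) - 1, n_initial)]
    for _ in range(n_added):
        _, var = gp_predict(pool_x[train_idx], pool_y[train_idx], pool_x, length_scale)
        var[train_idx] = -1.0  # never re-select a point already in the training set
        train_idx.append(int(np.argmax(var)))
    return train_idx

# Toy sample pool: in practice this would be a set of geometries (e.g. from AIMD)
# with reference atomic properties; here it is an assumed stand-in.
pool_x = np.linspace(0.0, 2.0 * np.pi, 200)
pool_y = np.sin(pool_x)
chosen = adaptive_sampling(pool_x, pool_y)
```

The design point is that the model itself decides which sample to add next, so the training set stays small while covering the regions where the prediction is least certain.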
Figure 1. Prof. Paul Popelier