Concluding remarks


These questions were collected in 30 minutes at the end of the workshop. This document highlights and summarizes some of the topics discussed.

What should we call machine learning potentials?

  • “Quantum machine learning” is highly disfavored

  • “Neural network potentials” is too specific

  • “Machine learning potentials” is vastly preferred by the audience


Will machine learning potentials replace molecular mechanics potentials? If so, when? 3 years? 5 years? 10 years?

  • 3 years = 0%

  • 5 years = 10%

  • 10 years = 50%

  • Uncertain = 40%


Would "foundation models" (e.g. GPT-3 like models for molecular machine learning, as Jonathan Godwin discussed) be useful? What kinds of models?

  • Models trained on lots of QM data? For ML potentials or for representation learning? 75%

  • Models like coarse-graining or enhanced sampling trained on lots of MD trajectory data? 20% (for geometric similarity, for downstream sampling tasks, tied to downstream experiments)

  • Models trained on lots of biomolecular structures or experimental affinities?

  • Multiscale?

    • Data that includes multiple atomic systems (molecules, crystals, surfaces, nanoclusters, macromolecules)

    • There will be diversification just like we have diversification in simulators


What is the function of foundation models?

  • Representation for downstream potentials?

  • Tool to investigate correlations in data?

  • Are they a source of interpretability or do they take it away?

  • Representation learning to compare with scarcer, more expensive experimental data?

  • Where will the datasets for foundation models come from? Can this community help curate them? Can this community make metrics and standards for data quality?


    • Structural data (crystallography; molecules, proteins, crystals)

    • Will data mining create these datasets? We need an ontology and long-term records (including reporting negative results)

    • Drive data generation and collection (with autonomous data)

    • Open Reaction Database for organics (plus the closed Reaxys or SciFinder)

    • Incentive system and labor of curating and sanitizing one’s internal data. What are the tradeoffs of continuous data releases?

    • Not too much hope in mixing and matching quantum chemistry data from different levels of theory

    • How do we create and enforce standards for data?

    • Data has a longer lifetime than models, better to release it and create more sustained value in the community

    • Go beyond static datasets (where we can) into dynamic tasks for validation and continuous release/evaluation

    • General-purpose vs application-driven datasets (i.e. method development is a general application)


What shared resources would best help accelerate the research in this community?

Is there a benefit to collaborating in generating or curating datasets?

  • Quantum chemical datasets? What should their composition be? How big do they need to be? Do they need to be generated by active learning? What level of theory? Does it make sense to mix and match?

  • Molecular dynamics trajectory datasets? What kinds of systems? How much data do we need?

  • Experimental datasets? What kind? How much data do we need?


    • The community needs well-defined tasks and well-defined metrics of success that relate to the objective of the task

    • Avoid duplicate effort, for instance in the integration of ML into a simulator via one-off packages. How many independent neighbor-list codes or ANI-LAMMPS plug-ins do we need?

    • Hardware will follow slowly, if at all, and needs to have many users


How to handle charges and long-range interactions accurately and scalably?

  • Is it just Coulomb? What about many-body interactions?

  • If the long-range part is all multipoles, maybe the ML should instead predict the multipoles (see the sketch after this list)

  • Do we have a QM9 for long-range interactions? (No)
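
As a rough illustration of the "predict electrostatic parameters, leave the long range to physics" idea above, here is a minimal PyTorch sketch. It predicts only monopoles (partial charges) and uses a plain pairwise Coulomb sum rather than Ewald/PME; the `ChargeModel` and its feature inputs are hypothetical.

```python
import torch

class ChargeModel(torch.nn.Module):
    """Hypothetical model: predicts per-atom partial charges (monopoles)
    from atomic features; higher multipoles could be added analogously."""
    def __init__(self, n_features):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(n_features, 64), torch.nn.SiLU(),
            torch.nn.Linear(64, 1),
        )

    def forward(self, atom_features):
        q = self.net(atom_features).squeeze(-1)   # (n_atoms,)
        return q - q.mean()                       # crude enforcement of overall neutrality

def coulomb_energy(charges, positions):
    """Pairwise Coulomb energy in atomic units (no cutoff, no periodic treatment)."""
    rij = torch.cdist(positions, positions)                    # (n_atoms, n_atoms) distances
    qq = charges[:, None] * charges[None, :]                   # pairwise charge products
    mask = ~torch.eye(len(charges), dtype=torch.bool)          # exclude self-interaction
    return 0.5 * (qq[mask] / rij[mask]).sum()                  # 0.5 corrects double counting

# Usage sketch: total energy = short-range ML potential + physics-based long range
# e_total = short_range_model(features) + coulomb_energy(ChargeModel(n_feat)(features), positions)
```

Higher-order multipoles and a proper periodic treatment would follow the same pattern: the network predicts local electrostatic parameters, and a fixed physical functional form handles the long range.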


How important are quantum nuclear effects? Can we incorporate them into an “effective potential”, or will we need path integral/ring polymer MD for many applications?

  • They might be important. Maybe they are learnable, but the data is lacking (the simulations are expensive)


How to extend machine learning potentials to charge transfer and reactions?

  • And excited states? These pose more transferability challenges because the effects are non-local and correlated


How to optimize neural network calculations (latency, CUDA graphs, tensor cores)? Is speed a problem? What kind of computational infrastructure do we need?

  • Most of the time the challenges are low-level kernels and I/O between CPU and GPU, not just raw FLOPs (see the sketch below)
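
To make the latency point concrete: for small systems, per-kernel launch overhead on the CPU often dominates over raw FLOPs, and CUDA graph capture is one standard mitigation. A minimal PyTorch sketch, assuming a generic model with fixed input shapes (not any specific ML potential):

```python
import torch

model = torch.nn.Sequential(torch.nn.Linear(128, 128), torch.nn.SiLU(),
                            torch.nn.Linear(128, 1)).cuda()
static_in = torch.zeros(32, 128, device="cuda")      # fixed-shape input buffer

# Warm-up on a side stream (required before capture)
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        model(static_in)
torch.cuda.current_stream().wait_stream(s)

# Capture the forward pass once; subsequent replays avoid per-kernel launch overhead
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    static_out = model(static_in)

# At inference time: copy new data into the static buffer and replay the graph
new_features = torch.randn(32, 128, device="cuda")
static_in.copy_(new_features)
g.replay()
print(static_out)   # output of the captured forward pass on the new input
```

The key constraints are static shapes and pre-allocated buffers; the same pattern can in principle cover a whole MD inner step.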


What is the role of quantum computers and special hardware in quantum chemistry?

  • Quantum computers for strongly correlated systems in solid-state physics

  • Orbital-free DFT

  • TPU? What are the tradeoffs in development and acceleration?


Physics-informed neural networks (PINNs) for simulations?

  • Inductive bias and interpretability

  • Invertibility

  • Prevents the model from learning to bypass the simulation altogether
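
For context, the core of a PINN is a loss that combines a data term with the residual of the governing equation evaluated by automatic differentiation. A toy sketch for a harmonic-oscillator ODE (purely illustrative, not a molecular simulation):

```python
import torch

# Network approximating x(t)
net = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 1))

def pinn_loss(t_data, x_data, t_collocation, omega=1.0):
    # Data term: match observed trajectory points
    data_loss = ((net(t_data) - x_data) ** 2).mean()

    # Physics term: residual of x'' + omega^2 x = 0 at collocation points
    t = t_collocation.requires_grad_(True)
    x = net(t)
    dx = torch.autograd.grad(x, t, torch.ones_like(x), create_graph=True)[0]
    d2x = torch.autograd.grad(dx, t, torch.ones_like(dx), create_graph=True)[0]
    physics_loss = ((d2x + omega ** 2 * x) ** 2).mean()

    return data_loss + physics_loss
```

The physics residual is the inductive bias mentioned above: the model is penalized for solutions that fit the data while violating the governing equation.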


DeepMind has shown the value of having a well-defined problem with well-defined metrics of success and well-curated datasets that could then enable ML researchers to work independently to solve them. What other areas in chemistry and materials are ripe for this kind of work? Where should we focus effort in curating data and establishing metrics to enable more of this kind of work in chemistry and materials?

    • Avoid fitting to the test set, which hurts real-world applicability

    • Are the physics fields and communities organized and ready to operate in the same metric-driven way as the ML community? We do not have the culture and infrastructure for this in enhanced sampling. PLUMED is doing this for metadynamics.


What applications will be the first domain to be transformed by machine learning potentials? Drug discovery? Materials?

  • Financial incentives say drug discovery


How do we make it easy for everyone to experiment with new architectures? Are existing ML frameworks enough?

  • What is the flow between innovation, development, and maintenance?


Can we create a library like the deep graph library (DGL) that can provide a unified interface to many architectures for QML potentials, and make it easier to develop new architectures or search over architectures?

  • Not so much for sandboxing, but more for deploying fast methods (see the sketch below)
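
Such a library could center on a small shared interface that every architecture implements, so simulators and benchmarks need only one integration. A hypothetical Python sketch (class and method names are illustrative, not an existing package):

```python
from abc import ABC, abstractmethod
import numpy as np

class MLPotential(ABC):
    """Hypothetical common interface so simulators and benchmarks can swap
    architectures (ANI, SchNet, NequIP, ...) without bespoke glue code."""

    @abstractmethod
    def energy(self, species: np.ndarray, positions: np.ndarray) -> float:
        """Potential energy for one configuration (species codes, Nx3 positions)."""

    @abstractmethod
    def forces(self, species: np.ndarray, positions: np.ndarray) -> np.ndarray:
        """Forces (Nx3), typically the negative gradient of energy()."""

# A simulator or benchmark then only ever sees the interface:
def run_single_point(potential: MLPotential, species, positions):
    return potential.energy(species, positions), potential.forces(species, positions)
```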


How will general machine learning potentials compare in accuracy to active learning on specific systems to generate bespoke models?

  • General-purpose models will most likely be bigger and thus slower, so bespoke models and active learning will result in faster models

  • Careful benchmarking of general-purpose vs. bespoke models is needed

  • Transfer learning, multi-fidelity training, and fine-tuning can get the best of both (see the sketch below)
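
As an illustration of the fine-tuning route, one could start from a pre-trained general-purpose potential, freeze its feature layers, and refit only the output head on a small system-specific dataset. A minimal PyTorch sketch with hypothetical `features`/`head` modules standing in for a real pre-trained model:

```python
import torch

# Hypothetical pre-trained general-purpose potential: feature extractor plus energy head
features = torch.nn.Sequential(torch.nn.Linear(64, 256), torch.nn.SiLU(),
                               torch.nn.Linear(256, 256), torch.nn.SiLU())
head = torch.nn.Linear(256, 1)
# features.load_state_dict(...); head.load_state_dict(...)   # load pre-trained weights here

# Freeze the feature layers; only the head sees the bespoke data
for p in features.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(head.parameters(), lr=1e-4)

def finetune_step(x_bespoke, e_bespoke):
    """One gradient step on a small system-specific dataset
    (x: per-configuration descriptors, e: reference energies)."""
    optimizer.zero_grad()
    e_pred = head(features(x_bespoke)).squeeze(-1)
    loss = torch.nn.functional.mse_loss(e_pred, e_bespoke)
    loss.backward()
    optimizer.step()
    return loss.item()
```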


Is there a way that model distillation could be useful to generate faster ML models for inference?

  • Other domains do it

  • Fewer weights do not necessarily mean faster inference or a smaller memory footprint, and sparsifying does not necessarily accelerate models
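
A minimal sketch of what distillation could look like here: a small, fast student is trained to reproduce the energies of a large, accurate teacher on unlabeled configurations (forces could be matched analogously). The model definitions below are placeholders, not any particular architecture.

```python
import torch

# Assumed models: a large, accurate teacher and a small, fast student
teacher = torch.nn.Sequential(torch.nn.Linear(64, 512), torch.nn.SiLU(),
                              torch.nn.Linear(512, 512), torch.nn.SiLU(),
                              torch.nn.Linear(512, 1)).eval()
student = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.SiLU(),
                              torch.nn.Linear(64, 1))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

def distill_step(x_batch):
    """Train the student to reproduce the teacher's energies on unlabeled configurations."""
    with torch.no_grad():
        e_teacher = teacher(x_batch)          # expensive labels generated once per batch
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(student(x_batch), e_teacher)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Whether the student is actually faster in practice depends on the architecture and hardware, which is exactly the caveat in the last bullet above.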