Kevin Jablonka

FSU Jena, DE



Kevin Jablonka obtained his bachelor’s degree in chemistry at TU Munich. He joined EPFL for his master’s studies (and an extended study degree in applied machine learning), after which he joined Berend Smit’s group for a Ph.D. He now leads a research group at the Helmholtz Institute for Polymers in Energy Applications of the University of Jena and the Helmholtz Center Berlin. Kevin’s research interests are in the digitization of chemistry. For this, he has been contributing to the cheminfo electronic lab notebook ecosystem. He also developed a toolbox for digital reticular chemistry. Using tools from this toolbox, he addressed questions from the atom to the pilot-plant scale. Kevin is also interested in using large language models in chemistry and co-leads the ChemNLP project (with support from and Stability.AI).


Large language models for materials science


In this session, we will discuss how large language models (e.g., decoder-only architectures like GPT) work and how we can apply them in materials science and chemistry. We will build a simple model from scratch in Python and discuss aspects (e.g., tokenization) that are specific to chemistry and materials science data. In the second part, we will delve deeper into tool-augmented language models, in which we no longer rely on the reasoning of an LLM alone but give the LLM access to robust tools and knowledge bases.
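To illustrate why tokenization needs care for chemistry data, here is a minimal sketch of a regex-based SMILES tokenizer. It is not taken from the session material; the pattern follows a widely used style, and the function name `tokenize_smiles` is an assumption made for this example. The point is that a naive character-level split would break `Cl` into `C` and `l`, and would split bracket atoms such as `[Na+]` into meaningless pieces.

```python
import re

# Regex in a commonly used style for SMILES: bracket atoms, two-letter
# halogens, and two-digit ring closures (%NN) are kept as single tokens.
SMILES_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOSPFI]|[bcnosp]|[=#\-\+\\/\(\)\.:~@\?\*\$]|\d)"
)


def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into chemically meaningful tokens.

    Raises ValueError if some characters are not covered by the pattern,
    so that unknown symbols are not silently dropped.
    """
    tokens = SMILES_PATTERN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"Could not tokenize SMILES completely: {smiles!r}")
    return tokens


# Acetyl chloride: 'Cl' stays one token instead of 'C' followed by 'l'.
acetyl_chloride_tokens = tokenize_smiles("CC(=O)Cl")
# Sodium chloride as two bracket atoms separated by '.'.
salt_tokens = tokenize_smiles("[Na+].[Cl-]")
```

A tokenizer like this yields a small, chemically meaningful vocabulary, which is one possible design choice; general-purpose subword tokenizers trained on natural text are another, with different trade-offs.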


1 | Introduction to LLMs and their use in materials science
We will address the following questions:

  • What are "foundation models"?

  • What are self-supervision, fine-tuning, and RLHF?

  • What is in-context learning?

  • How can I use those models in materials science and why might this be useful?
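As a rough illustration of in-context learning, the sketch below assembles a few-shot prompt: a handful of solved examples followed by an unsolved query, which a model would be asked to complete without any weight updates. The helper name `build_few_shot_prompt`, the prompt layout, and the SMILES-to-name task are assumptions chosen for this example, not the format used in the session.

```python
def build_few_shot_prompt(examples, query, input_label="SMILES", output_label="Name"):
    """Format (input, output) pairs plus one open query as a text prompt."""
    lines = []
    for example_input, example_output in examples:
        lines.append(f"{input_label}: {example_input}")
        lines.append(f"{output_label}: {example_output}")
        lines.append("")  # blank line separates the demonstrations
    # The query repeats the pattern but leaves the answer open.
    lines.append(f"{input_label}: {query}")
    lines.append(f"{output_label}:")
    return "\n".join(lines)


# Demonstrations the model can pattern-match against at inference time.
examples = [
    ("CCO", "ethanol"),
    ("CC(=O)O", "acetic acid"),
    ("c1ccccc1", "benzene"),
]
prompt = build_few_shot_prompt(examples, "CO")
print(prompt)
```

The resulting string would be sent to a language model as-is; the same pattern applies to property prediction by swapping the output label for a property name.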
2 | Tutorial: Building an LLM from scratch
We will build a GPT-style model on molecules from scratch using simple Python code. 
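The core building block of a GPT-style (decoder-only) model is causal self-attention, in which each token may only attend to itself and earlier tokens. The following is a minimal sketch in dependency-free Python, assuming a single attention head and no learned projections; the function names are chosen for this example and it is not the tutorial's actual code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(queries, keys, values):
    """Single-head scaled dot-product attention with a causal mask.

    Each argument is a list of T vectors (lists of floats).
    Returns (outputs, weights); weights[t][s] is how much position t
    attends to position s, and is zero for every s > t.
    """
    d_k = len(keys[0])
    seq_len = len(queries)
    outputs, weights = [], []
    for t, query in enumerate(queries):
        # Causal mask: position t only sees positions 0..t.
        scores = [
            sum(q * k for q, k in zip(query, keys[s])) / math.sqrt(d_k)
            for s in range(t + 1)
        ]
        attn = softmax(scores)
        output = [
            sum(attn[s] * values[s][j] for s in range(t + 1))
            for j in range(len(values[0]))
        ]
        weights.append(attn + [0.0] * (seq_len - t - 1))
        outputs.append(output)
    return outputs, weights


# Toy "embeddings" for three tokens; a real model would first apply
# learned linear projections to obtain queries, keys, and values.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_self_attention(x, x, x)
```

Because of the mask, the first token can only attend to itself, so its output equals its own value vector. A full model stacks several such layers with feed-forward blocks and is trained to predict the next token.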

3 | Advanced applications of LLMs
We will address the following questions:

  • How can we give models access to knowledge bases?

  • How can we make models use tools?

  • How has this been used in chemistry?

4 | Tutorial: Building a tool-augmented system from scratch
We will use simple Python code and the OpenAI API to let a model use external tools such as RDKit or Web APIs.
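The control flow of such a system can be sketched without any network access: the model emits a structured tool request, ordinary Python code executes it, and the result is handed back to the model as an observation. In the sketch below, the model output is simulated with a hard-coded JSON string, the JSON layout is a simplified assumption rather than the exact OpenAI API schema, and `molecular_weight` is a pure-Python stand-in for what would more robustly be an RDKit call in practice.

```python
import json
import re

# Standard atomic weights (rounded) for a few elements; stand-in data
# so the example runs without RDKit installed.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_weight(formula: str) -> float:
    """Stand-in tool: molecular weight in g/mol from a simple formula.

    Handles only flat formulas such as 'C2H6O' (no parentheses) and
    raises KeyError for elements missing from ATOMIC_MASS.
    """
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return round(total, 3)


# Registry mapping tool names (as exposed to the model) to functions.
TOOLS = {"molecular_weight": molecular_weight}


def dispatch_tool_call(tool_call_json: str) -> str:
    """Run the tool requested by the model and return a JSON observation."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**call["arguments"])
    return json.dumps({"tool": name, "result": result})


# Simulated model output; in a real system this would come from the LLM.
simulated_model_output = (
    '{"name": "molecular_weight", "arguments": {"formula": "C2H6O"}}'
)
observation = dispatch_tool_call(simulated_model_output)
print(observation)
```

In a complete loop, the observation would be appended to the conversation and the model called again, so that it can base its final answer on the tool result (here, roughly 46.07 g/mol for ethanol) instead of recalling or guessing the number.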