EPFL’s Synthegy ranks synthesis routes in line with chemists 71% of the time

EPFL’s Synthegy uses large language models to score retrosynthesis plans against plain-English chemist goals, matching expert selections 71.2% of the time in a study of 36 chemists and 368 evaluations.

EPFL researchers led by Philippe Schwaller published a paper this week in Matter describing Synthegy, a framework that uses large language models to rank chemical synthesis routes against plain-English instructions from chemists.

Synthegy pairs existing retrosynthesis software with an LLM reasoning layer. A chemist types a directive such as “form the pyrimidine ring in the early stages.” The retrosynthesis engine generates many candidate routes, Synthegy converts each route into text, and the language model scores how well each route meets the instruction. The highest-scoring routes are returned with written explanations.
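The workflow described above can be sketched in a few lines of Python. This is an illustrative outline only, not the code from the Synthegy repository: the names `route_to_text`, `rank_routes` and `toy_score_fn` are hypothetical, and the scoring function is a keyword-based stand-in for the LLM call so the example runs without an API key.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ScoredRoute:
    route_text: str
    score: float
    explanation: str


def route_to_text(steps: List[str]) -> str:
    """Render a route (a list of reaction descriptions) as numbered plain text."""
    return "\n".join(f"Step {i}: {step}" for i, step in enumerate(steps, start=1))


def rank_routes(
    routes: List[List[str]],
    instruction: str,
    score_fn: Callable[[str, str], Tuple[float, str]],
    top_k: int = 3,
) -> List[ScoredRoute]:
    """Score every candidate route against the instruction and keep the best ones."""
    scored = []
    for steps in routes:
        text = route_to_text(steps)
        score, explanation = score_fn(instruction, text)
        scored.append(ScoredRoute(text, score, explanation))
    scored.sort(key=lambda r: r.score, reverse=True)
    return scored[:top_k]


def toy_score_fn(instruction: str, route_text: str) -> Tuple[float, str]:
    """Hypothetical stand-in for the LLM: rewards an early pyrimidine-forming step.

    It ignores the instruction text; a real scorer would send both arguments
    to a language model and parse a score and explanation from the reply.
    """
    lines = route_text.splitlines()
    for position, line in enumerate(lines):
        if "pyrimidine" in line.lower():
            score = 10.0 * (1 - position / len(lines))
            return score, f"Pyrimidine formation found at step {position + 1}."
    return 0.0, "No pyrimidine-forming step found."


candidate_routes = [
    ["couple fragments", "form pyrimidine ring"],
    ["form pyrimidine ring", "couple fragments"],
]
best = rank_routes(
    candidate_routes,
    "form the pyrimidine ring in the early stages",
    toy_score_fn,
    top_k=1,
)
print(best[0].route_text)
print(best[0].explanation)
```

Passing the scorer in as a function mirrors the modular design the article describes: the ranking loop does not need to know which model, or which retrosynthesis engine, sits behind it.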

The team validated Synthegy in a double-blind study that presented 368 pairs of routes to 36 independent chemists. The system’s top choices matched the chemists’ selections 71.2% of the time. Senior researchers, including professors and research scientists, matched Synthegy’s selections at higher rates than PhD students.
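For readers who want to see what a match rate like this measures, here is a minimal sketch of how an agreement rate is computed from paired choices. The function name and the sample data are made up for illustration. Note also that 71.2% of 368 works out to roughly 262 matching evaluations; that count is a back-calculation that assumes every evaluation is weighted equally, not a figure reported in the article.

```python
from typing import Sequence


def agreement_rate(system_choices: Sequence[str], chemist_choices: Sequence[str]) -> float:
    """Fraction of evaluations in which the system and the chemist picked the same route."""
    if len(system_choices) != len(chemist_choices):
        raise ValueError("Both sequences must cover the same evaluations.")
    if not system_choices:
        raise ValueError("At least one evaluation is required.")
    matches = sum(s == c for s, c in zip(system_choices, chemist_choices))
    return matches / len(system_choices)


# Made-up example: four route pairs, each labelled by the preferred route.
rate = agreement_rate(["A", "B", "A", "A"], ["A", "B", "B", "A"])
print(f"{rate:.1%}")

# Back-calculated check of the headline figure (an inference, see above).
print(round(262 / 368 * 100, 1))
```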

The researchers benchmarked several models. Gemini-2.5-pro achieved the highest score in their tests. DeepSeek-r1 was noted as a strong open-source option that can run locally. The group also evaluated GPT-4o and Claude.

Synthegy can also analyze reaction mechanisms by breaking reactions into elementary electron-movement steps and asking the language model to assess each step for chemical plausibility. On straightforward reactions such as nucleophilic substitutions, top models reached near-perfect accuracy.
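The step-by-step check can be pictured as follows. This is a sketch under stated assumptions: `assess_mechanism` and `toy_judge` are hypothetical names, the judge is a simple rule standing in for a language-model call, and treating a mechanism as plausible only when every step passes is one possible aggregation rule, not necessarily the one used in the paper.

```python
from typing import Callable, List, Tuple


def assess_mechanism(
    steps: List[str], judge: Callable[[str], bool]
) -> Tuple[bool, List[int]]:
    """Judge each elementary step; return overall verdict and indices of flagged steps."""
    flagged = [i for i, step in enumerate(steps) if not judge(step)]
    return (not flagged, flagged)


def toy_judge(step: str) -> bool:
    """Hypothetical stand-in for the LLM: rejects steps that mention a pentavalent carbon."""
    return "pentavalent carbon" not in step.lower()


# A nucleophilic substitution written as two elementary electron movements.
sn2_steps = [
    "Nucleophile lone pair attacks the electrophilic carbon",
    "C-Br bonding pair moves onto bromine as the leaving group departs",
]
plausible, flagged_steps = assess_mechanism(sn2_steps, toy_judge)
print(plausible, flagged_steps)
```

Returning the indices of flagged steps, rather than a single yes or no, keeps the output useful to a chemist who wants to know where a proposed mechanism breaks down.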

Evaluating roughly 60 candidate routes takes about 12 minutes and incurs approximately $2–3 in API fees, according to the paper. The authors have published the code and benchmarking data on GitHub at github.com/schwallergroup/steer.
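Those figures imply about 12 seconds and roughly 3 to 5 cents per route. The short sketch below works through that arithmetic and extrapolates to larger batches. The linear scaling is an assumption made here for illustration; the article does not say how time or cost grows with the number of routes, and the function names are hypothetical.

```python
# Figures quoted in the article.
ROUTES = 60
MINUTES = 12
COST_LOW_USD, COST_HIGH_USD = 2.0, 3.0


def per_route_estimates():
    """Return (seconds, low cost, high cost) per route from the quoted totals."""
    seconds = MINUTES * 60 / ROUTES
    return seconds, COST_LOW_USD / ROUTES, COST_HIGH_USD / ROUTES


def extrapolate(n_routes: int):
    """Estimate (minutes, low cost, high cost) for n_routes, ASSUMING linear scaling."""
    minutes = MINUTES * n_routes / ROUTES
    low = COST_LOW_USD * n_routes / ROUTES
    high = COST_HIGH_USD * n_routes / ROUTES
    return minutes, low, high


seconds, low, high = per_route_estimates()
print(f"{seconds:.0f} s per route, ${low:.3f} to ${high:.3f} per route")

minutes, low_total, high_total = extrapolate(600)
print(f"600 routes: about {minutes:.0f} min, ${low_total:.0f} to ${high_total:.0f}")
```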

The paper lists several limitations. Language models can misread the textual representation of a reaction and infer the wrong reaction direction, producing incorrect feasibility judgments. Smaller models in the tests performed at or near random. Routes longer than about 20 steps degraded model performance and coherence. The framework depends on the quality of the retrosynthesis engine that generates candidate routes.

Andres M. Bran, lead author, noted in an EPFL statement: “When making tools for chemists, the user interface matters a lot, and previous tools relied on cumbersome filters and rules.” The team designed Synthegy to be modular so it can connect to different retrosynthesis backends and use various LLMs on the reasoning side.

The paper identifies drug discovery, materials design and industrial reaction optimization as potential application areas where ranking strategic synthesis routes could affect laboratory decision-making.
