Mokslas, studijos ir ekonomika: How do organic synthesis laboratories work? Large language models direct an automated chemistry laboratory that works the same way

Wednesday, April 3, 2024


"Automation of chemistry research has focused on developing robots to execute jobs.

Artificial-intelligence technology has now been used not only to control robots, but also to plan their tasks on the basis of simple human prompts.

Chemistry research is grounded in iterative cycles in which experiments are designed, executed and then refined to achieve a particular goal. The experience and intuition of researchers have a crucial role in working out the initial design and in the subsequent optimization process, something that could not previously have been replicated in autonomous systems that carry out chemistry research. Writing in Nature, Boiko et al. [1] report an artificial intelligence (AI) agent named Coscientist that can plan and orchestrate multiple tasks in the chemistry-research cycle without detailed human input, bringing the vision of self-driving laboratories a step closer to reality.

Work done by chemists is multipronged — it requires not only technical skills to execute chemical reactions, but also knowledge to plan them. For example, designing an organic synthesis might involve carrying out retrosynthetic analyses (working backwards from the target molecule to identify simpler precursor molecules), searching databases for suitable reaction conditions and selecting the reactions that are most likely to achieve a pre-established research goal, such as maximizing product yield. But chemical reactions often fail to provide the product in acceptable yields, and the iterative process of searching the literature, working out what the next experiment (or experiments) should be and executing them can rapidly become cumbersome.
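
The "working backwards" idea behind retrosynthetic analysis can be illustrated with a small recursive lookup. This is only a sketch: the two-entry disconnection table and the list of purchasable chemicals are hand-written assumptions for illustration, whereas real planning tools search large reaction databases and rank many competing routes.

```python
from __future__ import annotations

# Toy one-step disconnections (product -> precursors). Illustrative only,
# not taken from a reaction database.
DISCONNECTIONS = {
    "aspirin": ["salicylic acid", "acetic anhydride"],
    "salicylic acid": ["phenol", "carbon dioxide"],
}

# Assumed starting materials at which the search stops.
PURCHASABLE = {"phenol", "carbon dioxide", "acetic anhydride"}


def retrosynthesis(target: str, depth: int = 0, max_depth: int = 5) -> list[str]:
    """Return an indented precursor tree, working backwards from the target."""
    lines = ["  " * depth + target]
    if target in PURCHASABLE or depth >= max_depth:
        return lines
    for precursor in DISCONNECTIONS.get(target, []):
        lines.extend(retrosynthesis(precursor, depth + 1, max_depth))
    return lines


tree = retrosynthesis("aspirin")
print("\n".join(tree))
```

Each level of indentation in the printed tree is one step further back from the target molecule.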

Chemists have, therefore, long aspired to develop automated systems to facilitate their work [2]. One of the first successes was the development of pipetting robots, which can be programmed to set up new reactions or to add reagents to vessels at specified times. Some robots are now reasonably affordable and have been adopted by many laboratories, freeing up researchers to focus on more intellectually challenging tasks.


In parallel, AI has made strides in chemistry, guiding decision-making in planning tasks that could hardly be automated just a few years ago (see ref. 3, for example). Nevertheless, those AI tools are typically trained to execute a single operation — a general understanding of various aspects of chemical research is beyond their capabilities. These limitations have frustrated the dream of establishing a work environment in which people supervise robots that are capable of planning and executing experiments autonomously.

However, the advent of generative pre-trained transformers (GPTs), which are the workhorses behind chatbots such as ChatGPT, suddenly provided chemists with an important piece of the automation puzzle.

By ‘understanding’ natural human language, GPTs allow machines to interact with people and thereby provide solutions to specific questions.

These large language models are useful for a wide range of topics but their proficiency in chemistry is subpar, and they require the implementation of additional tricks — fine-tuning of the models — to become effective for chemistry applications.

With that in mind, Boiko et al. now explore whether it is possible to string together fine-tuned GPTs to orchestrate self-driving labs using a single human prompt such as “Can you synthesize molecule A?” This requires not only an understanding of the question, but also a determination of the tasks that must be performed to complete the assignment successfully.

In brief, the AI Coscientist consists of modules that: assist literature searching to work out synthetic pathways and decide on experimental protocols; write code to enable communication between the modules; and search hardware documentation so that robots can be triggered to carry out experiments remotely. Boiko et al. benchmarked Coscientist’s web-searching capabilities by asking it to identify synthetic procedures for seven molecules that posed different levels of complexity. Those examples included blockbuster drugs, such as paracetamol, aspirin and ibuprofen, but also other chemicals. Coscientist performed better than other GPTs by reliably generating detailed and chemically accurate synthetic procedures.
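
The modular layout described above can be pictured as a dispatcher that passes work from one module to the next. The sketch below is an assumption-laden illustration, not Coscientist's actual code: the module names, the fixed pipeline order and the stubbed return strings are all hypothetical, and in the real system a language model decides which module to call next.

```python
from __future__ import annotations

from typing import Callable

Module = Callable[[str], str]


def search_literature(query: str) -> str:
    # Hypothetical stub; a real module would search the web or a database.
    return f"[stub] synthesis route for: {query}"


def search_hardware_docs(route: str) -> str:
    # Hypothetical stub; a real module would look up the robot's documentation.
    return f"[stub] robot API notes for: {route}"


def write_code(notes: str) -> str:
    # Hypothetical stub; a real module would generate a protocol script.
    return f"[stub] protocol script for: {notes}"


def run_experiment(script: str) -> str:
    # Hypothetical stub; a real module would send the script to the hardware.
    return f"[stub] executed: {script}"


MODULES: dict[str, Module] = {
    "literature": search_literature,
    "documentation": search_hardware_docs,
    "code": write_code,
    "experiment": run_experiment,
}

# Fixed order standing in for the planner's decisions.
PIPELINE = ["literature", "documentation", "code", "experiment"]


def orchestrate(prompt: str) -> list[tuple[str, str]]:
    """Feed the prompt through each module, logging every intermediate result."""
    log = []
    message = prompt
    for name in PIPELINE:
        message = MODULES[name](message)
        log.append((name, message))
    return log


log = orchestrate("Can you synthesize molecule A?")
for name, output in log:
    print(f"{name}: {output}")
```

Keeping each module behind the same string-in, string-out interface is what lets a single planner chain them without knowing their internals.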

More interestingly, Coscientist was able to design protocols and coordinate the execution of two types of reaction, known as Sonogashira and Suzuki–Miyaura cross coupling, both of which are often used in drug discovery to form carbon–carbon bonds. Once it had identified the reaction partners needed for the two types of cross coupling, Coscientist correctly calculated the amounts needed and programmed a pipetting robot with access to stock solutions of chemicals to mix them. The reactions successfully afforded the intended products. Not only that, Coscientist made choices about which reagents to use on the basis of chemically sensible reactivity rules.
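
Working out "the amounts needed" from stock solutions is ordinary stoichiometry: the moles of each reagent divided by its stock concentration gives the volume to pipette. A minimal sketch follows, assuming a hypothetical reaction scale, equivalents and stock concentrations; none of these numbers come from Boiko et al.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Reagent:
    name: str
    equivalents: float     # molar equivalents relative to the limiting reagent
    stock_molarity: float  # stock solution concentration in mol/L


def transfer_volumes_ul(limiting_mmol: float, reagents: list[Reagent]) -> dict[str, float]:
    """Return the stock volume, in microlitres, to pipette for each reagent."""
    volumes = {}
    for reagent in reagents:
        mmol = limiting_mmol * reagent.equivalents
        # mmol divided by mol/L gives mL; multiply by 1000 for microlitres.
        volumes[reagent.name] = round(mmol / reagent.stock_molarity * 1000, 1)
    return volumes


# Hypothetical Suzuki-Miyaura setup; names and values are illustrative only.
plan = transfer_volumes_ul(
    limiting_mmol=0.02,
    reagents=[
        Reagent("aryl halide", equivalents=1.0, stock_molarity=0.5),
        Reagent("boronic acid", equivalents=1.2, stock_molarity=0.5),
        Reagent("base", equivalents=2.0, stock_molarity=1.0),
    ],
)
print(plan)
```

A dictionary of volumes like this is the kind of input a pipetting-robot script would then turn into individual transfer commands.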


As a final example, Coscientist was tasked with optimizing reactions to maximize product yields, in a process that involved iteratively suggesting reaction conditions and using the outcomes to propose better experiments. Its performance compared favourably to that of Bayesian optimization (an established machine-learning method) when supplied with as few as ten example reactions. When the GPT was not primed with examples, its initial suggestions for reaction conditions were sometimes poor. But when examples were available, subsequent suggestions quickly improved with each iteration — demonstrating Coscientist’s ability to acquire knowledge and adapt its reasoning over time.
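
The suggest, run and refine cycle described above has the shape of a simple closed loop. The sketch below is a stand-in under stated assumptions: the yield function is a made-up toy surface, and the proposal step is a random perturbation of the best conditions so far, which is neither a language model nor Bayesian optimization. It only shows how outcomes from earlier rounds feed the next suggestion.

```python
import random


def simulated_yield(temp_c: float, equiv_base: float) -> float:
    """Toy stand-in for running a reaction; peaks at 80 degC and 2.0 equiv of base."""
    penalty = 0.05 * (temp_c - 80.0) ** 2 + 40.0 * (equiv_base - 2.0) ** 2
    return max(0.0, 100.0 - penalty)


def propose(history, rng):
    """Stand-in for the suggestion step: perturb the best conditions seen so far."""
    if not history:
        # No prior examples: the first guess is arbitrary and may be poor.
        return rng.uniform(20.0, 120.0), rng.uniform(0.5, 4.0)
    (temp_c, equiv_base), _ = max(history, key=lambda item: item[1])
    return temp_c + rng.gauss(0.0, 10.0), max(0.1, equiv_base + rng.gauss(0.0, 0.3))


def optimize(n_rounds: int, seed: int = 0):
    """Run the propose-and-observe loop; return ((temp, equiv), yield) pairs."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_rounds):
        conditions = propose(history, rng)
        history.append((conditions, simulated_yield(*conditions)))
    return history


history = optimize(n_rounds=20)

# Track the best yield found up to each round.
best_so_far = []
for _, observed in history:
    best_so_far.append(max(observed, best_so_far[-1]) if best_so_far else observed)

print(f"best after round 1: {best_so_far[0]:.1f}, after round 20: {best_so_far[-1]:.1f}")
```

Replacing `propose` with a model that reads the full history is where a language-model agent or a Bayesian optimizer would plug in.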

Boiko and colleagues’ findings provide a robust proof of principle that the current version of Coscientist can semi-autonomously conduct experiments. However, it still has some limitations. As pointed out by the authors, chemically incorrect responses are sometimes obtained. But these can be mitigated by using sophisticated prompting strategies (such as chain of thought [4] and tree of thoughts [5]) alongside chemistry-focused data sources. It should also be noted that real-world scenarios involve much more complex research questions than those tackled in this study, often involving concepts from disciplines other than chemistry, such as biology in the case of drug development. Such complex questions are currently beyond Coscientist’s reach.

Taken together, the presented examples are a crucial step towards the establishment of self-driving labs. However, Coscientist and other forthcoming AI technologies must mature before researchers can fully understand their shortcomings and how they can best be used in science. Provided that the potential for misuse of large language models in chemistry does not lead to the introduction of suffocating regulations that stifle research, we expect many more exciting developments in the near future." [6]

References

1. Boiko, D. A., MacKnight, R., Kline, B. & Gomes, G. Nature 624, 570–578 (2023).

2. Seifrid, M. et al. Acc. Chem. Res. 55, 2454–2466 (2022).

3. Segler, M. H. S., Preuss, M. & Waller, M. P. Nature 555, 604–610 (2018).

4. Wei, J. et al. Preprint at https://arxiv.org/abs/2201.11903 (2023).

5. Yao, S. et al. Preprint at https://arxiv.org/abs/2305.10601 (2023).

Competing Interests

T.R. is a consultant to the pharmaceutical, biotechnology and tech industries, and also a full member of the Acceleration Consortium, University of Toronto.

6. Large language models direct automated chemistry laboratory. By Ana Laura Dias & Tiago Rodrigues. Nature 624, 530-531 (2023)
