Up to 5x Boost in In-Silico Drug Discovery with a Multimodal LLM

The new LLM allows the client’s research team to explore molecular properties and relationships more effectively.

Meet our client

Client:

Multi-billion-dollar industry leader

Industry:

Healthcare / Pharma, Software & Technology

Market:

Europe

Technology:

LLM

In a Nutshell

Client’s Challenge

The client sought to leverage their knowledge base of in-vivo experiments to boost their in-silico molecule discovery efforts. They needed a solution that could integrate their existing data to accelerate the discovery of new molecules.

Our Solution

Working with the client’s in-silico drug discovery team, we developed a pipeline to train multimodal LLMs with a deep understanding of chemistry. We combined Llama 3.1 with Graphormer so that molecular structures enter the LLM as graphs, producing a model that can analyze text, SMILES strings, and molecular graphs.

Client’s Benefits

The new LLM allows the client’s research team to explore molecular properties and relationships more effectively. It can also predict the properties of new molecules generated through chemical reactions, even those it has never seen before, significantly enhancing in-silico drug discovery capabilities.

A Deep Dive

1. Overview

Our client, a global pharmaceutical technology leader, wanted to speed up in-silico molecule discovery by integrating their vast database of in-vivo experiments into an advanced AI model. The challenge? Traditional LLMs don’t understand molecular structures, and Graphormer training was too slow, taking over a month to process large datasets.

We built a multimodal LLM that integrates Graphormer with Llama 3.1, so researchers can analyze chemical reactions through text, SMILES, and molecular graphs, all within a simple chat interface. To remove the training bottleneck, we engineered custom CUDA kernels that run Graphormer’s attention mechanism 2-5x faster and cut overall training time by 50%.

Now, chemists without coding expertise can use the assistant to predict molecule properties, and the research team can run faster, more scalable experiments in AI-driven drug discovery.

Key Outcomes:

Developed a multimodal LLM integrating Graphormer for molecular understanding.
Achieved a 2-5x speedup in Graphormer’s attention computation via custom CUDA kernels, halving overall training time.
Built a chemistry-aware AI assistant for predicting chemical reaction outcomes.
Designed a real-time, interactive dashboard for experiment tracking and analysis.

2. Client

A leading global pharmaceutical company focused on medical innovation and medtech development.

Industry: Pharmaceutical / Healthcare
Global Presence: Multinational operations in over 60 countries, serving millions of patients globally.

Achievements & Context:

Pioneering AI-driven solutions in digital health.
Runs a vast online healthcare platform with services like telemedicine, doctor reviews, and AI-assisted diagnostics.
Invests heavily in AI-powered drug discovery to accelerate molecule identification and reduce R&D costs.

3. Challenges

Business Challenge

The client aimed to reduce the high costs of traditional drug discovery by enhancing their in-silico research division. However, chemists with limited ML expertise struggled to apply advanced AI models for predicting chemical reactions.

Technology Challenge

LLMs lacked molecular understanding – Conventional AI models couldn’t accurately interpret chemical structures.
Graphormer training was slow – Training runs took more than a month, largely because of the cost of Graphormer’s specialized attention mechanism.
No structured visualization tools – The client lacked an interactive way to compare and analyze molecular properties efficiently.

4. Solution

AI Assistant for Chemists

We developed a bespoke LLM-powered AI assistant, allowing chemists to interact with chemical models through a simple chat-based interface:

Integrated molecular embeddings from Graphormer for chemistry-aware AI.
Enabled direct reaction querying to predict molecule properties and interactions.
No-code AI access for chemists, removing technical barriers.

The multimodal Chemical LLM provides an intuitive chat interface for chemists and transforms millions of previously untapped historical natural language experiment records into actionable insights that significantly accelerate drug discovery.
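To make the idea of feeding molecular embeddings into the LLM more concrete, here is a minimal sketch of one common pattern in multimodal models: project each atom embedding from the graph encoder into the LLM's embedding width, then splice the projected vectors into the token sequence in place of a placeholder token. This is an illustration under assumptions, not the client's implementation. The function names, the placeholder position, the toy dimensions, and the random weights are all hypothetical.

```python
import random

def linear(vec, weight, bias):
    """Apply y = W x + b, with weight stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]

def splice_molecule(token_embs, placeholder_pos, node_embs, weight, bias):
    """Replace one placeholder token embedding with projected atom embeddings."""
    projected = [linear(node, weight, bias) for node in node_embs]
    return token_embs[:placeholder_pos] + projected + token_embs[placeholder_pos + 1:]

GRAPH_DIM, LLM_DIM = 4, 6  # toy sizes; real models are far wider
rng = random.Random(0)

# Hypothetical projection layer (learned during training in a real system).
weight = [[rng.uniform(-1, 1) for _ in range(GRAPH_DIM)] for _ in range(LLM_DIM)]
bias = [0.0] * LLM_DIM

# Five text-token embeddings; index 2 stands in for a molecule placeholder token.
token_embs = [[rng.uniform(-1, 1) for _ in range(LLM_DIM)] for _ in range(5)]
# Three atom embeddings, as a graph encoder might produce for a 3-atom molecule.
node_embs = [[rng.uniform(-1, 1) for _ in range(GRAPH_DIM)] for _ in range(3)]

fused = splice_molecule(token_embs, 2, node_embs, weight, bias)
print(len(fused), len(fused[0]))  # 7 vectors, each of width 6
```

The LLM then attends over text tokens and atom vectors in a single sequence, which is what allows a chat prompt to refer to a specific molecule.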

Graphormer Training Optimization

To accelerate model training, we:

Implemented custom CUDA kernels for Graphormer attention mechanisms.
Optimized forward & backward passes, achieving a 2-5x speedup over PyTorch’s implementation.
Reduced overall training time by 50%, significantly improving AI-driven research efficiency.
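For readers unfamiliar with why Graphormer attention needs special handling, the sketch below shows the computation in plain Python. Graphormer adds a bias to each attention score, looked up from the shortest-path distance between the two atoms, before the softmax. The first function materializes every score, as a straightforward implementation would. The second computes the same result in a single pass using a running maximum and running sum (the "online softmax" trick used by fused attention kernels). Which techniques the custom CUDA kernels actually use is not stated here, so treat the single-pass variant as an assumption chosen for illustration. All names and toy values are hypothetical.

```python
import math

def biased_attention(q, k, v, bias):
    """Reference: compute all scores for a query, then softmax, then mix values."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) + bias[i][j]
                  for j, kj in enumerate(k)]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out

def fused_biased_attention(q, k, v, bias):
    """Single pass per query: never stores the full row of scores."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        running_max, running_sum = -math.inf, 0.0
        acc = [0.0] * len(v[0])
        for j, (kj, vj) in enumerate(zip(k, v)):
            s = sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) + bias[i][j]
            new_max = max(running_max, s)
            scale = math.exp(running_max - new_max)  # rescale earlier partial sums
            w = math.exp(s - new_max)
            running_sum = running_sum * scale + w
            acc = [a * scale + w * x for a, x in zip(acc, vj)]
            running_max = new_max
        out.append([a / running_sum for a in acc])
    return out

# Toy 3-atom chain: the bias depends only on shortest-path distance.
distance = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
bias_by_distance = {0: 0.0, 1: -0.5, 2: -1.0}  # stand-ins for learned scalars
bias = [[bias_by_distance[dist] for dist in row] for row in distance]

q = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]]
k = [[0.5, 0.1], [-0.3, 0.2], [0.0, 0.7]]
v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

reference = biased_attention(q, k, v, bias)
fused = fused_biased_attention(q, k, v, bias)
max_diff = max(abs(a - b) for ra, rb in zip(reference, fused) for a, b in zip(ra, rb))
print(max_diff < 1e-9)
```

On a GPU, the benefit of the single-pass form is that the score matrix and the bias lookup never need to be written to and re-read from global memory.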

Custom In-Silico Drug Discovery Dashboard

To streamline research insights, we built a dedicated AI-powered dashboard with:

Custom interactive visualizations for experimental results.
Optimized queries & real-time rendering to handle millions of data points.
Domain-specific charts for chemists to analyze molecular relationships.
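One common way to keep a chart responsive with millions of points is to aggregate them into a fixed grid before rendering, so the browser draws a bounded number of cells instead of every point. The sketch below shows that idea in plain Python. It is an assumption about how such a dashboard could work, not a description of the delivered one, and the function name and parameters are hypothetical.

```python
from collections import defaultdict

def bin_points(points, x_range, y_range, bins):
    """Aggregate (x, y) points into a bins x bins grid of counts."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    counts = defaultdict(int)
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # skip points outside the visible window
        # Clamp so that points exactly on the upper edge land in the last cell.
        col = min(int((x - x_min) / (x_max - x_min) * bins), bins - 1)
        row = min(int((y - y_min) / (y_max - y_min) * bins), bins - 1)
        counts[(row, col)] += 1
    return dict(counts)

points = [(0.1, 0.1), (0.15, 0.12), (0.9, 0.9), (1.0, 1.0), (2.0, 0.5)]
grid = bin_points(points, (0.0, 1.0), (0.0, 1.0), bins=2)
print(grid)  # {(0, 0): 2, (1, 1): 2}
```

The resulting grid could be passed to a heatmap in a visualization library such as Dash, and recomputed for the visible window whenever the user zooms.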

Technologies Used:

AI Models: Llama 3.1, Graphormer
Frameworks & Libraries: PyTorch, CUDA, Dash for visualization
Infrastructure: Multi-GPU distributed training

5. Process

Step 1: AI Model Development

Trained a multimodal LLM with Graphormer for molecule understanding.
Integrated text and molecular embeddings into a conversational AI assistant.
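Before a model like Graphormer can use a molecule, the structure has to be expressed as a graph with pairwise shortest-path distances, which drive its spatial encoding. The sketch below computes that distance matrix with breadth-first search for ethanol (SMILES "CCO"), using hand-written bonds between heavy atoms. In practice a cheminformatics toolkit would parse the SMILES string. The function name and the `unreachable` marker are hypothetical choices for this example.

```python
from collections import deque

def shortest_path_matrix(num_atoms, bonds, unreachable=-1):
    """All-pairs shortest-path lengths (in bonds) via BFS from every atom."""
    adjacency = [[] for _ in range(num_atoms)]
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    dist = [[unreachable] * num_atoms for _ in range(num_atoms)]
    for source in range(num_atoms):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if dist[source][w] == unreachable:  # not visited yet
                    dist[source][w] = dist[source][u] + 1
                    queue.append(w)
    return dist

# Ethanol, SMILES "CCO": heavy atoms C0 - C1 - O2
ethanol = shortest_path_matrix(3, [(0, 1), (1, 2)])
print(ethanol)  # [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```

Each entry then selects a learned bias that is added to the corresponding attention score, which is how the model sees molecular topology rather than a flat string.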

Step 2: Graphormer Performance Optimization

Developed custom CUDA kernels for faster attention computations.
Optimized training pipelines for distributed multi-GPU infrastructure.

Step 3: Dashboard & Data Visualization

Designed a chemistry-specific dashboard for experiment analysis.
Created interactive, real-time visualizations to compare in-silico experiments.

Key Experts Involved:

AI Researchers (LLM training, Graphormer integration)
Machine Learning Engineers (CUDA optimizations, training pipeline development)
UX/UI Designers (Dashboard & visualization design)
Chemistry Experts (Ensuring AI models align with real-world drug discovery needs)

6. Outcome

Quantitative Results

2-5x faster Graphormer attention (forward and backward passes) using custom CUDA kernels.
2x reduction in overall training time (50% less), shortening each experiment cycle.
Higher AI model accuracy in predicting chemical reactions.
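A 2-5x kernel speedup and a 2x overall speedup are different measurements, and Amdahl's law relates them: if a fraction p of training time is accelerated by a factor s, the overall speedup is 1 / ((1 - p) + p / s). The back-of-the-envelope sketch below asks what share of training time attention would need to occupy for the kernel speedup alone to halve total time. It assumes the kernels were the only source of improvement, which is a simplification, since the training pipeline was also tuned for multi-GPU infrastructure. The function names are hypothetical.

```python
def overall_speedup(fraction, kernel_speedup):
    """Amdahl's law: end-to-end speedup when only `fraction` of runtime is accelerated."""
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)

def required_fraction(target_overall, kernel_speedup):
    """Share of runtime that must be accelerated to reach `target_overall`."""
    return (1.0 - 1.0 / target_overall) / (1.0 - 1.0 / kernel_speedup)

# With a 5x kernel speedup, attention would need to be about 62.5% of runtime.
p_at_5x = required_fraction(2.0, 5.0)
# With only a 2x kernel speedup, attention would need to be all of the runtime.
p_at_2x = required_fraction(2.0, 2.0)

print(round(p_at_5x, 3), round(p_at_2x, 3))
print(round(overall_speedup(p_at_5x, 5.0), 3))  # recovers the 2x target
```

Under that simplifying assumption, the two figures are consistent only if attention dominated the original training time, which matches the description of attention as the main bottleneck.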

Qualitative Results

AI assistant adoption: Non-technical chemists can now use AI for predictions without coding.
Enhanced research efficiency: Faster insights from experiments with interactive dashboards.
Scalability improvements: Optimized infrastructure supports large-scale AI training.

Lessons Learned

Domain-specific AI models outperform generic LLMs for specialized tasks like chemistry.
Optimizing hardware-level computations (CUDA) significantly accelerates AI workloads.
User-friendly AI interfaces drive adoption among non-technical experts.

7. Summary

Final Thoughts

By combining a chemistry-aware multimodal LLM, custom CUDA kernels, and real-time visualizations, we helped the client halve model training time and made AI-driven property prediction accessible to chemists without coding expertise.