AI-Powered Cheminformatics

Derify develops state-of-the-art chemical language models for molecular property prediction, de novo drug generation, and efficient molecular information retrieval.

🤗 Explore on Hugging Face

Try the API

Cheminformatics

Foundation Models

Information Retrieval

Our Models

State-of-the-art chemical language models designed for molecular information retrieval and property prediction.

🧪

ModChemBERT

Foundation Model

ModernBERT as a Chemical Language Model

A ModernBERT-based chemical language model trained on SMILES strings with a multi-stage training pipeline. It achieves state-of-the-art performance on molecular property prediction benchmarks.

📐 768 hidden size

📏 256 max sequence length

💊 drug-like SMILES

ModernBERT

Molecular Property Prediction

Flash Attention

Apache 2.0

🔬

ChemMRL

Embedding Model

SMILES Matryoshka Representation Learning Embedding Transformer

A state-of-the-art sentence transformer for generating molecular embeddings using Matryoshka representation learning. Enables flexible embedding dimensions for efficient similarity search and molecular information retrieval.
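The flexible dimensions that Matryoshka representation learning provides are typically used by keeping only the leading components of an embedding and re-normalizing them. The sketch below illustrates that idea in plain Python. The vectors are made-up toy values rather than ChemMRL output, and `truncate_embedding` and `cosine` are hypothetical helpers, not part of any Derify library.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


# Toy 8-dimensional vectors standing in for full-size embeddings.
emb_a = [0.9, 0.1, 0.3, 0.2, 0.05, 0.02, 0.01, 0.03]
emb_b = [0.8, 0.2, 0.25, 0.1, 0.01, 0.04, 0.02, 0.05]

full = cosine(emb_a, emb_b)
small = cosine(truncate_embedding(emb_a, 4), truncate_embedding(emb_b, 4))
print(f"8-dim similarity: {full:.3f}, 4-dim similarity: {small:.3f}")
```

Shorter vectors cost less to store and compare, at the price of some ranking fidelity; how much is lost at a given dimension depends on the model and is not something this toy example measures.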

📐 1024 dimensions

📊 Tanimoto similarity

📏 512 max sequence length

Sentence Transformer

Embedding

Flash Attention

Apache 2.0
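Tanimoto similarity, listed among ChemMRL's features, is conventionally defined on molecular fingerprints as the number of shared set bits divided by the number of distinct set bits. A minimal sketch of that definition follows; the bit positions are hand-written and hypothetical, not derived from real molecules, and `tanimoto` is an illustrative helper.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) coefficient of two sets of 'on' bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints given as indices of set bits.
fp_query = {1, 4, 7, 9, 15}
fp_candidate = {1, 4, 8, 9, 20, 31}

print(tanimoto(fp_query, fp_candidate))  # 3 shared bits / 8 distinct bits = 0.375
```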

⚡

ChemRanker

Reranker Model

CrossEncoder for Molecular Reranking

A family of specialized CrossEncoder models that rerank candidate (document) SMILES against a query SMILES. Complements ChemMRL for efficient similarity search and molecular information retrieval.
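A reranker usually complements an embedding model in a two-stage pipeline: a cheap first pass shortlists candidates, and a more expensive pairwise scorer reorders the shortlist. The sketch below shows only that control flow. Both scoring functions are stand-ins (character-bigram overlap and a length-aware variant), not ChemMRL or ChemRanker, and every name in it is hypothetical.

```python
def bigrams(s):
    """Set of adjacent character pairs in a string."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def cheap_score(query, doc):
    """Stand-in for embedding similarity: bigram overlap (Jaccard)."""
    q, d = bigrams(query), bigrams(doc)
    return len(q & d) / len(q | d) if q | d else 0.0


def pairwise_score(query, doc):
    """Stand-in for a cross-encoder: overlap penalized by length difference."""
    return cheap_score(query, doc) / (1 + abs(len(query) - len(doc)))


def retrieve_then_rerank(query, docs, shortlist_size=3):
    # Stage 1: shortlist with the cheap score.
    shortlist = sorted(docs, key=lambda d: cheap_score(query, d), reverse=True)
    shortlist = shortlist[:shortlist_size]
    # Stage 2: reorder only the shortlist with the pairwise score.
    return sorted(shortlist, key=lambda d: pairwise_score(query, d), reverse=True)


corpus = ["CCO", "CCN", "CCCO", "c1ccccc1", "CC(=O)O", "CCOC"]
print(retrieve_then_rerank("CCO", corpus))
```

The point of the split is cost: the pairwise scorer runs only on the shortlist, so its expense grows with `shortlist_size` rather than with the size of the corpus.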

🎯 Reranking

📈 QED

📈 Similarity

🔍 IR pipeline

Cross Encoder

Reranker

Alpha

Flash Attention

Apache 2.0

Inference API

Empower your chemistry workflows with our high-performance Inference API. Generate precise molecular embeddings for similarity search, molecule reranking, and virtual screening using our state-of-the-art chemical language models.

Fast embedding generation
Batch processing support
Flexible quantization (int8, binary) for scalability
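To illustrate why int8 and binary precision help with scale, the sketch below quantizes a toy float vector in plain Python. It rests on assumptions: symmetric int8 scaling and sign-based binarization are common schemes, but the source does not say which scheme the API applies. All function names are hypothetical.

```python
def quantize_int8(vec):
    """Symmetric int8 quantization: map [-max_abs, max_abs] onto [-127, 127]."""
    max_abs = max(abs(x) for x in vec) or 1.0  # avoid dividing by zero
    scale = max_abs / 127
    return [round(x / scale) for x in vec], scale


def quantize_binary(vec):
    """Sign-based binarization packed into bytes, 8 dimensions per byte."""
    packed = bytearray()
    for start in range(0, len(vec), 8):
        byte = 0
        for offset, x in enumerate(vec[start:start + 8]):
            if x > 0:
                byte |= 1 << (7 - offset)  # most significant bit first
        packed.append(byte)
    return bytes(packed)


def hamming(a, b):
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


vec = [0.12, -0.40, 0.33, -0.05, 0.90, -0.71, 0.02, -0.18]
codes, scale = quantize_int8(vec)
bits = quantize_binary(vec)
print(codes, scale)
print(bits.hex())
```

As a rough size comparison, a 1024-dimensional float32 embedding occupies 4096 bytes, the same vector takes 1024 bytes as int8, and 128 bytes once packed as bits.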

Explore OpenAPI docs

main.py

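import requests

# Embedding endpoint for the ChemMRL model
url = "https://api.derifyai.com/v1/models/Derify/ChemMRL/embed"
headers = {
    "X-API-Key": "YOUR_API_KEY",  # replace with your own key
    "Content-Type": "application/json",
}
data = {
    "smiles": ["CCO", "CCN"],  # ethanol and ethylamine
    "normalize_embeddings": True,
    "precision": "float32",
}

# Send the request and fail loudly on HTTP errors
response = requests.post(url, headers=headers, json=data, timeout=30)
response.raise_for_status()
result = response.json()
print(result)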
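For larger inputs, the request above could be wrapped so that SMILES are sent in fixed-size chunks. This is a sketch under assumptions: the chunk size of 64 is arbitrary (the source does not state a batch limit), `chunked`, `embed_in_batches`, and `post_fn` are hypothetical names, and responses are collected unparsed because the response schema is not described here.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(smiles, post_fn, batch_size=64):
    """Call `post_fn(payload)` once per chunk and collect the raw responses."""
    responses = []
    for batch in chunked(smiles, batch_size):
        payload = {
            "smiles": batch,
            "normalize_embeddings": True,
            "precision": "float32",
        }
        responses.append(post_fn(payload))
    return responses


# Dry run with a fake sender that only reports how many SMILES it received.
fake_post = lambda payload: len(payload["smiles"])
print(embed_in_batches(["CCO"] * 150, fake_post))  # [64, 64, 22]
```

In real use, `post_fn` might be something like `lambda p: requests.post(url, headers=headers, json=p).json()`, reusing `url` and `headers` from the request above. Injecting the sender keeps the batching logic testable without network access.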

Input

Endpoint: Embed or Similarity

SMILES: one per line

Options: Normalize Embeddings, Precision, Truncate Dimension (leave empty to use the full model dimensions)
