Our Models
State-of-the-art chemical language models designed for molecular information retrieval and property prediction.
ModChemBERT
ModernBERT as a Chemical Language Model
View model card →
A ModernBERT-based chemical language model trained on SMILES strings using a multi-stage training pipeline. Achieves state-of-the-art performance on molecular property prediction benchmarks.
- 768 hidden size
- 256 max sequence length
- drug-like SMILES
ChemMRL
SMILES Matryoshka Representation Learning Embedding Transformer
View model card →
A state-of-the-art sentence transformer for generating molecular embeddings using Matryoshka representation learning. Enables flexible embedding dimensions for efficient similarity search and molecular information retrieval.
- 1024 dimensions
- Tanimoto similarity
- 512 max sequence length
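Matryoshka-style embeddings are meant to stay useful when only a leading slice of the vector is kept. The sketch below illustrates that idea with NumPy: it truncates each vector to its first `dim` components and re-normalizes so that dot products behave as cosine similarities. The random vectors are stand-ins for real 1024-dimensional model output, and `truncate_and_normalize` is a hypothetical helper name, not part of any ChemMRL API.

```python
import numpy as np


def truncate_and_normalize(embeddings: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components of each row, then L2-normalize the rows."""
    if not 1 <= dim <= embeddings.shape[1]:
        raise ValueError("dim must be between 1 and the full embedding size")
    truncated = embeddings[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    return truncated / np.clip(norms, 1e-12, None)


# Assumption: random data stands in for real 1024-dimensional embeddings.
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 1024))

small = truncate_and_normalize(full, 256)
similarity = small @ small.T  # cosine similarity matrix after normalization
print(small.shape)
```

Smaller slices reduce storage and search cost at some loss of accuracy; the right trade-off depends on the retrieval task.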
ChemRanker
CrossEncoder for Molecular Reranking
A family of specialized CrossEncoder models that rerank candidate (document) SMILES against a query SMILES. Complements ChemMRL for efficient similarity search and molecular information retrieval.
- Reranking
- QED
- Similarity
- IR pipeline
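A retrieve-then-rerank pipeline of the kind described above can be sketched in two stages: a fast embedding search narrows the corpus to a few candidates, then a slower pairwise scorer reorders them. Everything in this sketch is illustrative. `retrieve_then_rerank` and `toy_scorer` are hypothetical names, the random vectors stand in for embedding-model output, and the character-overlap scorer is only a placeholder for a real cross-encoder, not a chemical similarity measure.

```python
from typing import Callable, List, Tuple

import numpy as np


def retrieve_then_rerank(
    query: str,
    query_vec: np.ndarray,
    corpus: List[str],
    corpus_vecs: np.ndarray,
    pair_scorer: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    # Stage 1: cheap retrieval by dot product (cosine similarity for unit vectors).
    sims = corpus_vecs @ query_vec
    candidates = np.argsort(-sims)[:top_k]
    # Stage 2: score each (query, candidate) pair with the slower pairwise model.
    rescored = [(corpus[i], pair_scorer(query, corpus[i])) for i in candidates]
    return sorted(rescored, key=lambda item: item[1], reverse=True)


def toy_scorer(query: str, doc: str) -> float:
    """Placeholder scorer: character-set overlap, NOT a chemical similarity."""
    q, d = set(query), set(doc)
    return len(q & d) / len(q | d)


corpus = ["CCO", "CCN", "c1ccccc1", "CC(=O)O"]
rng = np.random.default_rng(0)
vecs = rng.normal(size=(4, 8))  # assumption: stand-ins for real embeddings
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)

ranked = retrieve_then_rerank("CCO", vecs[0], corpus, vecs, toy_scorer, top_k=3)
print(ranked)
```

The split keeps the expensive pairwise model off the full corpus: only `top_k` pairs are scored in the second stage.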
Inference API
Empower your chemistry workflows with our high-performance Inference API. Generate molecular embeddings for similarity search, molecular reranking, and virtual screening using our state-of-the-art chemical language models.
- Fast embedding generation
- Batch processing support
- Flexible quantization (int8, binary) for scalability
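To make the int8 and binary options above concrete, here is a minimal sketch of two common embedding quantization schemes. The API's actual scheme is not described on this page, so treat the per-dimension min/max scaling and the sign-based binarization as assumptions; the function names are hypothetical and the random matrix stands in for real embeddings.

```python
from typing import Tuple

import numpy as np


def quantize_int8(emb: np.ndarray) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Per-dimension scalar quantization; returns codes plus offset and scale."""
    lo = emb.min(axis=0)
    hi = emb.max(axis=0)
    # Avoid a zero scale for dimensions that are constant across the batch.
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    codes = np.round((emb - lo) / scale) - 128  # values in [-128, 127]
    return codes.astype(np.int8), lo, scale


def dequantize_int8(codes: np.ndarray, lo: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Approximate inverse of quantize_int8."""
    return (codes.astype(np.float64) + 128.0) * scale + lo


def quantize_binary(emb: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension, packed 8 dimensions per byte."""
    return np.packbits(emb > 0, axis=1)


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two packed binary codes."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Assumption: random data stands in for real 1024-dimensional embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 1024))

codes, lo, scale = quantize_int8(emb)
recon = dequantize_int8(codes, lo, scale)
bits = quantize_binary(emb)
print(codes.dtype, codes.shape, bits.dtype, bits.shape)
```

For a 1024-dimensional vector, float32 storage takes 4096 bytes, int8 takes 1024 bytes, and the packed binary code takes 128 bytes, which is where the scalability benefit comes from.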
Input
- Endpoint
- SMILES (one per line)
Options
- Normalize Embeddings
- Precision
- Truncate Dimension (leave empty to use full model dimensions)