Our Models

Specialized models for real-world tasks

Purpose-built models for reasoning, retrieval, and domain-specific applications. Access via API or deploy on your infrastructure.


production-ready

2.4B across all models

<100ms p95 response time

99.9% SLA guaranteed

Reasoning Models

Core reasoning and inference models for complex tasks

Engram-VQ

v2.1 · 300M · available

Recurrent model with fast/slow memory systems and surprise-driven meta-gating. Optimized for multi-step reasoning with efficient context utilization.

Multi-step reasoning · Context compression · Surprise gating

MMLU: 78.2% · GSM8K: 82.4% · HumanEval: 71.3%

/v1/models/engram-vq

Engram-VQ-Lite

v2.0 · 70M · available

Distilled version of Engram-VQ optimized for edge deployment and low-latency inference.

Low latency · Edge deployment · Mobile-ready

MMLU: 68.1% · GSM8K: 71.2%

/v1/models/engram-vq-lite

Diffusion Reasoner

v0.3 · 180M · research

Experimental model that performs reasoning through iterative diffusion-based refinement. Particularly strong on constraint satisfaction problems.

Iterative refinement · Constraint solving · Uncertainty quantification

ARC-Challenge: 74.8% · MATH: 45.2%

Retrieval & Embedding

Domain-specific embedding and retrieval models

Engram-Embed

v1.2 · 110M · available

General-purpose embedding model with strong performance on retrieval benchmarks. Supports documents up to 8K tokens.

8K context · Semantic search · Cross-lingual

MTEB: 67.4 · BEIR: 52.1

/v1/embeddings/engram-embed
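As a rough illustration of the semantic search use case, the sketch below ranks documents by cosine similarity between a query embedding and document embeddings. The vectors are toy 3-dimensional values invented for this example, not output from Engram-Embed, and the helper names `cosine_similarity` and `rank_documents` are hypothetical. In practice the vectors would come from the embedding endpoint.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings (assumption: real ones would be returned by the API)
doc_vecs = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [0.7, 0.7, 0.0],
}
query_vec = [1.0, 0.1, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

Cosine similarity is only one common choice. The page does not state which similarity measure the embedding models are tuned for.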

Engram-Embed-Clinical

v1.0 · 110M · available

Fine-tuned embedding model for clinical and biomedical literature. Trained on PubMed, clinical trials, and regulatory documents.

Medical terminology · Drug interactions · Clinical reasoning

PubMedQA: 78.3% · BioASQ: 72.1%

/v1/embeddings/engram-embed-clinical

Engram-Embed-Cyber

v1.0 · 110M · available

Specialized embedding model for cybersecurity domains. Understands CVEs, threat intelligence, and security advisories.

CVE parsing · Threat classification · IOC extraction

CyberBench: 81.2%

/v1/embeddings/engram-embed-cyber

Document Synthesis

Multi-document reasoning and synthesis models

Multi-Doc Reasoner

v1.1 · 450M · available

Cross-document synthesis with automatic citation tracking and conflict resolution. Handles up to 32 documents simultaneously.

Citation tracking · Conflict detection · 32-doc context

Multi-News: 42.3 ROUGE-L · QMSum: 38.7 ROUGE-L

/v1/models/multi-doc-reasoner
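For collections larger than the stated 32-document limit, one possible client-side approach is to split the input into batches before sending. The sketch below shows only the batching step. The helper name `batch_documents` is hypothetical, and how results from separate batches should be combined is not covered by this page.

```python
MAX_DOCS_PER_REQUEST = 32  # limit stated for Multi-Doc Reasoner

def batch_documents(documents, batch_size=MAX_DOCS_PER_REQUEST):
    """Split a list of documents into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        documents[start:start + batch_size]
        for start in range(0, len(documents), batch_size)
    ]

# Hypothetical corpus of 70 short documents
docs = [f"document {i}" for i in range(70)]
batches = batch_documents(docs)
print([len(batch) for batch in batches])
```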

Engram-Summarize

v1.0 · 220M · available

Abstractive summarization model with controllable length and style. Optimized for technical documents.

Length control · Style transfer · Technical docs

CNN/DM: 44.1 ROUGE-L · XSum: 24.8 ROUGE-L

/v1/models/engram-summarize

Domain-Specific

Models trained for specific verticals and use cases

Engram-Code

v0.9 · 350M · beta

Code generation and understanding model supporting 20+ programming languages. Specialized for refactoring and bug detection.

20+ languages · Bug detection · Refactoring

HumanEval: 68.2% · MBPP: 72.4%

/v1/models/engram-code

Engram-SQL

v1.0 · 85M · available

Natural language to SQL model with schema awareness. Supports complex joins and nested queries.

Schema-aware · Complex joins · Query optimization

Spider: 79.1% · BIRD: 58.3%

/v1/models/engram-sql

API Access

Simple, unified API

Access all models through a single API. Consistent interface across reasoning, embedding, and synthesis models with streaming support and comprehensive error handling.

Low Latency

Optimized inference with <100ms p95 response times

SDK Support

Official SDKs for Python, TypeScript, and Go

Usage Analytics

Real-time monitoring and usage dashboards

Self-Hosted

Deploy on your infrastructure with our container images
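To check the <100ms p95 figure against your own traffic, you can compute a p95 from latencies recorded on the client side. The sketch below uses the nearest-rank percentile method on made-up sample values. The function name and the samples are assumptions for illustration, and the page does not say how the published p95 is measured.

```python
import math

def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: smallest 1-based rank covering pct percent of the samples
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]

# Hypothetical request latencies in milliseconds
latencies_ms = [42, 55, 61, 48, 90, 73, 66, 58, 120, 51,
                47, 63, 70, 59, 64, 52, 49, 68, 77, 85]

p95 = percentile_nearest_rank(latencies_ms, 95)
meets_target = p95 < 100
print(f"p95 = {p95} ms, under 100 ms: {meets_target}")
```

In this sample one request took 120 ms, yet the p95 is still under the target, since a p95 tolerates the slowest 5% of requests.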

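from engram import Client

# Replace the placeholder with your own API key
client = Client(api_key="your-api-key")

# Generate with Engram-VQ
response = client.generate(
    model="engram-vq",
    prompt="Explain quantum entanglement",
    max_tokens=512,   # upper bound on generated tokens
    temperature=0.7,  # sampling temperature
)

# Print the generated text
print(response.text)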
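If you prefer not to use an SDK, the endpoint paths listed on the model cards can be called over plain HTTP. The sketch below only builds the request object and does not send it. The base URL, the Bearer authorization header, and the JSON payload shape (mirroring the SDK parameter names) are assumptions, so check the API docs for the actual values. Diffusion Reasoner is omitted because no endpoint is listed for it.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical placeholder host

# Endpoint paths as listed on the model cards
ENDPOINTS = {
    "engram-vq": "/v1/models/engram-vq",
    "engram-vq-lite": "/v1/models/engram-vq-lite",
    "engram-embed": "/v1/embeddings/engram-embed",
    "engram-embed-clinical": "/v1/embeddings/engram-embed-clinical",
    "engram-embed-cyber": "/v1/embeddings/engram-embed-cyber",
    "multi-doc-reasoner": "/v1/models/multi-doc-reasoner",
    "engram-summarize": "/v1/models/engram-summarize",
    "engram-code": "/v1/models/engram-code",
    "engram-sql": "/v1/models/engram-sql",
}

def build_request(model, payload, api_key):
    """Build (but do not send) a POST request for the given model."""
    if model not in ENDPOINTS:
        raise KeyError(f"no endpoint listed for model {model!r}")
    return Request(
        urljoin(BASE_URL, ENDPOINTS[model]),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Payload fields mirror the SDK example (assumed, not confirmed)
req = build_request(
    "engram-vq",
    {"prompt": "Explain quantum entanglement", "max_tokens": 512},
    api_key="your-api-key",
)
print(req.get_method(), req.full_url)
```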

Start building with Engram models

Request API access to start using our models in your applications. Free tier available for development and testing.
