State-of-the-Art Healthcare AI — Quick Start

State-of-the-Art LLMs for Healthcare, Free Forever.

OpenMed equips clinical engineering and data science teams with open-source LLMs, biomedical NER, and evaluation tooling to ship HIPAA-aware workflows faster—without vendor lock-in.

Launch production SageMaker endpoints in under five minutes with managed packages.
Reduce annotation and triage costs with 12 benchmark-proven entity extractors.
Meet compliance goals with Apache-2.0 licensing and transparent training data.

Get started

500+ HF models

Biomedical Categories

State-of-the-Art Healthcare AI Models

Production-ready, state-of-the-art LLMs for healthcare and clinical AI, trained on clinical and biomedical data. Coverage spans chemicals, diseases, genes/proteins, species, anatomy, oncology, and more.

Available on

Hugging Face (500+ models)

Python Package

Apache-2.0

Python Toolkit

A complete toolkit for clinical NLP workflows — from quick analysis to batch processing, with an interactive terminal UI for exploration.

Interactive TUI

Rich terminal interface for visual NER analysis. Model switching, config profiles, history, and export.

openmed tui

CLI Analysis

Analyze text from the command line. Output as JSON, CSV, or HTML for downstream processing.

openmed analyze --text "..."

Batch Processing

Process directories of clinical notes. Progress tracking, error handling, and aggregated results.

openmed batch --input-dir ./notes

Configuration Profiles

Built-in dev/prod/test profiles. Save custom configurations and switch instantly.

openmed config profile-use prod
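The batch workflow described above (walk a directory, track progress, isolate per-file errors, aggregate results) can be sketched in plain Python. This is a minimal illustration, not the `openmed batch` implementation: `process_directory` and `toy_analyze` are hypothetical names, and the toy analyzer stands in for a real NER model.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def process_directory(input_dir: Path, analyze: Callable[[str], list[dict]]) -> dict:
    """Run `analyze` on every .txt file and aggregate results and errors."""
    summary = {"files": 0, "entities": 0, "errors": {}, "results": {}}
    for path in sorted(input_dir.glob("*.txt")):
        summary["files"] += 1
        try:
            entities = analyze(path.read_text(encoding="utf-8"))
        except Exception as exc:
            # One bad note should not stop the batch; record it and move on.
            summary["errors"][path.name] = str(exc)
            continue
        summary["results"][path.name] = entities
        summary["entities"] += len(entities)
    return summary


def toy_analyze(text: str) -> list[dict]:
    """Hypothetical stand-in for a real NER model: flags one known drug name."""
    if not text.strip():
        raise ValueError("empty note")
    start = text.find("imatinib")
    if start < 0:
        return []
    return [{"text": "imatinib", "label": "DRUG", "start": start, "end": start + 8}]


def demo() -> dict:
    """Build three small notes in a temporary directory and process them."""
    with tempfile.TemporaryDirectory() as tmp:
        notes = Path(tmp)
        (notes / "a.txt").write_text("Started on imatinib.", encoding="utf-8")
        (notes / "b.txt").write_text("No medications.", encoding="utf-8")
        (notes / "c.txt").write_text("", encoding="utf-8")
        return process_directory(notes, toy_analyze)


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

Passing the analyzer in as a callable keeps the directory walk and error handling independent of whichever model is used.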

Interactive TUI Preview

┌──────────────────────────────────────────────────────────────────────────────┐
│ OpenMed TUI                                         Interactive Clinical NER │
├──────────────────────────────────────────────────────────────────────────────┤
│ ┌─ Input (Ctrl+Enter to analyze) ──────────────────────────────────────────┐ │
│ │ Patient diagnosed with chronic myeloid leukemia, started on imatinib.    │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
│ ┌─ Annotated ──────────────────────────────────────────────────────────────┐ │
│ │ Patient diagnosed with [chronic myeloid leukemia], started on [imatinib].│ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
│ ┌─ Entities (2) ───────────────────────────────────────────────────────────┐ │
│ │  Label     Entity                      Confidence                        │ │
│ │  DISEASE   chronic myeloid leukemia    ████████████████████ 0.98         │ │
│ │  DRUG      imatinib                    ███████████████████░ 0.95         │ │
│ └──────────────────────────────────────────────────────────────────────────┘ │
├──────────────────────────────────────────────────────────────────────────────┤
│ Model: default │ Profile: dev │ Thresh: 0.30 │ MedTok │ 23ms                 │
├──────────────────────────────────────────────────────────────────────────────┤
│ Ctrl+Enter Analyze  F2 Model  F3 Config  F4 Profile  F5 History  F6 Export   │
└──────────────────────────────────────────────────────────────────────────────┘

TUI Documentation, CLI Reference

Healthcare Data Privacy

Clinical Text De-Identification
Built for HIPAA & GDPR Compliance

De-identify clinical notes, medical records, and healthcare documents with context-aware AI models. Process data locally—your PHI never leaves your environment. Open-source, auditable, and production-ready.

18+ HIPAA Safe Harbor PHI Types

100% Local Processing

No Per-Token API Fees

Apache-2.0, Fully Open Source

Context-Aware PHI Detection

Presidio-inspired scoring system that boosts confidence when keywords like "SSN:", "DOB:", or "NPI:" appear near detected entities. Low base scores prevent false positives; context words confirm true PHI.

Keyword Boosting, 100-char Context Window
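The keyword-boosting idea can be illustrated with a small sketch: a bare pattern match starts with a low score, and a nearby context keyword raises it. The score values, the keyword list, and the choice to look 100 characters on each side of the match are assumptions made for this example, not OpenMed's actual parameters.

```python
import re
from dataclasses import dataclass

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CONTEXT_KEYWORDS = ("ssn", "social security")  # assumed keyword list
BASE_SCORE = 0.3      # assumed low prior for a bare pattern match
CONTEXT_BOOST = 0.55  # assumed boost when a keyword is nearby
WINDOW = 100          # characters inspected on each side of the match


@dataclass
class Match:
    text: str
    start: int
    end: int
    score: float


def score_ssn_candidates(text: str) -> list[Match]:
    """Score SSN-shaped strings, boosting those with a keyword in the window."""
    matches = []
    for m in SSN_PATTERN.finditer(text):
        window = text[max(0, m.start() - WINDOW): m.end() + WINDOW].lower()
        boosted = any(keyword in window for keyword in CONTEXT_KEYWORDS)
        score = min(1.0, BASE_SCORE + (CONTEXT_BOOST if boosted else 0.0))
        matches.append(Match(m.group(), m.start(), m.end(), score))
    return matches


if __name__ == "__main__":
    for sample in ("SSN: 123-45-6789", "Order ref 123-45-6789 shipped"):
        print(sample, "->", [round(m.score, 2) for m in score_ssn_candidates(sample)])
```

With a confidence threshold between the two scores, the labeled number is kept as PHI and the unlabeled order reference is dropped.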

Checksum & Format Validation

Built-in validators reduce false positives: the Luhn algorithm for credit card numbers and NPIs, SSN area-number rules, and US phone number format checks. Invalid matches receive reduced confidence scores.

Luhn Checksum, SSN Validation, NPI Verification
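For reference, the checks named above are short to express. The sketch below implements the standard Luhn checksum, the NPI variant (which runs Luhn over the 10-digit NPI prefixed with 80840), and the publicly documented SSN rules (area not 000, 666, or 900-999; group not 00; serial not 0000). The function names are illustrative and are not taken from the OpenMed package.

```python
def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def npi_valid(npi: str) -> bool:
    """NPIs are 10 digits; the checksum is computed with the 80840 prefix."""
    return len(npi) == 10 and luhn_valid("80840" + npi)


def ssn_plausible(ssn: str) -> bool:
    """Reject SSNs whose area, group, or serial number is never issued."""
    parts = ssn.split("-")
    if [len(p) for p in parts] != [3, 2, 4]:
        return False
    if not all(p.isascii() and p.isdigit() for p in parts):
        return False
    area, group, serial = (int(p) for p in parts)
    return area not in (0, 666) and area < 900 and group != 0 and serial != 0


if __name__ == "__main__":
    print(npi_valid("1234567893"), luhn_valid("4111111111111111"))
    print(ssn_plausible("123-45-6789"), ssn_plausible("000-12-3456"))
```

A detector could multiply a candidate's confidence by a penalty factor when one of these checks fails, rather than discarding the match outright.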

Smart Entity Merging

Fixes tokenization fragmentation automatically. Subword tokenizers can split "123-45-6789" into fragments; semantic patterns reassemble complete SSNs, phone numbers, and multi-word entities.

BIO Tag Handling, Regex Patterns
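One simple way to picture fragment merging: adjacent spans with the same label are joined when the text between them contains only separator characters. The sketch below is a simplified stand-in under that assumption. The `Span` type, the joiner set, and the choice to keep the lowest fragment score are illustrative, and the regex-based semantic patterns mentioned above are not reproduced.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: int
    end: int
    label: str
    score: float


def merge_fragments(text: str, spans: list[Span], joiners: str = "-. ") -> list[Span]:
    """Merge same-label spans that touch, overlap, or are split only by joiners."""
    merged: list[Span] = []
    for span in sorted(spans, key=lambda s: s.start):
        if merged:
            prev = merged[-1]
            gap = text[prev.end:span.start]  # empty when spans touch or overlap
            if span.label == prev.label and all(c in joiners for c in gap):
                merged[-1] = Span(
                    prev.start,
                    max(prev.end, span.end),
                    prev.label,
                    min(prev.score, span.score),  # assumed: keep the weakest score
                )
                continue
        merged.append(span)
    return merged


if __name__ == "__main__":
    note = "SSN: 123-45-6789 for John Smith"
    fragments = [
        Span(5, 8, "SSN", 0.90), Span(9, 11, "SSN", 0.80), Span(12, 16, "SSN", 0.85),
        Span(21, 25, "NAME", 0.97), Span(26, 31, "NAME", 0.93),
    ]
    for s in merge_fragments(note, fragments):
        print(s.label, repr(note[s.start:s.end]), s.score)
```

Treating a space as a joiner is what lets "John" and "Smith" merge into one name; a stricter joiner set would be needed where two distinct same-label entities can sit side by side.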

Zero Data Movement

Process PHI entirely on your infrastructure. No API calls to external services. Models run locally on CPU or GPU. Your clinical data never leaves your secure environment.

Air-Gapped Ready, On-Premises

Flexible Redaction Methods

Choose your de-identification strategy: mask with entity type labels [NAME], redact completely, or replace with synthetic data. Configurable confidence thresholds for precision control.

Masking, Redaction, Replacement
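The three strategies differ only in what gets substituted for each detected span. The sketch below shows one way to apply them with a confidence threshold. The function name, the tuple layout, the `[REDACTED]` marker, and the synthetic values are assumptions for illustration and do not describe OpenMed's `deidentify` internals.

```python
from typing import Optional

Span = tuple[int, int, str, float]  # (start, end, label, confidence)

# Illustrative surrogate values only; a real system would generate these.
SYNTHETIC_DEFAULTS = {"NAME": "Jane Doe", "SSN": "000-00-0000"}


def apply_redaction(
    text: str,
    spans: list[Span],
    method: str = "mask",
    threshold: float = 0.5,
    synthetic: Optional[dict[str, str]] = None,
) -> str:
    """Substitute every span at or above `threshold` using the chosen method."""
    if method not in {"mask", "redact", "replace"}:
        raise ValueError(f"unknown method: {method}")
    surrogates = synthetic or SYNTHETIC_DEFAULTS
    kept = [s for s in spans if s[3] >= threshold]
    # Work right to left so earlier offsets stay valid after each substitution.
    for start, end, label, _score in sorted(kept, key=lambda s: s[0], reverse=True):
        if method == "mask":
            filler = f"[{label}]"
        elif method == "redact":
            filler = "[REDACTED]"
        else:
            filler = surrogates.get(label, f"[{label}]")
        text = text[:start] + filler + text[end:]
    return text


if __name__ == "__main__":
    note = "Patient John Smith, SSN 123-45-6789."
    found = [(8, 18, "NAME", 0.95), (24, 35, "SSN", 0.40)]
    for chosen in ("mask", "redact", "replace"):
        print(chosen, "->", apply_redaction(note, found, chosen, threshold=0.3))
```

Raising the threshold trades recall for precision: at 0.5 the low-confidence SSN in this example would be left in place, which is why threshold choice matters for PHI.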

HIPAA Safe Harbor Ready

Detects all 18 HIPAA Safe Harbor identifiers: names, dates, SSNs, phone numbers, emails, addresses, MRNs, account numbers, NPIs, IP addresses, URLs, and more.

18+ Entity Types, GDPR Compatible

PII De-identification Example

from openmed.core.pii import extract_pii, deidentify

# Extract PII with context-aware scoring
text = """Patient: John Smith, DOB: 01/15/1980
SSN: 123-45-6789, Phone: (555) 234-5678
Email: john.smith@email.com
Provider NPI: 1234567893"""

result = extract_pii(text, use_smart_merging=True)

# View detected entities with confidence scores
for entity in result.entities:
    print(f"{entity.text} → {entity.label} ({entity.confidence:.2f})")

# De-identify the text
deid_result = deidentify(text, method="mask")
print(deid_result.deidentified_text)
# Output:
# Patient: [FIRST_NAME] [LAST_NAME], DOB: [DATE_OF_BIRTH]
# SSN: [SSN], Phone: [PHONE_NUMBER]
# Email: [EMAIL]
# Provider NPI: [NPI]

Browse All 500+ Models

AWS SageMaker Deployment

Production-ready healthcare AI models deployed on AWS SageMaker with enterprise-grade security and scalability.

Enterprise Deployment

Scalable cloud infrastructure with automatic provisioning and load balancing

High Performance

Optimized inference endpoints with sub-100ms latency for real-time applications

Security & Compliance

HIPAA-compliant infrastructure with end-to-end encryption and audit trails
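Once an endpoint is running, a real-time call might look like the sketch below. `invoke_endpoint` and its `EndpointName`, `ContentType`, `Accept`, and `Body` parameters are part of the boto3 SageMaker Runtime client. The `{"inputs": ...}` payload shape, the response fields, and the endpoint name are assumptions, so check the model package's documentation for the actual contract. `FakeRuntimeClient` is a hypothetical offline stand-in that lets the example run without AWS credentials.

```python
import io
import json


def invoke_ner_endpoint(client, endpoint_name: str, text: str):
    """Send one note to a real-time endpoint and decode the JSON response."""
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Accept="application/json",
        Body=json.dumps({"inputs": text}).encode("utf-8"),  # assumed payload shape
    )
    return json.loads(response["Body"].read())


class FakeRuntimeClient:
    """Offline stand-in that mimics the response shape of invoke_endpoint."""

    def invoke_endpoint(self, **kwargs):
        payload = json.loads(kwargs["Body"])
        found = []
        if "imatinib" in payload["inputs"]:
            # Assumed response fields, for illustration only.
            found.append({"word": "imatinib", "entity_group": "DRUG", "score": 0.95})
        return {"Body": io.BytesIO(json.dumps(found).encode("utf-8"))}


if __name__ == "__main__":
    # With AWS credentials, swap in the real client:
    #   import boto3
    #   client = boto3.client("sagemaker-runtime", region_name="us-east-1")
    client = FakeRuntimeClient()
    print(invoke_ner_endpoint(client, "example-endpoint", "Started on imatinib."))
```

Passing the client in as an argument keeps the function testable and makes region and credential handling the caller's decision.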

OpenMed NER Genome Detection Tiny

Model Package, Amazon SageMaker

Open-source gene entity recognition tuned on BC2GM for precision genomics workflows. Deploy secure inference endpoints in under five minutes using the managed marketplace package.

Regions: us-east-1, us-east-2, us-west-1, us-west-2, eu-central-1 and more

Gene NER, BC2GM dataset, Clinical genomics

Fulfillment: Marketplace-managed SageMaker model package

View Marketplace Listing

OpenMed NER Species Detection

Pathogen NER, Marketplace ready

Identifies organism mentions in clinical narratives to accelerate biosurveillance, antimicrobial stewardship and microbiome research pipelines.

Best for: Epidemiology pipelines, infectious disease dashboards, lab automation

Species NER, Microbiology, Batch & real-time

Quickstart: Use the SageMaker sample notebook to deploy and monitor endpoints.

Find on AWS Marketplace

SageMaker Starter Notebooks

GitHub Examples repo

Hands-on notebooks for deployment, scaling, monitoring and cost optimization across every OpenMed marketplace model.

Includes: JumpStart templates, marketplace entitlement helpers, automation scripts

Notebook-ready, Step-by-step, Monitoring

Coverage: Batch transform, real-time endpoints, cost governance

Browse Notebooks

OpenMed on AWS Marketplace

Research & Publications

Peer-reviewed research advancing the state of healthcare AI with rigorous scientific methodology and reproducible results.

OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets

Maziyar Panahi, 2025. arXiv:2508.01630

Abstract

Named-entity recognition (NER) is fundamental to extracting structured information from the >80% of healthcare data that resides in unstructured clinical notes and biomedical literature. Despite recent advances with large language models, achieving state-of-the-art performance across diverse entity types while maintaining computational efficiency remains a significant challenge. We introduce OpenMed NER, a suite of open-source, domain-adapted transformer models that combine lightweight domain-adaptive pre-training (DAPT) with parameter-efficient Low-Rank Adaptation (LoRA). Our approach performs cost-effective DAPT on a 350k-passage corpus compiled from ethically sourced, publicly available research repositories and de-identified clinical notes (PubMed, arXiv, and MIMIC-III) using DeBERTa-v3, PubMedBERT, and BioELECTRA backbones. This is followed by task-specific fine-tuning with LoRA, which updates less than 1.5% of model parameters. We evaluate our models on 12 established biomedical NER benchmarks spanning chemicals, diseases, genes, and species. OpenMed NER achieves new state-of-the-art micro-F1 scores on 10 of these 12 datasets, with substantial gains across diverse entity types. Our models advance the state-of-the-art on foundational disease and chemical benchmarks (e.g., BC5CDR-Disease, +2.70 pp), while delivering even larger improvements of over 5.3 and 9.7 percentage points on more specialized gene and clinical cell line corpora. This work demonstrates that strategically adapted open-source models can surpass closed-source solutions. This performance is achieved with remarkable efficiency: training completes in under 12 hours on a single GPU with a low carbon footprint (< 1.2 kg CO2e), producing permissively licensed, open-source checkpoints designed to help practitioners facilitate compliance with emerging data protection and AI regulations, such as the EU AI Act.
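For readers less familiar with the metric reported in the abstract, micro-F1 pools true positives, false positives, and false negatives across all documents before computing precision and recall. The sketch below uses exact-match entity tuples and toy data. It illustrates the metric only and is not the paper's evaluation code.

```python
def micro_f1(gold: list[set], predicted: list[set]) -> float:
    """Micro-averaged F1 over per-document sets of (start, end, label) entities."""
    tp = fp = fn = 0
    for gold_entities, predicted_entities in zip(gold, predicted):
        tp += len(gold_entities & predicted_entities)  # exact matches
        fp += len(predicted_entities - gold_entities)  # spurious predictions
        fn += len(gold_entities - predicted_entities)  # missed gold entities
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy annotations for two documents (not real benchmark data).
    gold = [{(0, 3, "DISEASE")}, {(5, 9, "CHEM"), (12, 15, "CHEM")}]
    predicted = [{(0, 3, "DISEASE"), (4, 6, "CHEM")}, {(5, 9, "CHEM")}]
    print(f"micro-F1 = {micro_f1(gold, predicted):.3f}")
```

Because counts are pooled before averaging, frequent entity types weigh more heavily than rare ones.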

Key Achievements

10/12 SOTA Benchmarks

+9.7pp Max Improvement

<12h Training Time

<1.2kg CO2e Footprint

Read on Hugging Face, Download PDF, View Models

Author

Maziyar Panahi

Healthcare AI FAQ

Answers to the most common questions from clinical innovation, ML, and compliance teams evaluating OpenMed for production NLP and decision support workloads.

What makes OpenMed models production-ready for healthcare?

Each model ships with benchmark results across 12 biomedical NER datasets, guardrail evaluations, and Apache-2.0 licensing, so teams can satisfy procurement reviews and accelerate go-live timelines.

How do I deploy OpenMed in the cloud or on-premises?

Use the AWS Marketplace model packages and JumpStart notebooks for guided SageMaker deployment, or pull the Docker images and Python package to host inside your own VPC or on hospital infrastructure.

Does OpenMed support HIPAA-aligned workflows?

Yes. Models are trained on de-identified, ethically sourced corpora and support private deployment. When run on SageMaker or in self-managed environments, they integrate with audit logging, encryption, and access controls.

Can we fine-tune OpenMed LLMs for custom vocabularies?

Absolutely. Lightweight LoRA adapters, curated tokenizers, and starter notebooks are provided so you can extend entity coverage to local ontologies or multi-lingual records while keeping compute requirements modest.

About OpenMed - Leading Healthcare AI Innovation

Founded by Maziyar Panahi, OpenMed is a community-driven, non-profit effort to democratize state-of-the-art LLMs for healthcare and make powerful clinical AI freely available. All healthcare AI models are released under Apache-2.0 with practical demos and deployment recipes for medical and clinical applications.

Transparent benchmarking & reproducible training
Privacy-minded, on-prem & cloud-friendly
Ecosystem: Hugging Face org, Spaces, GitHub starter

Hugging Face, GitHub, Model Discovery

Press & Community

For interviews, speaking, or partnerships, reach out and we'll get back quickly.

No email; reach out via LinkedIn, X (Twitter), or Hugging Face.