OpenMed 1.1.0 · Portuguese PII · Python MLX · OpenMedKit

Clinical AI that never leaves the device.

1,000+ state-of-the-art healthcare NLP models, 30+ new Portuguese PII models, and 25+ curated datasets — one local-first runtime story across Python MLX on Apple Silicon and Swift-native OpenMedKit, Apache-2.0 end to end.

Apple MLX · 1,000+ models · On-device · Apache-2.0
1,000+
HF Models
25+
Curated Datasets
55+
PII Entities
9
Languages
<1.2 kg
CO₂e Training
The runtime story

One MLX story across Python and Swift.

Prototype with Python MLX on Apple Silicon, then ship the same local-first PII and clinical NLP experience inside native Swift apps with OpenMedKit — same artifacts, same behavior.

Python MLX

Accelerated local workflows on Apple Silicon

Install openmed[mlx] to run local inference, PII extraction, and benchmark workflows with Apple-native MLX acceleration on Mac.

Read the MLX backend guide →
OpenMedKit

Swift-native PII and clinical NLP for Apple apps

Bring detection, smart entity merging, and local model execution into macOS, iOS, and iPadOS apps without sending PHI off device.

Explore OpenMedKit docs →
Shared MLX artifacts

Move from notebooks to native apps with less friction

The same MLX artifacts power the Python and Swift paths — prototype in a notebook, then ship the identical model into a native app without a second packaging track.

See the shared artifact story →
Healthcare data privacy

Clinical text de-identification, built for HIPAA & GDPR.

Language-aware PII models across nine languages. Process data locally — your PHI never leaves your environment.

Context-aware PHI detection

Presidio-inspired scoring boosts confidence when keywords like SSN:, DOB:, or NPI: appear near detected entities.

Keyword boosting 100-char window
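A minimal sketch of how this kind of context boosting can work. The keyword table, weights, and `boost_score` helper below are illustrative assumptions, not the OpenMed implementation: a trigger keyword found inside the 100-character window before a detected span raises that span's confidence.

```python
import re

# Hypothetical sketch of Presidio-style context boosting. Keyword map,
# window size, and boost weight are illustrative, not OpenMed's values.
CONTEXT_KEYWORDS = {"ssn": "US_SSN", "dob": "DATE_OF_BIRTH", "npi": "NPI"}
WINDOW = 100   # characters of left context to inspect
BOOST = 0.35   # added to the model's confidence on a keyword hit

def boost_score(text: str, start: int, base_score: float, entity_type: str) -> float:
    """Return base_score, boosted if a matching keyword precedes the span."""
    window = text[max(0, start - WINDOW):start].lower()
    for keyword, etype in CONTEXT_KEYWORDS.items():
        if etype == entity_type and re.search(rf"\b{keyword}\b", window):
            return min(1.0, base_score + BOOST)
    return base_score

text = "SSN: 123-45-6789"
print(round(boost_score(text, text.index("123"), 0.55, "US_SSN"), 2))  # 0.9
```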

Checksum & format validation

Built-in validators reduce false positives: French NIR, German Steuer-ID, Italian Codice Fiscale, Spanish DNI/NIE, Dutch BSN, Brazilian CPF/CNPJ, Luhn.

NIR Steuer-ID Codice Fiscale CPF / CNPJ
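To make the idea concrete, here is the Luhn check in full — a stand-in for the whole validator family, since the locale validators (NIR, Steuer-ID, and the rest) follow the same pattern of cheap arithmetic that vetoes a regex match before it becomes a false positive:

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right,
    subtract 9 from any result above 9, and require sum % 10 == 0."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

print(luhn_valid("4539 1488 0343 6467"))  # well-known test number -> True
print(luhn_valid("4539 1488 0343 6468"))  # last digit changed -> False
```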

Smart entity merging

Subword tokenizers split 123-45-6789 into fragments — semantic patterns reassemble complete SSNs, phones, and multi-word entities.

BIO tags Regex patterns
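The reassembly step can be sketched like this — an illustrative merger, not the OpenMed one: contiguous B-/I- tagged fragments are joined into one span, then a format pattern confirms the merged value is a complete SSN.

```python
import re

# Illustrative sketch: merge contiguous B-/I- tagged subword fragments,
# then validate the reassembled span against a format regex.
SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")

def merge_bio(tokens, tags):
    """Join B-/I- tagged token runs into whole entity strings."""
    spans, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                spans.append("".join(current))
            current = [tok]
        elif tag.startswith("I-") and current:
            current.append(tok)
        else:
            if current:
                spans.append("".join(current))
            current = []
    if current:
        spans.append("".join(current))
    return spans

tokens = ["123", "-", "45", "-", "6789"]
tags = ["B-SSN", "I-SSN", "I-SSN", "I-SSN", "I-SSN"]
merged = merge_bio(tokens, tags)
print([s for s in merged if SSN_RE.match(s)])  # ['123-45-6789']
```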

Zero data movement

Process PHI entirely on your infrastructure. No API calls to external services. Your clinical data never leaves your secure environment.

Air-gapped On-premises

Flexible redaction methods

Mask with entity-type labels [NAME], redact completely, or replace with synthetic data. Configurable thresholds for precision control.

Mask Redact Replace
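A minimal sketch of the three modes, assuming a toy entity dict — the shipped OpenMed API and entity objects will differ:

```python
# Hypothetical redaction sketch: mask with a type label, delete outright,
# or substitute synthetic data. Entity shape is illustrative.
def apply_redaction(text, entities, mode="mask", synthetic=None):
    out = text
    # Walk right-to-left so earlier offsets stay valid after each edit.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        if mode == "mask":
            repl = f"[{ent['type']}]"
        elif mode == "redact":
            repl = ""
        else:  # "replace" with caller-supplied synthetic values
            repl = synthetic[ent["type"]]
        out = out[:ent["start"]] + repl + out[ent["end"]:]
    return out

text = "Patient is Jane Doe."
entities = [{"start": 11, "end": 19, "type": "NAME"}]
print(apply_redaction(text, entities, "mask"))  # Patient is [NAME].
print(apply_redaction(text, entities, "replace", {"NAME": "Alex Smith"}))
```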

HIPAA Safe Harbor ready

Detects all 18 HIPAA Safe Harbor identifiers — part of a wider 55+ PII entity catalog — across English, Spanish, French, German, Italian, Dutch, Hindi, Telugu, and Portuguese clinical text.

18 Safe Harbor 55+ entity types 9 languages
Quickstart

Four lines to production.

Composable Python APIs for notebooks and services. Same call shape across local MLX, CPU, and cloud.

  • analyze_text(...)
    One-call inference with structured outputs
  • extract_pii / deidentify
    Language-aware across 9 languages
  • BatchProcessor(...)
    Progress callbacks, per-item results
  • OpenMedConfig.from_profile(...)
    dev / prod / test profiles
Read full API guide
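The calls above, sketched as a script. The entry-point names come from this page, but treat the exact signatures as assumptions — check the API guide against the shipped openmed 1.1.0 package before copying.

```python
# Hypothetical quickstart; names are from the docs above, signatures are
# assumptions, not the verified openmed 1.1.0 API.
from openmed import OpenMedConfig, analyze_text, extract_pii

config = OpenMedConfig.from_profile("dev")             # dev / prod / test
result = analyze_text("Metformin 500 mg for T2DM.")    # structured entities
pii = extract_pii("Patient Jane Doe, SSN: 123-45-6789", language="en")
```

The same shape is meant to hold whether the backend is local MLX, CPU, or cloud; `BatchProcessor` wraps the same calls with progress callbacks and per-item results.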
openmed 1.1.0 · quickstart
Paper · arXiv:2508.01630 · since 2025

Started as a SOTA NER paper. Grew into the catalog.

The original work — domain-adaptive pre-training with parameter-efficient LoRA on 350k biomedical passages — set state-of-the-art on 10 of 12 NER benchmarks. Since then, the catalog has expanded into multilingual PII, Apple-native MLX variants, and curated datasets across the broader OpenMed collection.

25+
Curated Datasets
MIMIC-III, PubMed, BC5CDR, BC2GM, JNLPBA…
210
PII Models
English, Spanish, French, German, Italian, Dutch, Hindi, Telugu, Portuguese
55+
PII Entity Types
Locale-aware: SSN, NIR, Steuer-ID, CF, DNI, BSN, CPF/CNPJ…
200+
MLX Variants
Apple Silicon ready, BERT-family supported
Healthcare AI FAQ

Questions from the clinical, ML, and compliance teams.

If we haven't answered yours, reach out — replies come from the people who train the models.

What is OpenMed, exactly?

An open-source medical NLP toolkit. Specialized transformer models fine-tuned for biomedical named-entity recognition — diseases, drugs, genes, anatomy, chemicals, oncology — plus PII extraction and de-identification across nine languages. Ships as a Python package, a Dockerized FastAPI service, and a Swift package (OpenMedKit) for macOS and iOS apps. Apache-2.0, no vendor lock-in, runs on your infrastructure.

Are these generative LLMs or something else?

Encoder transformers (BERT, ELECTRA, DeBERTa families), not generative chat models. They do extraction and classification — pulling structured entities out of unstructured clinical text — and stay small enough to run on a laptop or a phone. The paper (arXiv:2508.01630) reports new state-of-the-art on 10 of 12 biomedical NER benchmarks. Think of them as complementary to the larger generative "medical LLM" category, not a replacement.

Where does my data go when I use OpenMed?

Nowhere you don't send it. You download the models once (Hugging Face or a private mirror) and inference runs wherever you run the Python process, the Docker container, or the Swift app — your laptop, your VPC, an on-prem server, or air-gapped hardware. No telemetry, no license check-in, no outbound calls at runtime. PHI stays on your side of the fence by default.

Does OpenMed support HIPAA-aligned workflows?

The PII catalog covers all 18 HIPAA Safe Harbor identifiers across English, Spanish, French, German, Italian, Dutch, Hindi, Telugu, and Portuguese — with locale-aware validators for SSN, NIR, Steuer-ID, Codice Fiscale, DNI, BSN, CPF/CNPJ, and Luhn checks. Models are trained on de-identified, ethically sourced corpora. OpenMed provides the technical controls (on-device processing, configurable thresholds, multiple redaction methods); the legal compliance boundary lives in your deployment.

Can I fine-tune OpenMed models for my own vocabulary?

Yes. Models are published on Hugging Face under permissive licensing with full training recipes. The reference approach combines lightweight domain-adaptive pre-training on a 350k-passage biomedical corpus with parameter-efficient LoRA fine-tuning — updating less than 1.5% of model parameters and completing in under 12 hours on a single GPU (<1.2 kg CO₂e). Tokenizer assets and starter notebooks are in the openmed-starter repo.
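The "less than 1.5%" figure is easy to sanity-check from LoRA's low-rank update (W + BA). A back-of-envelope count for a BERT-base-like model, with illustrative numbers that are not OpenMed's exact recipe:

```python
# Back-of-envelope LoRA parameter count, assuming a BERT-base-like model
# (~110M params) with rank-16 adapters on four attention projections per
# layer. Numbers are illustrative, not OpenMed's published configuration.
d_model, rank, layers = 768, 16, 12
adapted_mats_per_layer = 4                 # q, k, v, output projections
lora_params_per_mat = 2 * d_model * rank   # A (r x d) plus B (d x r)
lora_total = layers * adapted_mats_per_layer * lora_params_per_mat
base_total = 110_000_000

print(lora_total)                          # 1179648 trainable parameters
print(f"{lora_total / base_total:.2%}")    # 1.07%, under the 1.5% ceiling
```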

Do I need OpenMed Agent to use OpenMed?

No. The Python toolkit, OpenMedKit (Swift), and MLX backend are fully self-contained and Apache-2.0 — you can build and ship without ever touching the Agent. OpenMed Agent, currently in preview, is a separate terminal-native runner for clinical workflows built on top of the same stack.