MLX Backend (Apple Silicon)
OpenMed v1.0.0 introduces native Apple Silicon acceleration via Apple MLX. On Macs with M1/M2/M3/M4 chips, NER and PII inference runs directly on the GPU — up to 10x faster than CPU-only PyTorch.
Installation
Installing OpenMed with MLX support pulls in `mlx`, `huggingface-hub`, `tokenizers`, and `safetensors`.
Quick Start
```python
from openmed import analyze_text
from openmed.core.config import OpenMedConfig

# MLX is auto-detected on Apple Silicon; no config needed
result = analyze_text(
    "Patient John Doe, DOB 1990-05-15, SSN 123-45-6789",
    model_name="pii_detection",
)
print(result.entities)
```
To force a specific backend:
```python
config = OpenMedConfig(backend="mlx")  # Force MLX
config = OpenMedConfig(backend="hf")   # Force HuggingFace/PyTorch
config = OpenMedConfig(backend=None)   # Auto-detect (default)
```
How It Works
- Auto-detection: On Apple Silicon Macs with `mlx` installed, OpenMed automatically selects the MLX backend.
- On-the-fly conversion: The first time you use a model with MLX, it is automatically converted from HuggingFace format and cached in `~/.cache/openmed/mlx/`.
- Identical output: MLX produces the same entity format as the HuggingFace backend, so all downstream processing (entity merging, quality gates, PII detection) works identically.
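The auto-detection step can be pictured as a platform check combined with an import probe. The following is a minimal sketch under that assumption, not OpenMed's actual implementation; `detect_backend` and its parameters are hypothetical names used only for illustration.

```python
import importlib.util
import platform


def detect_backend(system=None, machine=None, mlx_installed=None):
    """Return "mlx" on Apple Silicon with mlx installed, otherwise "hf".

    Hypothetical helper: each argument defaults to a value probed from the
    running interpreter and can be overridden for testing.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    if mlx_installed is None:
        # find_spec returns None when the top-level package is not importable.
        mlx_installed = importlib.util.find_spec("mlx") is not None

    # A native Python build on an M-series Mac reports Darwin / arm64.
    is_apple_silicon = system == "Darwin" and machine == "arm64"
    return "mlx" if is_apple_silicon and mlx_installed else "hf"


print(detect_backend())
```

The returned strings match the `backend` values shown for `OpenMedConfig`, so a result like this could be passed along when a backend needs to be chosen explicitly.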
Model Conversion
Automatic (recommended)
Models are converted automatically on first use. No manual step needed.
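Convert-on-first-use is a standard caching pattern: look for a converted copy, and run the conversion only when it is missing. The sketch below illustrates the idea under stated assumptions. The cache root comes from this page, but the per-model directory naming, the `ensure_converted` helper, and the `convert` callback are all hypothetical and do not describe OpenMed's internals.

```python
from pathlib import Path

# Cache root as documented on this page.
CACHE_ROOT = Path.home() / ".cache" / "openmed" / "mlx"


def cached_model_dir(model_id, cache_root=CACHE_ROOT):
    """Map a model ID to a cache directory (hypothetical naming scheme)."""
    return Path(cache_root) / model_id.replace("/", "--")


def ensure_converted(model_id, convert, cache_root=CACHE_ROOT):
    """Run convert(model_id, target_dir) only if no cached weights exist."""
    target = cached_model_dir(model_id, cache_root)
    if not (target / "weights.npz").exists():
        target.mkdir(parents=True, exist_ok=True)
        convert(model_id, target)  # hypothetical conversion callback
    return target
```

Calling `ensure_converted` twice with the same model ID invokes the callback only once; the second call finds `weights.npz` and returns immediately.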
Manual conversion
For pre-converting models (e.g., for offline deployment):
```bash
python -m openmed.mlx.convert \
  --model OpenMed/OpenMed-PII-SuperClinical-Small-44M-v1 \
  --output ./mlx-models/pii-small
```
With 8-bit quantization (reduces model size by ~4x):
```bash
python -m openmed.mlx.convert \
  --model OpenMed/OpenMed-PII-SuperClinical-Small-44M-v1 \
  --output ./mlx-models/pii-small-q8 \
  --quantize 8
```
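The ~4x figure follows from simple arithmetic if the original weights are stored as 32-bit floats, which is an assumption here rather than something this page states. The sketch below works it through for the 44M-parameter model; it ignores quantization scales and any layers left unquantized, so the real saving would be somewhat smaller.

```python
def weight_size_mb(num_params, bits_per_weight):
    """Approximate size of the raw weights in megabytes (10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6


params = 44_000_000  # parameter count of the 44M model
fp32_mb = weight_size_mb(params, 32)  # assumed float32 baseline
q8_mb = weight_size_mb(params, 8)     # 8-bit quantized weights

print(f"float32: {fp32_mb:.0f} MB, 8-bit: {q8_mb:.0f} MB, ratio: {fp32_mb / q8_mb:.0f}x")
# prints "float32: 176 MB, 8-bit: 44 MB, ratio: 4x"
```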
The output directory contains:
- `weights.npz`: Model weights in MLX/NumPy format
- `config.json`: Model architecture configuration
- `id2label.json`: Label ID to entity name mapping
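A converted directory can be sanity-checked with the standard library alone, since an `.npz` file is a zip archive containing one `.npy` entry per array. The sketch below lists tensor names and labels without loading any weights. `summarize_model_dir` is a hypothetical helper, and it assumes `id2label.json` is a flat JSON object mapping IDs to label strings.

```python
import json
import zipfile
from pathlib import Path


def summarize_model_dir(model_dir):
    """List tensor names, config keys, and labels in a converted model directory."""
    model_dir = Path(model_dir)
    # Read only the archive index; no array data is loaded into memory.
    with zipfile.ZipFile(model_dir / "weights.npz") as archive:
        tensor_names = sorted(
            name[:-4] if name.endswith(".npy") else name
            for name in archive.namelist()
        )
    config = json.loads((model_dir / "config.json").read_text())
    # Assumption: a flat {"0": "O", "1": "B-NAME", ...} style mapping.
    id2label = json.loads((model_dir / "id2label.json").read_text())
    return {
        "tensors": tensor_names,
        "config_keys": sorted(config),
        "labels": sorted(set(id2label.values())),
    }
```

For example, `summarize_model_dir("./mlx-models/pii-small")` would return a dictionary describing the directory produced by the manual conversion command.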
Supported Models
Currently, the MLX backend supports BERT-based token classification models:
| Model | Parameters | Status |
|---|---|---|
| OpenMed-PII-SuperClinical-Small-44M-v1 | 44M | Supported |
| OpenMed-PII-SuperClinical-Base-110M-v1 | 110M | Supported |
| Other BERT-based NER models | Varies | Supported |
DeBERTa, ModernBERT, and ELECTRA architectures will be added in future releases.
Fallback Behavior
If MLX is not available (non-Apple hardware, or mlx not installed), OpenMed automatically falls back to the HuggingFace/PyTorch backend. No code changes required.
Conversion Without MLX
You can convert models on any machine (even Linux) — the converter falls back to NumPy format:
```bash
# On Linux CI: produces NumPy .npz (no MLX needed for conversion)
python -m openmed.mlx.convert \
  --model OpenMed/OpenMed-PII-SuperClinical-Small-44M-v1 \
  --output ./mlx-models/pii-small
```
The NumPy .npz files are fully compatible with the MLX backend.