ModelLoader & Pipelines¶
openmed.core.models.ModelLoader is the backbone for all runtime integration. It centralizes Hugging Face discovery, credential management, caching, and pipeline instantiation so you can move between quick experiments and production runners without rewriting glue code.
When to use it¶
- You want to reuse a single tokenizer/pipeline across many documents.
- You need to load multiple models (e.g., disease + pharma) side-by-side.
- You are deploying a service and prefer to hydrate everything at startup.
- You require maximum control over device placement, dtype, batch size, or tokenizer configuration.
Essentials¶
from openmed.core import ModelLoader, OpenMedConfig

# Shared configuration: device placement, cache location, optional Hub token.
config = OpenMedConfig(
    device="cuda",
    cache_dir="~/.cache/openmed",
    hf_token="hf_api_token_if_needed",
)
loader = ModelLoader(config=config)

# Build the pipeline once and reuse it for every document.
pipeline = loader.create_pipeline(
    "disease_detection_superclinical",
    task="token-classification",
    aggregation_strategy="simple",
    use_fast_tokenizer=True,
)
raw = pipeline("Administered paclitaxel alongside trastuzumab.")
- create_pipeline accepts any kwargs supported by transformers.pipeline.
- Tokens are cached per model/config combination; repeated calls reuse the same HF objects.
- ModelLoader.get_max_sequence_length(model_name) infers tokenizer limits when you need manual truncation logic.
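The idea of caching per model/config combination can be pictured with a small sketch. This is not OpenMed's implementation: PipelineCache, build_key, and the factory callable are hypothetical names, and the stand-in factory replaces whatever actually constructs a pipeline so that the example runs without downloading a model.

```python
from typing import Any, Callable, Dict, Hashable, Tuple


def build_key(model_name: str, **kwargs: Any) -> Tuple[Hashable, ...]:
    """Build a hashable cache key from the model name and its sorted kwargs."""
    return (model_name,) + tuple(sorted((k, repr(v)) for k, v in kwargs.items()))


class PipelineCache:
    """Hypothetical cache holding one object per model/config combination."""

    def __init__(self, factory: Callable[..., Any]) -> None:
        self._factory = factory  # assumed to build the expensive object
        self._cache: Dict[Tuple[Hashable, ...], Any] = {}

    def get(self, model_name: str, **kwargs: Any) -> Any:
        key = build_key(model_name, **kwargs)
        if key not in self._cache:  # build only on the first request
            self._cache[key] = self._factory(model_name, **kwargs)
        return self._cache[key]


# Stand-in factory that records each construction.
calls = []


def fake_factory(model_name, **kwargs):
    calls.append(model_name)
    return object()


cache = PipelineCache(fake_factory)
a = cache.get("disease_detection_superclinical", aggregation_strategy="simple")
b = cache.get("disease_detection_superclinical", aggregation_strategy="simple")
c = cache.get("disease_detection_superclinical", aggregation_strategy="first")
```

Here a and b are the same object because the key matches, while c triggers a second construction because one kwarg differs.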
Local model directories¶
If a model has already been downloaded or vendored into your runtime image, pass the directory path directly:
from pathlib import Path

from openmed.core import ModelLoader, OpenMedConfig

# Resolve to an absolute path so the loader sees an existing directory.
model_dir = Path("./models/OpenMed-NER-DiseaseDetect-SuperClinical-434M").resolve()

loader = ModelLoader(OpenMedConfig(device="cpu"))
pipeline = loader.create_pipeline(
    str(model_dir),
    task="token-classification",
    aggregation_strategy="simple",
)
raw = pipeline("Patient has chronic myeloid leukemia.")
Existing filesystem paths are preserved as local paths rather than expanded to the default OpenMed organization. For Hugging Face / PyTorch loading, OpenMed also sets local_files_only=True by default for those paths so the loader does not contact the Hub in air-gapped or firewalled environments.
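The resolution rule described above can be sketched as a small decision function. Everything here is an assumption made for illustration: resolve_model_reference and DEFAULT_ORG are hypothetical names, and the real loader may also consult the model registry before deciding.

```python
import tempfile
from pathlib import Path
from typing import Dict, Union

DEFAULT_ORG = "OpenMed"  # assumption: prefix applied to bare model names


def resolve_model_reference(name: str) -> Dict[str, Union[str, bool]]:
    """Hypothetical resolver: existing local paths win over Hub identifiers."""
    path = Path(name).expanduser()
    if path.exists():
        # Existing path: keep it as a local path and stay offline.
        return {"model": str(path.resolve()), "local_files_only": True}
    if "/" in name:
        # Already an org/model identifier: leave it untouched.
        return {"model": name, "local_files_only": False}
    # Bare name: expand to the default organization.
    return {"model": f"{DEFAULT_ORG}/{name}", "local_files_only": False}


# A temporary directory stands in for a vendored model directory.
with tempfile.TemporaryDirectory() as tmp:
    local = resolve_model_reference(tmp)

remote = resolve_model_reference("some-org/some-model")
bare = resolve_model_reference("some-model")
```

Checking for an existing path first is what keeps a vendored directory from being mistaken for a Hub identifier.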
Discovery helpers¶
from openmed.core import ModelLoader

loader = ModelLoader()

# Registry entries only; no remote discovery.
print(loader.list_available_models(include_registry=True, include_remote=False))

# Registry plus remote discovery; show the first five results.
print(loader.list_available_models(include_registry=True, include_remote=True)[:5])
These functions power openmed.list_models(). Use them to present dropdowns in UIs or to pre-flight deployments before running inference.
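A pre-flight check can be as simple as comparing the models a service needs against the discovered list. The sketch below assumes the discovery call returns a sequence of model-name strings; missing_models is a hypothetical helper and pharma_detection_example is a made-up name used only to trigger the failure branch.

```python
from typing import Iterable, List


def missing_models(required: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return required model names absent from the available list, in order."""
    available_set = set(available)
    return [name for name in required if name not in available_set]


# In a real deployment `available` would come from
# loader.list_available_models(...); a literal list keeps the sketch offline.
available = ["disease_detection_superclinical", "anatomy_detection_electramed"]
required = ["disease_detection_superclinical", "pharma_detection_example"]

missing = missing_models(required, available)
if missing:
    print(f"Pre-flight failed, missing models: {missing}")
```

Running this before the service accepts traffic turns a late "Model not found" error into an early, explicit deployment failure.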
Device & caching strategy¶
- CPU-only deployments: leave device="cpu" and skip installing GPU runtimes. Transformers will use torch CPU wheels.
- GPU deployments: set device="cuda" or cuda:1, and optionally configure torch_dtype="auto" via OpenMedConfig.pipeline.
- Air-gapped or repeated builds: pass an existing local model directory, or point cache_dir at a persistent volume after prefetching the model. Local directories are loaded with local_files_only=True by default.
- Provide hf_token when you consume gated models, or rely on HfFolder credentials.
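One way to keep these settings out of application code is to read them from the environment at startup and validate them before building a config. The sketch below is illustrative only: the OPENMED_DEVICE and OPENMED_CACHE_DIR variable names and both helper functions are hypothetical, and the accepted device strings are limited to the forms mentioned on this page.

```python
import os
import re
from typing import Mapping

# Accepts "cpu", "cuda", or "cuda:<index>".
_DEVICE_PATTERN = re.compile(r"^(cpu|cuda(:\d+)?)$")


def device_from_env(env: Mapping[str, str], default: str = "cpu") -> str:
    """Read a device string from a hypothetical OPENMED_DEVICE variable."""
    device = env.get("OPENMED_DEVICE", default).strip().lower()
    if not _DEVICE_PATTERN.match(device):
        raise ValueError(f"Unsupported device string: {device!r}")
    return device


def cache_dir_from_env(env: Mapping[str, str], default: str = "~/.cache/openmed") -> str:
    """Expand '~' so the resolved cache path is explicit, e.g. in logs."""
    return os.path.expanduser(env.get("OPENMED_CACHE_DIR", default))


# Plain dicts stand in for os.environ to keep the sketch deterministic.
device = device_from_env({"OPENMED_DEVICE": "cuda:1"})
fallback = device_from_env({})
cache_dir = cache_dir_from_env({"OPENMED_CACHE_DIR": "/mnt/models/cache"})
```

The resulting values could then be passed as the device and cache_dir arguments of OpenMedConfig, as in the Essentials example.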
Token helpers¶
If you need raw token alignment, use the tokenization utilities that ship alongside the loader:
from openmed.core import ModelLoader
from openmed.processing import TokenizationHelper

loader = ModelLoader()

# load_model returns the underlying objects; the tokenizer is under "tokenizer".
model_data = loader.load_model("anatomy_detection_electramed")
token_helper = TokenizationHelper(model_data["tokenizer"])
encoding = token_helper.tokenize_with_alignment("BP 120/80. Start metformin 500mg bid.")
print(encoding["tokens"][:10])
load_model lets you access the underlying HF AutoModel + AutoTokenizer for workflows that outgrow pipelines.
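Alignment here means tying each token back to its character span in the original text. The sketch below illustrates that idea with a plain regex tokenizer. It is not how TokenizationHelper is implemented and it does not reproduce a model's subword tokenization; tokenize_with_offsets is a hypothetical helper.

```python
import re
from typing import List, Tuple

# Words (letters, digits, underscore) or single punctuation characters.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize_with_offsets(text: str) -> List[Tuple[str, int, int]]:
    """Return (token, start, end) triples such that text[start:end] == token."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_PATTERN.finditer(text)]


text = "BP 120/80. Start metformin 500mg bid."
aligned = tokenize_with_offsets(text)
for token, start, end in aligned[:5]:
    print(f"{token!r:>8} -> [{start}, {end})")
```

Because every token carries its own offsets, predictions made at token level can be mapped back to exact character ranges in the source note.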
Sentence detection reuse¶
ModelLoader does not run pySBD itself, but it exposes hooks so analyze_text can pass in the tokenizer/pipeline it creates. If you batch many texts with custom segmentation, build the segmenter once and reuse it:
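from openmed.processing import sentences

# Build the segmenter once; constructing it per document wastes time.
segmenter = sentences.create_segmenter(language="en", clean=True)

# docs is your own iterable of input strings.
for doc in docs:
    segments = sentences.segment_text(doc, segmenter=segmenter)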
Troubleshooting checklist¶
- “Tokenizer length mismatch”: call loader.get_max_sequence_length(model_name) and set max_length explicitly.
- “Model not found”: confirm it exists in openmed.core.model_registry.OPENMED_MODELS or pass a full HF path and set include_remote=True if you rely on discovery.
- Slow cold-starts: prefetch pipelines at startup and mount the cache dir on SSD/NVMe storage.
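For the length-mismatch case, manual truncation usually comes down to splitting a long token sequence into windows that fit the model limit. The sketch below shows one way to do that with an optional overlap. chunk_tokens and stride are illustrative names, and the hard-coded max_length stands in for the value that loader.get_max_sequence_length(model_name) would supply.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], max_length: int, stride: int = 0) -> List[List[str]]:
    """Split tokens into windows of at most max_length, overlapping by stride."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    if not 0 <= stride < max_length:
        raise ValueError("stride must satisfy 0 <= stride < max_length")
    step = max_length - stride
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + max_length]))
        if start + max_length >= len(tokens):
            break  # this window already reaches the end of the sequence
    return chunks


tokens = [f"t{i}" for i in range(10)]
max_length = 4  # stand-in; a real value would come from the tokenizer limit
chunks = chunk_tokens(tokens, max_length, stride=1)
```

If the tokenizer adds special tokens, they typically count toward the limit, so it may be safer to pass a window size slightly below the reported maximum.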