Async De-identification Jobs¶
POST /jobs accepts a multi-document de-identification batch and returns a job id immediately. The service processes jobs on a bounded local worker pool and persists only job metadata: status, progress, document hashes, label counts, offsets, span hashes, and webhook delivery status.
Raw submitted text is not written to the local job store. It remains in memory only until a worker finishes the job.
Submit a Job¶
curl -s http://127.0.0.1:8080/jobs \
-H 'Content-Type: application/json' \
-d '{
"documents": [
{"id": "note-1", "text": "Patient Maria Garcia arrived at 09:00."},
{"id": "note-2", "text": "Call Maria Garcia at 555-1212."}
],
"method": "mask",
"webhook": {
"url": "https://pipeline.example.com/openmed/jobs",
"secret": "replace-with-shared-secret",
"max_attempts": 3,
"backoff_seconds": 0.5
}
}'
The response uses HTTP 202 and includes id, status, progress_percent, document_count, and status_url.
Poll Status¶
Status moves from queued to running and then to done or failed. Progress is updated after each document. Terminal records include:
processed_countandfailed_countlabel_histogramspanswithdocument_id, offsets, labels, confidence, and text hashesdocumentswith ids, lengths, and document hasheswebhook_deliveryafter callback attempts complete
The job API does not return raw redacted text. Use synchronous POST /pii/deidentify when a caller needs the redacted payload in the response.
Webhook Payloads¶
Terminal jobs trigger a signed webhook when webhook is provided. The payload contains no raw PHI:
{
"event": "job.done",
"job_id": "<job-id>",
"status": "done",
"progress_percent": 100.0,
"document_count": 2,
"processed_count": 2,
"failed_count": 0,
"label_histogram": {"NAME": 2, "PHONE": 1},
"spans": [
{
"document_id": "note-1",
"start": 8,
"end": 20,
"label": "NAME",
"text_hash": "sha256:...",
"confidence": 0.99
}
],
"documents": [
{"id": "note-1", "length": 42, "text_hash": "sha256:..."}
],
"error": null,
"completed_at": "2026-07-01T10:00:00Z"
}
Webhook requests include:
X-OpenMed-Event:job.doneorjob.failedX-OpenMed-Timestamp: Unix timestampX-OpenMed-Signature:sha256=<hmac>
The signature is computed over <timestamp>.<canonical-json-body> using the configured shared secret and HMAC-SHA256. Non-2xx responses and transport errors are retried with exponential backoff.
Local Store Configuration¶
The default metadata store is ~/.cache/openmed/service-jobs.json. Override it for containers or tests:
OPENMED_SERVICE_JOBS_STORE_PATH=/var/lib/openmed/jobs.json \
OPENMED_SERVICE_JOBS_TTL_SECONDS=86400 \
OPENMED_SERVICE_JOBS_WORKERS=2 \
uvicorn openmed.service.app:app --host 0.0.0.0 --port 8080
OPENMED_SERVICE_JOBS_TTL_SECONDS controls terminal metadata cleanup. OPENMED_SERVICE_JOBS_WORKERS bounds concurrent job execution. The local backend is intended for one service process and does not provide multi-node distribution.