Performance Profiling¶
OpenMed includes built-in performance profiling utilities to measure and optimize your NLP pipelines.
Quick Start¶
from openmed import enable_profiling, get_profile_report, analyze_text
# Enable global profiling
profiler = enable_profiling()
# Run your analysis
result = analyze_text(
"Patient has diabetes mellitus type 2.",
model_name="disease_detection_superclinical",
)
# Get the report
report = get_profile_report()
print(report.format_report())
Profiler Class¶
For fine-grained control:
from openmed import Profiler
profiler = Profiler(enabled=True)
# Measure operations
with profiler.measure("model_loading"):
model = load_model("disease_detection_superclinical")
with profiler.measure("inference"):
result = model(text)
with profiler.measure("postprocessing"):
formatted = format_predictions(result, text)
# Generate report
report = profiler.report()
print(f"Total time: {report.total_duration * 1000:.2f}ms")
Context Managers¶
measure()¶
The primary way to time operations:
profiler = Profiler(enabled=True)
with profiler.measure("my_operation", metadata={"input_size": len(text)}):
result = process(text)
timed()¶
Quick timing with automatic logging:
from openmed import timed
with timed("model inference") as timer:
result = model(text)
print(f"Took {timer.elapsed_ms:.2f}ms")
# Also logs: "model inference took 123.45ms"
Timer Class¶
For manual timing:
from openmed import Timer
timer = Timer()
timer.start()
# Do work
result = analyze_text(text)
elapsed = timer.stop()
print(f"Analysis took {elapsed:.2f}ms")
# Properties
print(f"Seconds: {timer.elapsed_s}")
print(f"Milliseconds: {timer.elapsed_ms}")
Profile Decorator¶
Decorate functions to automatically profile them:
from openmed import profile, enable_profiling, get_profile_report
enable_profiling()
@profile("data_preprocessing")
def preprocess(text):
# preprocessing logic
return cleaned_text
@profile() # Uses function name
def run_inference(text):
return analyze_text(text)
# Call functions
result = run_inference(preprocess(text))
# Get timings
report = get_profile_report()
for timing in report.timings:
print(f"{timing.name}: {timing.duration * 1000:.2f}ms")
Profile Reports¶
ProfileReport Structure¶
report = profiler.report()
# Properties
report.total_duration # Total time in seconds
report.timing_count # Number of measurements
report.timings # List of TimingResult objects
report.metadata # Custom metadata dict
# Methods
report.get_timing("model_loading") # Get specific timing
report.get_timings_by_prefix("inference.") # Filter by prefix
report.summary() # Dict summary
report.to_dict() # Full dict representation
report.format_report() # Human-readable string
Example Output¶
Performance Profile Report
==================================================
Total Duration: 1234.56ms
Operations: 4
Timings:
--------------------------------------------------
inference.forward 856.23ms ( 69.4%)
model_loading 312.45ms ( 25.3%)
postprocessing 45.67ms ( 3.7%)
preprocessing 20.21ms ( 1.6%)
Inference Metrics¶
Track detailed inference metrics:
from openmed.utils.profiling import InferenceMetrics
metrics = InferenceMetrics(
text_length=500,
token_count=120,
entity_count=5,
inference_time_ms=150.0,
preprocessing_time_ms=10.0,
postprocessing_time_ms=5.0,
total_time_ms=165.0,
)
print(f"Tokens/sec: {metrics.tokens_per_second:.0f}")
print(f"Chars/sec: {metrics.chars_per_second:.0f}")
Batch Metrics¶
Aggregate metrics for batch processing:
from openmed.utils.profiling import BatchMetrics, InferenceMetrics
items = [
InferenceMetrics(text_length=100, entity_count=2, total_time_ms=50),
InferenceMetrics(text_length=200, entity_count=3, total_time_ms=80),
InferenceMetrics(text_length=150, entity_count=1, total_time_ms=60),
]
batch = BatchMetrics(items=items, total_time_ms=190)
print(batch.format_report())
# Output:
# Batch Processing Metrics
# ========================================
# Items processed: 3
# Total characters: 450
# Total entities: 6
# Total time: 190.00ms
# Avg time/item: 63.33ms
# Throughput: 15.79 items/sec
# Throughput: 2368 chars/sec
Global Profiling¶
Enable profiling globally for an entire session:
from openmed import (
enable_profiling,
disable_profiling,
get_profile_report,
)
# Enable profiling
profiler = enable_profiling()
# All analyze_text calls will be profiled
result1 = analyze_text(text1)
result2 = analyze_text(text2)
# Get combined report
report = get_profile_report()
print(report.format_report())
# Disable when done
disable_profiling()
Session Tracking¶
Track overall session duration:
profiler = Profiler(enabled=True)
profiler.start() # Start session timer
with profiler.measure("op1"):
do_work()
with profiler.measure("op2"):
do_more_work()
profiler.stop() # Stop session timer
report = profiler.report()
print(f"Session duration: {report.metadata['session_duration_ms']:.2f}ms")
Adding Custom Metadata¶
profiler = Profiler(enabled=True)
# Add metadata
profiler.add_metadata("model", "disease_detection_superclinical")
profiler.add_metadata("batch_size", 32)
# Add manual timing
profiler.add_timing("external_api_call", 0.5, metadata={"endpoint": "/api/v1"})
report = profiler.report()
print(report.metadata)
# {'model': 'disease_detection_superclinical', 'batch_size': 32}
Disabled Profiler (Zero Overhead)¶
When profiling is disabled, all operations are no-ops:
profiler = Profiler(enabled=False)
# These have virtually no overhead
with profiler.measure("operation"):
result = expensive_operation()
report = profiler.report()
assert report.timing_count == 0 # No timings recorded
Best Practices¶
- Enable only when needed: Use
enabled=Falsein production unless actively profiling - Use prefixes for organization:
model.load,model.infer,post.format - Add metadata for context: Include batch sizes, model names, input characteristics
- Profile at multiple levels: Overall pipeline + individual components
- Reset between runs: Call
profiler.reset()for clean measurements