Welcome to the pydftracer documentation!
pydftracer is the Python frontend for DFTracer, a powerful I/O profiling and tracing tool. This library provides Python bindings and utilities to integrate DFTracer profiling capabilities into Python applications, with specialized support for AI/ML frameworks and PyTorch integration.
Features
Python Frontend: Easy-to-use Python API for DFTracer profiler
Type-Safe Decorators: Fully type-checked with mypy, preserves function signatures
Function Decorators: Simple @dft_fn decorator for tracing Python functions
AI/ML Support: Specialized tracing for AI/ML workflows
PyTorch Integration: Capture PyTorch profiler events to DFTracer traces
PyTorch Dynamo Integration: Wrapper around PyTorch’s Dynamo
Automatic I/O Tracing: Transparent tracing of I/O operations when enabled
Debugging Tools: Built-in debugging utilities
Environment Configuration: Flexible configuration via environment variables
Cross-platform: Works on Linux and other Unix-like systems
Contents:
- Installation
- Quick Start
- Examples
- Type Safety
- AI/ML Tracing Guide
- Motivation
- Overview
- Basic Setup
- Data Operations
- Device Operations
- Compute Operations
- Communication Tracing
- Checkpointing
- Training Pipeline
- Advanced Features
- AI/DL Logging Conventions
- Flexible API Styles
- Updating Arguments
- Force Enable or Disable Specific Events
- Hook/Checkpoint Style
- Derivation
- Metadata / Streaming Style
- Init Events
- Caveats
- Summary
- PyTorch Profiler Integration
- PyTorch Dynamo Integration
- Developer Guide
- API Reference
Getting Started
To get started with pydftracer, check out the Installation guide and then follow the Quick Start tutorial.
Installation
pip install pydftracer
For more detailed installation instructions, see Installation.
Quick Example
# Enable DFTracer via environment variable (shell command, run before starting Python)
export DFTRACER_ENABLE=1
from dftracer.python import dftracer, dft_fn
# Initialize the DFTracer logger
df_logger = dftracer.initialize_log("trace.pfw", "/tmp/data", -1)
# Create a tracer for your functions
io_tracer = dft_fn("io_operations")
@io_tracer.log
def read_data(filename):
    with open(filename, 'r') as f:
        return f.read()
# I/O operations will be automatically profiled
data = read_data('data.txt')
# Finalize the logger
df_logger.finalize()
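The example above calls `finalize()` on the last line, so an exception raised earlier would skip it. One way to guard against that is sketched below, using a small context manager that always finalizes. The `finalizing` helper and the `FakeLogger` class are hypothetical stand-ins so the sketch runs without pydftracer installed; the only thing assumed about the real logger is the `finalize()` method shown above. The sketch also sets `DFTRACER_ENABLE` from Python, which assumes the variable is read no earlier than initialization; if in doubt, export it in the shell instead.

```python
import os
from contextlib import contextmanager

# Same switch as `export DFTRACER_ENABLE=1`, set from Python instead.
# Assumption: this runs before the tracer is imported and initialized.
os.environ.setdefault("DFTRACER_ENABLE", "1")


class FakeLogger:
    """Hypothetical stand-in exposing only finalize(), like df_logger."""

    def __init__(self) -> None:
        self.finalized = False

    def finalize(self) -> None:
        self.finalized = True


@contextmanager
def finalizing(logger):
    """Yield the logger and call finalize() even if the body raises."""
    try:
        yield logger
    finally:
        logger.finalize()


logger = FakeLogger()
try:
    with finalizing(logger):
        raise RuntimeError("simulated failure inside traced code")
except RuntimeError:
    pass  # the error still propagates; finalize() has already run
```

With the real library, the object passed to `finalizing` would be the value returned by `dftracer.initialize_log(...)` in the example above.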