Quick Start

This guide shows how to enable pydftracer, initialize a logger, and trace functions, loops, and PyTorch models.

Basic Usage

Enabling DFTracer

DFTracer only records traces when it is enabled. Set the DFTRACER_ENABLE environment variable before running your script:

export DFTRACER_ENABLE=1
python your_script.py
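
When the script is launched from another Python program rather than from a shell, the same variable can be passed through the child's environment. The following is a minimal sketch using only the standard library; the inline child command is a stand-in for your_script.py and simply echoes back the value it sees.

```python
import os
import subprocess
import sys

# Copy the current environment and switch DFTracer on for the child only.
child_env = dict(os.environ, DFTRACER_ENABLE="1")

# Stand-in for `python your_script.py`: the child prints the value it sees.
child_code = "import os; print(os.environ.get('DFTRACER_ENABLE'))"

completed = subprocess.run(
    [sys.executable, "-c", child_code],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)
seen_by_child = completed.stdout.strip()
print(seen_by_child)
```

Passing env to the child leaves the parent process untouched, which helps when only some runs should be traced.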

Initialize DFTracer Logger

Initialize the logger once at the start of your program and finalize it before the program exits:

from dftracer.python import dftracer

# Initialize with log file path, data directory, and process ID
df_logger = dftracer.initialize_log(
    log_file="trace.pfw",
    data_dir="/tmp/data",
    process_id=-1  # -1 for auto process ID
)

# Your code here - I/O operations will be automatically traced

# Always finalize when done
df_logger.finalize()
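
Because finalize() should always run, even when the traced code raises, one option is to wrap initialization in a context manager. The helper below, tracing_session, is a hypothetical name and not part of pydftracer; it only assumes that the initializer returns an object with a finalize() method, as in the example above. A small fake logger is used so the sketch runs without pydftracer installed.

```python
from contextlib import contextmanager


@contextmanager
def tracing_session(initialize, *args, **kwargs):
    """Call initialize(...) and guarantee finalize() on exit (hypothetical helper)."""
    logger = initialize(*args, **kwargs)
    try:
        yield logger
    finally:
        # Runs on normal exit and when the body raises.
        logger.finalize()


class FakeLogger:
    """Stand-in used only to demonstrate the helper without pydftracer."""

    def __init__(self):
        self.finalized = False

    def finalize(self):
        self.finalized = True


fake = FakeLogger()
try:
    with tracing_session(lambda: fake):
        raise RuntimeError("simulated failure in traced code")
except RuntimeError:
    pass

print(fake.finalized)
```

With pydftracer, the same helper would be used as: with tracing_session(dftracer.initialize_log, "trace.pfw", "/tmp/data", -1): followed by the code to trace.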

Function Tracing with dft_fn

Create a dft_fn tracer for a category, then decorate each function you want traced with its log method:

from dftracer.python import dftracer, dft_fn
import numpy as np

# Initialize logger
df_logger = dftracer.initialize_log("trace.pfw", "/tmp/data", -1)

# Create a function tracer
io_tracer = dft_fn("data_io")

@io_tracer.log
def write_data(filename, data):
    np.save(filename, data)

@io_tracer.log
def read_data(filename):
    return np.load(filename)

# Use the traced functions
data = np.ones((100, 100))
write_data('test.npy', data)
result = read_data('test.npy')

df_logger.finalize()
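
To build intuition for what a logging decorator does, here is a plain-Python analogue that records one event per call. It is an illustration of the general decorator pattern only; the names log_calls and events are hypothetical, and the event fields do not reproduce pydftracer's implementation or trace format.

```python
import functools
import time

events = []  # in-memory stand-in for a trace file


def log_calls(category):
    """Return a decorator that records one event per call (illustrative only)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the call even if func raised.
                events.append(
                    {
                        "cat": category,
                        "name": func.__name__,
                        "dur": time.perf_counter() - start,
                    }
                )

        return wrapper

    return decorator


@log_calls("data_io")
def read_data(values):
    return list(values)


result = read_data(range(3))
print(events[0]["cat"], events[0]["name"])
```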

Using Iterators

Wrap an iterable with the tracer's iter() method to track each iteration:

from dftracer.python import dftracer, dft_fn

df_logger = dftracer.initialize_log("trace.pfw", "/tmp/data", -1)
my_tracer = dft_fn("training")

@my_tracer.log
def process_batch(batch_id):
    # Process each batch
    for i in my_tracer.iter(range(10)):
        # Process item i
        pass

process_batch(0)
df_logger.finalize()
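
If pydftracer may be missing on some machines, for example a laptop used for debugging, one option is to fall back to a no-op stand-in so decorated code still runs. The sketch below assumes only the two members used in this guide, log and iter; the fallback class is hypothetical and is not part of pydftracer.

```python
try:
    from dftracer.python import dft_fn
except ImportError:

    class dft_fn:
        """Hypothetical no-op stand-in; not part of pydftracer."""

        def __init__(self, cat, name=None):
            self.cat = cat
            self.name = name

        def log(self, func):
            # Return the function unchanged so decorated code still runs.
            return func

        def iter(self, iterable):
            # Hand back the iterable unchanged; nothing is recorded.
            return iterable


my_tracer = dft_fn("training")


@my_tracer.log
def process_batch(batch_id):
    seen = []
    for i in my_tracer.iter(range(3)):
        seen.append((batch_id, i))
    return seen


batches = process_batch(0)
print(batches)
```

The application code stays identical in both cases; only the import decides whether anything is recorded.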

Environment Configuration

Configure pydftracer using environment variables:

# Enable DFTracer
export DFTRACER_ENABLE=1

# Set initialization mode (PRELOAD or FUNCTION)
export DFTRACER_INIT=PRELOAD

# Set log level (DEBUG, INFO, WARN, ERROR)
export DFTRACER_LOG_LEVEL=INFO

The corresponding values are also importable as constants, so you can check them in your code:

from dftracer.python import (
    DFTRACER_ENABLE,
    DFTRACER_INIT_PRELOAD,
    DFTRACER_LOG_LEVEL
)

if DFTRACER_ENABLE:
    print("DFTracer is enabled")
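
The same settings can also be inspected without importing pydftracer by reading the process environment directly. This is a sketch using only the standard library; the helper name read_dftracer_env, the rule that only "1" counts as enabled, and the None fallbacks are assumptions made for illustration, not pydftracer's documented defaults.

```python
import os


def read_dftracer_env(environ=None):
    """Collect the DFTracer variables from a mapping (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return {
        # Assumption: only the string "1" is treated as enabled.
        "enable": env.get("DFTRACER_ENABLE", "0") == "1",
        # None means the variable was not set.
        "init": env.get("DFTRACER_INIT"),
        "log_level": env.get("DFTRACER_LOG_LEVEL"),
    }


settings = read_dftracer_env(
    {
        "DFTRACER_ENABLE": "1",
        "DFTRACER_INIT": "PRELOAD",
        "DFTRACER_LOG_LEVEL": "INFO",
    }
)
print(settings)
```

Calling read_dftracer_env() with no argument reads the real environment, which is handy for a quick sanity check at the top of a job script.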

AI/ML Tracing

PyTorch Dynamo Integration

For PyTorch applications, use the create_backend function to create a custom DFTracer backend for torch.compile:

import torch
from dftracer.python import dftracer
from dftracer.python.dynamo import create_backend

# Initialize logger
df_logger = dftracer.initialize_log("model_trace.pfw", "/tmp/data", -1)

# Create a custom DFTracer backend
backend = create_backend(
    name="my_model",
    epoch=0,
    step=0,
    enable=True
)

# Use with torch.compile; torch.nn.Linear stands in for your own model
model = torch.nn.Linear(8, 2)
compiled_model = torch.compile(model, backend=backend)

# Run your model - operations will be traced
input_tensor = torch.randn(4, 8)
output = compiled_model(input_tensor)

# Finalize when done
df_logger.finalize()

Dynamo Class

For more control over Dynamo tracing, instantiate the Dynamo class directly:

from dftracer.python import Dynamo

# Create a Dynamo tracer instance
dynamo_tracer = Dynamo(
    name="my_model",
    epoch=1,
    step=100,
    enable=True
)

# Use in your training loop
# The tracer will record PyTorch operations

AI Tracing Features

For AI workloads, DFTracerAI creates a tracer labeled with a category, name, epoch, and step:

from dftracer.python import DFTracerAI

# Create an AI-specific tracer
ai_tracer = DFTracerAI(
    cat="training",
    name="resnet50",
    epoch=5,
    step=1000,
    enable=True
)

Advanced Usage

Custom Tags

Add custom metadata to your traces:

from dftracer.python import dftracer, TagValue, TagDType, TagType

dft = dftracer()

# Create custom tags
tag = TagValue(
    value="my_value",
    dtype=TagDType.STRING,
    tag_type=TagType.KEY
)

# Use in your traced functions

Metadata Events

Record a key-value metadata event in the trace:

from dftracer.python import dftracer

dft = dftracer()

# Log metadata
dft.log_metadata_event("key", "value")

Next Steps

See the API Reference for details on each class and function.
Check the main DFTracer documentation for more advanced features.
Look at the example scripts in the repository.