Text Processing

Namespace: dftracer::utils::utilities::text

struct FilterableLine

A line with a predicate function for filtering.

Public Functions

FilterableLine() = default
inline FilterableLine(fileio::lines::Line l, std::function<bool(const fileio::lines::Line&)> pred)

Public Members

fileio::lines::Line line
std::function<bool(const fileio::lines::Line&)> predicate

class LineFilterUtility : public dftracer::utils::utilities::Utility<FilterableLine, std::optional<Line>>

Utility that filters lines based on a predicate function.

This utility takes a FilterableLine (a line plus a predicate) and returns the line only if it passes the predicate. Otherwise, it returns an empty optional (std::nullopt).

Features:

  • Flexible predicate-based filtering

  • Returns std::optional for composability

  • Can be used in map/filter pipelines

  • Can be tagged with the Retryable and Monitored behaviors. Cacheable is not recommended because the input carries a std::function.

Usage:

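// Assumes this snippet runs inside a coroutine: process() returns a
// coro::CoroTask, so its result is awaited before use.
auto filter = std::make_shared<LineFilterUtility>();

auto predicate = [](const Line& line) {
    return line.content.find("ERROR") != std::string::npos;
};

FilterableLine input{Line{"ERROR: Something went wrong"}, predicate};
auto result = co_await filter->process(input);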

if (result.has_value()) {
    std::cout << "Matched: " << result->content << "\n";
}

Filtering multiple lines:

auto is_error = [](const Line& line) {
    return line.content.find("ERROR") != std::string::npos;
};

Lines input = get_lines();
std::vector<Line> filtered;

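// `filter` is the instance created in the previous example. process() returns
// a coro::CoroTask, so this loop is assumed to run inside a coroutine.
for (const auto& line : input.lines) {
    auto result = co_await filter->process(FilterableLine{line, is_error});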
    if (result.has_value()) {
        filtered.push_back(*result);
    }
}

Public Functions

LineFilterUtility() = default
~LineFilterUtility() override = default
inline coro::CoroTask<std::optional<Line>> process(const FilterableLine &input) override

Filter a line based on predicate.

Parameters:

input – Line with predicate function

Returns:

The line wrapped in std::optional if the predicate returns true; an empty optional otherwise
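
The filtering rule described above can be sketched without the library. The block below is a minimal, synchronous illustration, not the library's implementation: Line and FilterableLine are hypothetical stand-ins that model only the fields used in the usage examples, filter_line is an invented helper name, and treating an unset predicate as "no match" is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>

// Hypothetical stand-in for the library's Line type.
struct Line {
    std::string content;
    std::size_t line_number;
};

// Hypothetical stand-in for FilterableLine: a line plus its predicate.
struct FilterableLine {
    Line line;
    std::function<bool(const Line&)> predicate;
};

// Return the line if the predicate accepts it, otherwise std::nullopt.
// Assumption of this sketch: an unset predicate matches nothing.
std::optional<Line> filter_line(const FilterableLine& input) {
    if (input.predicate && input.predicate(input.line)) {
        return input.line;
    }
    return std::nullopt;
}

// Small usage check of the sketch.
inline void filter_line_demo() {
    auto is_error = [](const Line& line) {
        return line.content.find("ERROR") != std::string::npos;
    };
    assert(filter_line({Line{"ERROR: disk full", 1}, is_error}).has_value());
    assert(!filter_line({Line{"INFO: started", 2}, is_error}).has_value());
}
```

Returning std::optional rather than a bool keeps the accepted line attached to the result, which is what lets the caller chain the output into a later stage without looking the line up again.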

class LineSplitterUtility : public dftracer::utils::utilities::Utility<Text, Lines, utilities::tags::Parallelizable>

Utility that splits text into individual lines.

This utility takes multi-line text and splits it into a vector of lines, preserving line numbers. It can be used standalone or composed in pipelines.

Features:

  • Splits on ‘\n’ characters

  • Assigns line numbers (1-indexed)

  • Handles empty lines

  • Can be tagged with the Cacheable, Retryable, and Monitored behaviors

Usage:

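// Assumes this snippet runs inside a coroutine: process() returns a
// coro::CoroTask<Lines>, so its result is awaited before use.
auto splitter = std::make_shared<LineSplitterUtility>();

Text input("Line 1\nLine 2\nLine 3");
Lines output = co_await splitter->process(input);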

for (const auto& line : output.lines) {
    std::cout << line.line_number << ": " << line.content << "\n";
}

With pipeline:

Pipeline pipeline;
auto task = use(splitter).emit_on(pipeline);
auto output = SequentialExecutor().execute(pipeline, Text{"data"});
auto lines = output.get<Lines>(task.id());

Public Functions

LineSplitterUtility() = default
~LineSplitterUtility() override = default
inline coro::CoroTask<Lines> process(const Text &input) override

Split text into lines.

Parameters:

input – Text to split

Returns:

Lines with line numbers
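
As a rough illustration of the splitting behavior listed under Features (split on ‘\n’, 1-indexed line numbers, empty lines kept), the following sketch uses hypothetical stand-in types and an invented split_lines helper. How the real utility treats a trailing newline or empty input is not stated in this reference; the choices below are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the library's Line and Lines types.
struct Line {
    std::string content;
    std::size_t line_number;
};

struct Lines {
    std::vector<Line> lines;
};

// Split on '\n', numbering lines from 1. Interior empty lines are kept.
// Assumptions of this sketch: a trailing '\n' does not produce an extra
// empty line, and empty input produces no lines.
Lines split_lines(const std::string& text) {
    Lines out;
    std::size_t start = 0;
    std::size_t number = 1;
    while (start < text.size()) {
        std::size_t end = text.find('\n', start);
        if (end == std::string::npos) {
            end = text.size();
        }
        out.lines.push_back(Line{text.substr(start, end - start), number});
        ++number;
        start = end + 1;
    }
    return out;
}

// Small usage check of the sketch.
inline void split_lines_demo() {
    const Lines out = split_lines("Line 1\n\nLine 3");
    assert(out.lines.size() == 3);
    assert(out.lines[1].content.empty());
    assert(out.lines[2].line_number == 3);
}
```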

class MultiLinesFilterUtility : public dftracer::utils::utilities::Utility<Lines, Lines, utilities::tags::Parallelizable>

Utility that filters multiple lines based on a predicate.

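This is the batch counterpart of LineFilterUtility: the predicate is supplied once (in the constructor or via set_predicate) and applied to every line of a Lines input in a single call.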

Public Functions

inline explicit MultiLinesFilterUtility(std::function<bool(const Line&)> predicate)
~MultiLinesFilterUtility() override = default
inline void set_predicate(std::function<bool(const Line&)> predicate)

Set the predicate function.

inline coro::CoroTask<Lines> process(const Lines &input) override

Filter lines based on predicate.

Parameters:

input – Lines to filter

Returns:

Filtered lines (only those passing predicate)
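
The batch filtering described here amounts to a copy-if over the input lines. The sketch below shows that idea with hypothetical stand-in types and an invented free function filter_lines; it is synchronous and says nothing about how the real utility schedules or parallelizes the work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-ins for the library's Line and Lines types.
struct Line {
    std::string content;
    std::size_t line_number;
};

struct Lines {
    std::vector<Line> lines;
};

// Keep only the lines accepted by the predicate, preserving their order
// and their original line numbers.
Lines filter_lines(const Lines& input,
                   const std::function<bool(const Line&)>& predicate) {
    // Calling an empty std::function would throw std::bad_function_call,
    // so the sketch requires a predicate up front.
    assert(predicate && "predicate must be set before filtering");
    Lines out;
    std::copy_if(input.lines.begin(), input.lines.end(),
                 std::back_inserter(out.lines), predicate);
    return out;
}
```

Because the original line numbers are copied along with the content, the filtered result can still be traced back to positions in the source text.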

struct Text

Represents raw text content.

Public Functions

Text() = default
inline explicit Text(std::string str)
inline explicit Text(const char *str)
inline bool empty() const
inline std::size_t size() const

Public Members

std::string content
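
Both converting constructors of Text are declared explicit, so a string is never turned into a Text implicitly. The stand-in below mirrors the declared interface to show what that means at a call site; the member bodies (empty() and size() forwarding to content) are assumptions, since this reference lists only the signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in mirroring the declared Text interface.
struct Text {
    std::string content;

    Text() = default;
    explicit Text(std::string str) : content(std::move(str)) {}
    // The pointer must be non-null: std::string(const char*) requires it.
    explicit Text(const char* str) : content(str) {}

    // Assumed to forward to the underlying std::string.
    bool empty() const { return content.empty(); }
    std::size_t size() const { return content.size(); }
};

// explicit constructors rule out implicit conversions such as
// `Text t = "abc";`, while direct construction remains available.
static_assert(!std::is_convertible<std::string, Text>::value,
              "std::string must not convert implicitly");
static_assert(!std::is_convertible<const char*, Text>::value,
              "const char* must not convert implicitly");
static_assert(std::is_constructible<Text, const char*>::value,
              "direct construction from const char* is allowed");

// Small usage check of the sketch.
inline void text_demo() {
    Text from_literal("data");              // direct-initialization
    Text from_string{std::string("data")};  // direct-list-initialization
    assert(from_literal.size() == 4);
    assert(from_string.content == "data");
    assert(Text{}.empty());
}
```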