JSON Utilities

Namespace: dftracer::utils::utilities::common::json

struct JsonDocGuard

RAII guard that owns a simdjson DOM parser and the document it produces. In simdjson's DOM API the parser manages the document's lifetime internally, so the guard only needs to hold the parser.

Public Functions

JsonDocGuard() = default
inline bool parse(const char *data, std::size_t len)
inline simdjson::dom::element root() const
inline explicit operator bool() const

Public Members

simdjson::dom::parser parser
bool valid = false

class JsonParser

On-Demand JSON parser for zero-copy parsing.

Key design principles:

  1. On-Demand API for lazy field access: only the fields you use are parsed

  2. Parser is reused across rows (internal buffer management)

  3. Zero-copy: string_view points directly into the padded JSON buffer

  4. Forward-only iteration: once a field is accessed, it’s consumed

Usage pattern for batch processing:

JsonParser parser;

for (auto& line : input_lines) {
    // parse() copies to internal padded buffer
    if (!parser.parse(line)) continue;

    // Access fields directly from parser
    auto name = parser.get_string("name");
    auto ts = parser.get_int64("ts");

    // Iterate over 'args' object
    parser.for_each_field("args", [](std::string_view key, simdjson::ondemand::value val) {
        // process nested fields
    });
}

Note

string_view values are only valid until the next parse() call.
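
Because the parser reuses its internal buffer, a std::string_view that must outlive the current row should be copied into an owning std::string before the next parse() call. The sketch below illustrates that rule using only the standard library. ReusedBuffer and collect_names are hypothetical stand-ins, not part of this API, and they make no claim about how JsonParser manages its buffer internally.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in, NOT dftracer's JsonParser: it only mimics the
// "views point into a buffer that the next load overwrites" behaviour.
class ReusedBuffer {
public:
    // Copy the line into the internal buffer and return a view into it.
    // The returned view is invalidated by the next call to load().
    std::string_view load(std::string_view line) {
        buffer_.assign(line.data(), line.size());
        return std::string_view(buffer_);
    }

private:
    std::string buffer_;
};

// Collect values that must survive the loop by taking an owning copy of
// each view before the buffer is reused for the next line.
std::vector<std::string> collect_names(const std::vector<std::string>& lines) {
    ReusedBuffer buf;
    std::vector<std::string> names;
    for (const auto& line : lines) {
        std::string_view view = buf.load(line);  // valid only until next load()
        names.emplace_back(view);                // owning copy, safe to keep
    }
    return names;
}
```

Storing the views themselves in the vector would leave every element except possibly the last pointing at overwritten data.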

Public Functions

explicit JsonParser(std::size_t capacity = DEFAULT_CAPACITY)
JsonParser(const JsonParser&) = delete
JsonParser &operator=(const JsonParser&) = delete
JsonParser(JsonParser&&) = default
JsonParser &operator=(JsonParser&&) = default
bool parse(std::string_view json_line)

Parse a JSON line.

Copies the input to an internal padded buffer for SIMD processing. Previous parse results become invalid after this call.

Parameters:

json_line – The JSON string to parse.

Returns:

true on success, false on parse error.
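
As a rough illustration of what copying into a padded buffer involves, the sketch below places zeroed padding bytes after the JSON text so that wide SIMD loads near the end of the input stay inside allocated memory. copy_padded and kPadding are hypothetical names. The value 64 mirrors simdjson's SIMDJSON_PADDING constant; the actual buffer handling inside parse() is not shown in this reference and may differ.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>
#include <vector>

// Assumption: 64 bytes, matching simdjson's SIMDJSON_PADDING.
constexpr std::size_t kPadding = 64;

// Copy `line` into `buffer`, growing it if needed, and zero the padding
// bytes that follow the text. Returns the length of the JSON text.
// The buffer is never shrunk, so it can be reused across rows.
std::size_t copy_padded(std::string_view line, std::vector<char>& buffer) {
    const std::size_t needed = line.size() + kPadding;
    if (buffer.size() < needed) {
        buffer.resize(needed);
    }
    if (!line.empty()) {
        std::memcpy(buffer.data(), line.data(), line.size());
    }
    std::memset(buffer.data() + line.size(), 0, kPadding);
    return line.size();
}
```

Reusing one buffer across rows avoids a fresh allocation per line, at the cost of invalidating anything that still points into the previous contents.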

bool parse_padded(simdjson::padded_string_view json)

Parse from a pre-padded string, avoiding the copy that parse() makes.

inline bool is_valid() const

Check if current document is valid (last parse succeeded).

std::optional<std::int64_t> get_int64(std::string_view key)
std::optional<std::uint64_t> get_uint64(std::string_view key)
std::optional<double> get_double(std::string_view key)
std::optional<bool> get_bool(std::string_view key)
std::optional<std::string_view> get_string(std::string_view key)
template<typename Fn>
void for_each_field(Fn &&fn)

Iterate over all fields in the root object.

Note

This consumes the document: afterwards, the get_* field access methods return std::nullopt. Call parse() again to re-parse.

Parameters:

fn – Callback: void(std::string_view key, simdjson::ondemand::value val)

template<typename Fn>
bool for_each_field(std::string_view object_key, Fn &&fn)

Iterate over fields of a nested object.

Parameters:
  • object_key – The field containing the nested object.

  • fn – Callback: void(std::string_view key, simdjson::ondemand::value val)

Returns:

true if the object was found and iterated, false otherwise.

void rewind()

Rewind document for re-iteration.

After accessing fields, the document position advances. Call this to reset to the beginning for another pass.
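
The interaction between forward-only access and rewind() can be pictured with a deliberately simplified model. ForwardCursor below is a hypothetical standard-library stand-in, not the simdjson or dftracer implementation: it keeps a position that only moves forward, so lookups after a full iteration find nothing until the position is reset.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a forward-only document. NOT the real API.
class ForwardCursor {
public:
    explicit ForwardCursor(std::vector<std::pair<std::string, int>> fields)
        : fields_(std::move(fields)) {}

    // Visit every remaining field; the position ends up past the last one.
    template <typename Fn>
    void for_each_field(Fn&& fn) {
        for (; pos_ < fields_.size(); ++pos_) {
            fn(fields_[pos_].first, fields_[pos_].second);
        }
    }

    // Search forward from the current position only. Fields that were
    // already passed are not visible until rewind() is called.
    std::optional<int> get(const std::string& key) {
        while (pos_ < fields_.size()) {
            const auto& field = fields_[pos_++];
            if (field.first == key) {
                return field.second;
            }
        }
        return std::nullopt;
    }

    // Reset the position to the beginning for another pass.
    void rewind() { pos_ = 0; }

private:
    std::vector<std::pair<std::string, int>> fields_;
    std::size_t pos_ = 0;
};
```

In this model, calling get() after for_each_field() yields std::nullopt, and the same call succeeds again after rewind().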

inline simdjson::ondemand::document &raw_document()

Get raw document for advanced usage.

inline void set_borrowed_document(simdjson::ondemand::document_reference ref) noexcept

Borrow an externally-owned parsed document.

After this call, for_each_field/rewind/get_* operate on the borrowed reference. The caller must keep the underlying document alive until another parse() / set_borrowed_document() call. Intended for bridging iterate_many output (document_reference) to consumers that accept a JsonParser&.

Public Static Attributes

static constexpr std::size_t DEFAULT_CAPACITY = 1 << 20

class JsonParserUtility : public dftracer::utils::utilities::Utility<JsonParserInput, JsonParserOutput>

Public Functions

inline coro::CoroTask<JsonParserOutput> process(const JsonParserInput &input) override

class JsonValue

Lightweight wrapper around simdjson::dom::element with convenient accessors.

Provides:

  • Fluent chaining: json["args"]["hhash"]

  • Template get<T>() with auto-casting

  • Default values for missing/null fields

  • Minimal overhead: accessors only navigate the underlying element

IMPORTANT: JsonValue is only valid while the simdjson::dom::document is alive.
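
The combination of fluent chaining and default values can be sketched without simdjson. ToyNode, ToyValue, and make_sample below are hypothetical stand-ins built on the standard library; they only illustrate the idea that a missing key yields an empty wrapper, so a chain never fails midway and get<T>() falls back to the supplied default. They are not the JsonValue implementation.

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <variant>
#include <vector>

// Hypothetical tree node standing in for a parsed JSON element.
struct ToyNode {
    std::string key;
    std::variant<std::monostate, std::int64_t, std::string> scalar;
    std::vector<ToyNode> children;  // populated for objects
};

// Hypothetical non-owning wrapper: valid only while the tree is alive.
class ToyValue {
public:
    ToyValue() = default;
    explicit ToyValue(const ToyNode* node) : node_(node) {}

    bool exists() const { return node_ != nullptr; }

    // A missing key returns an empty wrapper instead of failing,
    // which keeps chained lookups safe.
    ToyValue operator[](std::string_view key) const {
        if (node_ != nullptr) {
            for (const ToyNode& child : node_->children) {
                if (child.key == key) return ToyValue(&child);
            }
        }
        return ToyValue();
    }

    // Return the stored value if it has type T, else the default.
    template <typename T>
    T get(const T& default_val = T{}) const {
        if (node_ == nullptr) return default_val;
        if (const T* value = std::get_if<T>(&node_->scalar)) return *value;
        return default_val;
    }

private:
    const ToyNode* node_ = nullptr;
};

// Build {"args": {"hhash": "abc123", "size": 42}} as a toy tree.
ToyNode make_sample() {
    ToyNode hhash{"hhash", std::string("abc123"), {}};
    ToyNode size{"size", std::int64_t{42}, {}};
    ToyNode args{"args", {}, {hhash, size}};
    return ToyNode{"", {}, {args}};
}
```

As with JsonValue and its document, a ToyValue holds only a pointer, so the tree returned by make_sample() must be kept alive for as long as any wrapper is used.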

Public Functions

inline JsonValue()
inline explicit JsonValue(simdjson::dom::element elem)
inline bool is_null() const
inline bool is_bool() const
inline bool is_string() const
inline bool is_uint() const
inline bool is_int() const
inline bool is_number() const
inline bool is_object() const
inline bool is_array() const
inline bool exists() const
inline JsonValue operator[](const char *key) const
inline JsonValue operator[](const std::string &key) const
inline JsonValue operator[](std::string_view key) const
JsonValue at(const char *path) const
JsonValue at(const std::string &path) const
JsonValue at(std::string_view path) const
template<typename T>
inline T get(const T &default_val = T{}) const
template<typename T>
inline std::optional<T> get_optional() const
template<typename Fn>
inline void for_each_member(Fn &&fn) const
inline simdjson::dom::element raw() const
inline explicit operator bool() const

struct JsonValueHelper

Helper to extract a typed value from a simdjson::ondemand::value.

Use it inside for_each_field callbacks to extract values safely; the typed accessors return std::optional.

Public Static Functions

static inline std::optional<std::int64_t> get_int64(simdjson::ondemand::value &val)
static inline std::optional<std::uint64_t> get_uint64(simdjson::ondemand::value &val)
static inline std::optional<double> get_double(simdjson::ondemand::value &val)
static inline std::optional<bool> get_bool(simdjson::ondemand::value &val)
static inline std::optional<std::string_view> get_string(simdjson::ondemand::value &val)
static inline bool is_null(simdjson::ondemand::value &val)
static inline std::optional<simdjson::ondemand::json_type> get_type(simdjson::ondemand::value &val)
static inline std::optional<std::string> to_json_string(simdjson::ondemand::value &val)

struct StringJsonParserInput

Public Members

utilities::text::Text content

Public Static Functions

static coro::CoroTask<StringJsonParserInput> from_file_async(const std::string &file_path)
static StringJsonParserInput from_file(const std::string &file_path)
static StringJsonParserInput from_string(const std::string &json_str)

class StringJsonParserUtility : public dftracer::utils::utilities::Utility<StringJsonParserInput, JsonParserOutput>

Public Functions

coro::CoroTask<JsonParserOutput> process(const StringJsonParserInput &input) override
void reset()