processor | year | arithmetic logic units | SIMD units |
---|---|---|---|
Apple M* | 2019 | 6+ | |
Intel Lion Cove | 2024 | 6 | |
AMD Zen 5 | 2024 | 6 | |
You are probably using simdjson:
Imagine you're building a game server that needs to persist player data.
You start simple:
struct Player {
std::string username;
int level;
double health;
std::vector<std::string> inventory;
};
Without reflection, you may write this tedious code:
// Serialization - converting Player to JSON
std::string json_text = fmt::format(
"{{"
"\"username\":\"{}\","
"\"level\":{},"
"\"health\":{},"
"\"inventory\":{}"
"}}",
escape_json(p.username),
p.level,
std::isfinite(p.health) ? p.health : -1.0, // JSON has no NaN or infinity
p.inventory | std::views::transform(escape_json)
);
// Deserialization - converting JSON back to Player
object obj = json.get_object();
p.username = obj["username"].get_string();
p.level = obj["level"].get_int64();
p.health = obj["health"].get_double();
array arr = obj["inventory"].get_array();
for (auto item : arr) {
p.inventory.emplace_back(item.get_string());
}
struct Equipment {
std::string name;
int damage; int durability;
};
struct Achievement {
std::string title; std::string description; bool unlocked;
std::chrono::system_clock::time_point unlock_time;
};
struct Player {
std::string username;
int level; double health;
std::vector<std::string> inventory;
std::map<std::string, Equipment> equipped; // New!
std::vector<Achievement> achievements; // New!
std::optional<std::string> guild_name; // New!
};
This manual approach has several problems:
How do other languages do it?
string jsonString = JsonSerializer.Serialize(player, options);
Player deserializedPlayer = JsonSerializer.Deserialize<Player>(jsonInput, options);
C# uses reflection to access the attributes of a struct at run time.
// Rust with serde
#[derive(Serialize, Deserialize)] // Annotation is required
pub struct Player {}
let json_str = serde_json::to_string(&player)?;
let player: Player = serde_json::from_str(&json_str)?;
language | runtime reflection | compile-time reflection |
---|---|---|
C++ 26 | | |
Go | | |
Java | | |
C# | | |
Rust | | |
With C++26 reflection and simdjson, all that boilerplate disappears:
// Just define your struct - no extra code needed!
struct Player {
std::string username;
int level;
double health;
std::vector<std::string> inventory;
std::map<std::string, Equipment> equipped;
std::vector<Achievement> achievements;
std::optional<std::string> guild_name;
};
// Serialization - one line!
void save_player(const Player& p) {
std::string json = simdjson::to_json(p); // That's it!
// Save json to file...
}
// Deserialization - one line!
Player load_player(std::string& json_str) {
return simdjson::from(json_str); // That's it!
}
Runnable example at https://godbolt.org/z/Efr7bK9jn
// What you write:
Player p = simdjson::from(runtime_json_string);
// What reflection generates at COMPILE TIME (conceptually):
Player deserialize_Player(const json& j) {
Player p;
p.username = j["username"].get<std::string>();
p.level = j["level"].get<int>();
p.health = j["health"].get<double>();
p.inventory = j["inventory"].get<std::vector<std::string>>();
// ... etc for all members
return p;
}
// Simplified snippet, members stores information about the class
// obtained via std::define_static_array(std::meta::nonstatic_data_members_of(^^T, ...))...
ondemand::object obj;
template for (constexpr auto member : members) {
// These are compile-time constants
constexpr std::string_view field_name = std::meta::identifier_of(member);
constexpr auto member_type = std::meta::type_of(member);
// This generates code for each member
obj[field_name].get(out.[:member:]);
}
See full implementation on GitHub
struct Player {
std::string username; // ← Compile-time: reflection sees this
int level; // ← Compile-time: reflection sees this
double health; // ← Compile-time: reflection sees this
};
// COMPILE TIME: Reflection reads Player's structure and generates:
// - Code to read "username" as string
// - Code to read "level" as int
// - Code to read "health" as double
// RUNTIME: The generated code processes actual JSON data
std::string json = R"({"username":"Alice","level":42,"health":100.0})";
Player p = simdjson::from(json);
// Runtime values flow through compile-time generated code
Try out this example at https://godbolt.org/z/WWGjhnjWW
struct Meeting {
std::string title;
long long start_time;
std::vector<std::string> attendees;
std::optional<std::string> location;
bool is_recurring;
};
// Automatically serializable/deserializable!
std::string json = simdjson::to_json(Meeting{
.title = "CppCon Planning",
.start_time = std::chrono::duration_cast<std::chrono::milliseconds>(
std::chrono::system_clock::now().time_since_epoch()
).count(),
.attendees = {"Alice", "Bob", "Charlie"},
.location = "Denver",
.is_recurring = true
});
Meeting m = simdjson::from(json);
Serializing and parsing basic types and custom classes or structs is now nearly effortless.
How do we automatically serialize ALL these different containers?
std::vector<T>, std::list<T>, std::deque<T>
std::map<K,V>, std::unordered_map<K,V>
std::set<T>, std::array<T,N>
// The OLD way - repetitive and error-prone!
void serialize(string_builder& b, const std::vector<T>& v) { /* ... */ }
void serialize(string_builder& b, const std::list<T>& v) { /* ... */ }
void serialize(string_builder& b, const std::deque<T>& v) { /* ... */ }
void serialize(string_builder& b, const std::set<T>& v) { /* ... */ }
// ... 20+ more overloads for each container type!
Problem: New container type? Write more boilerplate!
Concepts let us say: "If it walks like a duck and quacks like a duck..."
template <typename T>
concept container =
requires(T a) {
{ a.size() } -> std::convertible_to<std::size_t>;
{
a[std::declval<std::size_t>()]
}; // check if elements are accessible for the subscript operator
};
Strings also satisfy this concept (size(), operator[], begin()), so we exclude them:
template <typename T>
concept container_but_not_string =
requires(T a) {
{ a.size() } -> std::convertible_to<std::size_t>;
{
a[std::declval<std::size_t>()]
}; // check if elements are accessible for the subscript operator
} && !std::is_same_v<T, std::string> &&
!std::is_same_v<T, std::string_view> && !std::is_same_v<T, const char *>;
template <class T>
requires(container_but_not_string<T>)
constexpr void atom(string_builder &b, const T &t) {
if (t.size() == 0) {
b.append_raw("[]");
return;
}
b.append('[');
atom(b, t[0]);
for (size_t i = 1; i < t.size(); ++i) {
b.append(',');
atom(b, t[i]);
}
b.append(']');
}
Works with vector, array, deque, custom containers...
To parse into a container we must add elements, but the method name varies: push_back, append, emplace_back, ...
template <typename T>
concept appendable_containers =
(details::supports_emplace_back<T> || details::supports_emplace<T> ||
details::supports_push_back<T> || details::supports_push<T> ||
details::supports_add<T> || details::supports_append<T> ||
details::supports_insert<T>);
template <appendable_containers T, typename... Args>
constexpr decltype(auto) emplace_one(T &vec, Args &&...args) {
if constexpr (details::supports_emplace_back<T>) {
return vec.emplace_back(std::forward<Args>(args)...);
} else if constexpr (details::supports_emplace<T>) {
return vec.emplace(std::forward<Args>(args)...);
} else if constexpr (details::supports_push_back<T>) {
return vec.push_back(std::forward<Args>(args)...);
} else if constexpr (details::supports_push<T>) {
return vec.push(std::forward<Args>(args)...);
} else if constexpr (details::supports_add<T>) {
return vec.add(std::forward<Args>(args)...);
} else if constexpr (details::supports_append<T>) {
return vec.append(std::forward<Args>(args)...);
} else if constexpr (details::supports_insert<T>) {
return vec.insert(std::forward<Args>(args)...);
}
}
// ....
auto arr = json.get_array();
for (auto v : arr) {
concepts::emplace_one(out, v.get<value_type>());
}
When you write:
struct GameData {
std::vector<int> scores; // Array-like → [1,2,3]
std::map<std::string, Player> players; // Map-like → {"Alice": {...}}
MyCustomContainer<Item> items; // Your container → Just works!
};
The magic:
Write once, works everywhere™
3.6 GB/s - 14x faster than nlohmann, 2.1x faster than Serde!
What is Ablation?
From neuroscience: systematically remove parts to understand function
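Our Approach (Apple Silicon M3 Max):
Contribution of an optimization = (Baseline - Disabled) / Disabled, where Baseline is the throughput with every optimization on and Disabled is the throughput with that one optimization turned off.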
Optimization | Twitter Contribution | CITM Contribution |
---|---|---|
Consteval | +100% (2.00x) | +141% (2.41x) |
SIMD Escaping | +42% (1.42x) | +4% (1.04x) |
Fast Digits | +6% (1.06x) | +34% (1.34x) |
The Insight: JSON field names are known at compile time!
Traditional (Runtime):
// Every serialization call:
write_string("\"username\""); // Quote & escape at runtime
write_string("\"level\""); // Quote & escape again!
With Consteval (Compile-Time):
constexpr auto username_key = "\"username\":"; // Pre-computed!
b.append_literal(username_key); // Just memcpy!
The Problem: JSON requires escaping ", \, and control chars
Traditional (1 byte at a time):
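for (char c : str) {
// Cast first: a signed char would make bytes >= 0x80 compare below 0x20
if (c == '"' || c == '\\' || static_cast<unsigned char>(c) < 0x20)
return true;
}
return false;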
SIMD (16 bytes at once):
auto chunk = load_16_bytes(str);
auto needs_escape = check_all_conditions_parallel(chunk);
if (!needs_escape)
return false; // Fast path!
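The snippet above is pseudocode. As a portable stand-in that compiles anywhere, the sketch below checks 8 bytes at a time using ordinary 64-bit integer tricks (often called SWAR) rather than real SIMD instructions. It is not simdjson's kernel; the function names are ours, and a real implementation would use NEON or SSE/AVX intrinsics on 16 bytes or more.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

constexpr std::uint64_t kOnes = 0x0101010101010101ULL; // 0x01 in every byte
constexpr std::uint64_t kHigh = 0x8080808080808080ULL; // 0x80 in every byte

// True if any byte of v is zero.
inline bool has_zero_byte(std::uint64_t v) {
  return ((v - kOnes) & ~v & kHigh) != 0;
}

// True if any byte of x equals n.
inline bool has_byte(std::uint64_t x, unsigned char n) {
  return has_zero_byte(x ^ (kOnes * n));
}

// True if any byte of x is below n (treated as unsigned); requires n <= 128.
inline bool has_byte_below(std::uint64_t x, unsigned char n) {
  return ((x - kOnes * n) & ~x & kHigh) != 0;
}

// Does the string contain '"', '\\', or a control character (< 0x20)?
inline bool needs_escape(std::string_view s) {
  std::size_t i = 0;
  // Fast path: examine 8 bytes per iteration.
  for (; i + 8 <= s.size(); i += 8) {
    std::uint64_t w;
    std::memcpy(&w, s.data() + i, sizeof w);
    if (has_byte(w, '"') || has_byte(w, '\\') || has_byte_below(w, 0x20)) {
      return true;
    }
  }
  // Tail: fewer than 8 bytes remain, check them one by one.
  for (; i < s.size(); ++i) {
    const unsigned char c = static_cast<unsigned char>(s[i]);
    if (c == '"' || c == '\\' || c < 0x20) {
      return true;
    }
  }
  return false;
}
```

Most strings need no escaping, so the common case is a handful of word-sized checks followed by one bulk copy of the untouched input.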
std::to_chars
std::to_string
std::to_string(value).length();
We've observed a 6% slowdown when compiling simdjson with static reflection enabled (clang p2996 experimental branch).
error: invalid use of incomplete type 'std::reflect::member_info<
std::reflect::get_public_data_members_t<Person>[0]>'
in instantiation of function template specialization
'get_member_name<Person, 0>' requested here
note: in instantiation of function template specialization
'serialize_impl<Person>' requested here
note: while substituting template arguments for class template
With reflection and concepts, code is shorter and more general
Fast compile time
Compile-Time optimizations can be awesome
SIMD: String operations benefit
Many optimizations may help
C++ Reflection Paper Authors
Compiler Implementation Teams
Compiler Explorer Team
simdjson Community
Daniel Lemire and Francisco Geiman Thiesen
GitHub: github.com/simdjson/simdjson
Thank you!