Archetype Storage: How Bevy and Flecs Organize Entity Data

What an archetype is

An archetype is a unique combination of component types. Every entity belongs to exactly one archetype at any given time. If Entity A has (Position, Velocity) and Entity B has (Position, Velocity, Sprite), they live in different archetypes because their component sets differ.

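struct Archetype {
    component_ids: Vec<ComponentId>, // the component set that defines this archetype
    columns: Vec<BlobVec>,           // one type-erased column per component id
    entities: Vec<Entity>,           // entities[row] owns row `row` of every column
    len: usize,                      // number of live rows
}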

Each archetype stores its data in a "table" with one column per component type. All columns are the same length. Entity data is stored at matching indices across columns. If entity E is at row 5, then columns[0][5] is its Position, columns[1][5] is its Velocity, and so on.
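To make the row-alignment rule concrete, here is a minimal sketch of a single archetype's table using typed columns. The names PosVelTable, push, and row are hypothetical and chosen for illustration; real engines keep type-erased columns rather than one struct per archetype.

```rust
// Hypothetical, simplified table for the (Position, Velocity) archetype.
// Typed Vecs stand in for the type-erased columns a real engine would use.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Position { x: f32, y: f32 }

#[derive(Clone, Copy, Debug, PartialEq)]
struct Velocity { dx: f32, dy: f32 }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Entity(u32);

#[derive(Default)]
struct PosVelTable {
    entities: Vec<Entity>,
    positions: Vec<Position>,
    velocities: Vec<Velocity>,
}

impl PosVelTable {
    /// Appends one row to every column and returns its row index.
    fn push(&mut self, e: Entity, p: Position, v: Velocity) -> usize {
        self.entities.push(e);
        self.positions.push(p);
        self.velocities.push(v);
        self.entities.len() - 1
    }

    /// Reads one entity's data by using the same index in every column.
    fn row(&self, row: usize) -> (Entity, Position, Velocity) {
        (self.entities[row], self.positions[row], self.velocities[row])
    }
}
```

Because every push touches all columns, the columns cannot drift out of step, which is the invariant the rest of the design relies on.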

Why this layout is fast for queries

When you run a query like Query<(&mut Position, &Velocity)>, the ECS finds every archetype that contains both Position and Velocity. For each matching archetype, it hands you two slices that you iterate in lockstep.

// Schematic: column() and column_mut() stand in for the engine's internal accessors.
for archetype in matching_archetypes {
    let positions = archetype.column_mut::<Position>();  // written
    let velocities = archetype.column::<Velocity>();     // read only

    for i in 0..archetype.len() {
        // Row i of both columns belongs to the same entity.
        positions[i].x += velocities[i].dx;
        positions[i].y += velocities[i].dy;
    }
}

This is an ideal access pattern for modern CPUs. You walk two arrays linearly, adjacent elements sit near each other in memory, and the hardware prefetcher sees the pattern immediately.
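The matching step that precedes the iteration is essentially a subset test: an archetype qualifies when its component set contains everything the query asks for. A minimal sketch follows, assuming component ids are plain integers kept sorted per archetype; the function names matching_archetypes and rows_visited are hypothetical, and real engines cache the result rather than rescanning.

```rust
// Hypothetical sketch: an archetype matches a query when its component set
// contains every component id the query requires.
type ComponentId = u32;

struct Archetype {
    component_ids: Vec<ComponentId>, // kept sorted in this sketch
    len: usize,                      // number of entities stored
}

/// Returns the indices of archetypes that contain all `required` ids.
fn matching_archetypes(archetypes: &[Archetype], required: &[ComponentId]) -> Vec<usize> {
    archetypes
        .iter()
        .enumerate()
        .filter(|(_, a)| {
            required
                .iter()
                .all(|id| a.component_ids.binary_search(id).is_ok())
        })
        .map(|(i, _)| i)
        .collect()
}

/// Total number of rows a query would visit across its matching archetypes.
fn rows_visited(archetypes: &[Archetype], required: &[ComponentId]) -> usize {
    matching_archetypes(archetypes, required)
        .iter()
        .map(|&i| archetypes[i].len)
        .sum()
}
```

Note that the per-entity work never tests for component membership. That decision is made once per archetype, which is why the inner loop can stay a plain linear walk.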

The cost of component changes

Archetypes get expensive when an entity changes its component set. Adding a component moves the entity from its current archetype to a new one. The ECS has to:

Find (or create) the target archetype
Copy every component from the old row to a new row in the target archetype
Remove the old row (swap-remove from the source archetype)
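Update the entity-to-archetype index, both for the entity that moved and for whichever entity swap-remove relocated into the vacated row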

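void add_component(Entity e, ComponentId id, void* data) {
    auto& src = get_archetype(e);
    auto& dst = find_or_create_archetype(src.type | id);  // src's components plus id

    const size_t src_row = src.row_of(e);
    const size_t dst_row = dst.allocate_row();

    // Copy each existing component into the matching column of dst
    // (memcpy assumes trivially copyable components).
    for (auto& col : src.columns) {
        if (dst.has_column(col.id)) {
            memcpy(dst.column(col.id).ptr(dst_row),
                   col.ptr(src_row),
                   col.element_size);
        }
    }
    dst.column(id).write(dst_row, data);

    // swap_remove moves the last row into src_row, so the entity that lived
    // in that last row needs its index entry updated as well.
    // (The `row` field name on the index entry is assumed here.)
    const size_t last_row = src.len - 1;
    const Entity moved = src.entities[last_row];
    src.swap_remove(src_row);
    if (src_row != last_row) {
        entity_index[moved].row = src_row;
    }
    entity_index[e] = { dst.id, dst_row };
}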

For an entity with 10 components, adding one more means copying 10 component values. If you're doing this to thousands of entities per frame, it adds up fast.
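The swap-remove step has a side effect that is easy to miss: it relocates the last row of the source archetype, so that entity's index entry goes stale unless it is patched. Here is a small runnable sketch of a move between two single-column tables; Table, EntityIndex, and move_entity are hypothetical names, and the single f32 column stands in for however many components the entity carries.

```rust
use std::collections::HashMap;

type Entity = u32;

// Hypothetical one-column table: `entities[row]` owns `values[row]`.
#[derive(Default)]
struct Table {
    entities: Vec<Entity>,
    values: Vec<f32>,
}

/// Where each entity currently lives: (table index, row).
type EntityIndex = HashMap<Entity, (usize, usize)>;

/// Moves `e` into table `dst`, keeping the entity index consistent.
fn move_entity(tables: &mut [Table], index: &mut EntityIndex, e: Entity, dst: usize) {
    let (src, row) = index[&e];

    // swap_remove fills the gap with the last row in O(1).
    let value = tables[src].values.swap_remove(row);
    tables[src].entities.swap_remove(row);

    // If another entity was swapped into `row`, its index entry is now stale.
    if let Some(&moved) = tables[src].entities.get(row) {
        index.insert(moved, (src, row));
    }

    // Append to the destination and record the new location of `e`.
    tables[dst].entities.push(e);
    tables[dst].values.push(value);
    index.insert(e, (dst, tables[dst].entities.len() - 1));
}
```

With real archetypes the value copy happens once per shared column, which is where the per-component cost described above comes from.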

Archetype graph edges

Mature ECS implementations cache the "add component X" and "remove component X" transitions between archetypes as graph edges. The first time an entity moves from (Position, Velocity) to (Position, Velocity, Sprite), the ECS does a hash lookup on the target component set and creates the archetype if it does not exist yet. It then caches the result as an edge, so later moves along the same transition follow a direct pointer.

Operation	First time	Subsequent
Add component	Hash lookup + maybe create archetype	Follow cached edge
Remove component	Hash lookup	Follow cached edge
Query match	Scan all archetypes	Cached archetype list

Bevy and Flecs both use this pattern. The archetype graph converges quickly because most games settle on a small, stable set of component combinations, so nearly every transition after warm-up hits a cached edge.
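The edge cache can be sketched as a per-archetype map from component id to target archetype. This is an illustration under stated assumptions, not Bevy's or Flecs's actual data structure: World, add_target, find_or_create, and the slow_lookups counter are all hypothetical, and only "add" edges are shown.

```rust
use std::collections::HashMap;

type ComponentId = u32;
type ArchetypeId = usize;

struct Archetype {
    component_ids: Vec<ComponentId>,              // sorted, no duplicates
    add_edges: HashMap<ComponentId, ArchetypeId>, // cached "add X" transitions
}

#[derive(Default)]
struct World {
    archetypes: Vec<Archetype>,
    by_components: HashMap<Vec<ComponentId>, ArchetypeId>,
    slow_lookups: usize, // counts uncached transitions, for illustration only
}

impl World {
    /// Looks up the archetype for a component set, creating it if needed.
    fn find_or_create(&mut self, mut ids: Vec<ComponentId>) -> ArchetypeId {
        ids.sort_unstable();
        ids.dedup();
        if let Some(&id) = self.by_components.get(&ids) {
            return id;
        }
        let id = self.archetypes.len();
        self.archetypes.push(Archetype {
            component_ids: ids.clone(),
            add_edges: HashMap::new(),
        });
        self.by_components.insert(ids, id);
        id
    }

    /// Returns the archetype reached from `from` by adding `component`.
    fn add_target(&mut self, from: ArchetypeId, component: ComponentId) -> ArchetypeId {
        if let Some(&to) = self.archetypes[from].add_edges.get(&component) {
            return to; // fast path: follow the cached edge
        }
        // Slow path: build the new component set, hash it, then cache the edge.
        self.slow_lookups += 1;
        let mut ids = self.archetypes[from].component_ids.clone();
        ids.push(component);
        let to = self.find_or_create(ids);
        self.archetypes[from].add_edges.insert(component, to);
        to
    }
}
```

The slow path allocates and hashes a whole component set, while the fast path is a single small map lookup, which is the difference the table above summarizes.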

Table and sparse set hybrids

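Bevy supports both storage strategies: tables, as described above, and sparse sets. Components marked with #[component(storage = "SparseSet")] live in a per-component sparse set instead of a table column. Adding or removing one still changes the entity's archetype, but the entity keeps its table row, so none of its table-stored components are copied.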

The heuristic: if a component is frequently added/removed (like a "Damaged" marker or "Selected" flag), sparse set storage avoids the move cost. If a component is long-lived and iterated in tight loops (like Position, Velocity), table storage gives better iteration throughput.
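To show why sparse set insertion and removal are cheap, here is a minimal sparse set keyed by entity id. It is a sketch, not Bevy's implementation: the SparseSet type and its field names are hypothetical, and entity ids are assumed to be small integers that can index a Vec directly.

```rust
// Hypothetical sparse set keyed by entity id.
struct SparseSet<T> {
    sparse: Vec<Option<usize>>, // entity id -> index into `dense`
    entities: Vec<u32>,         // dense index -> entity id
    dense: Vec<T>,              // packed component values
}

impl<T> SparseSet<T> {
    fn new() -> Self {
        Self { sparse: Vec::new(), entities: Vec::new(), dense: Vec::new() }
    }

    /// Inserts or overwrites the value for `entity`. No other component moves.
    fn insert(&mut self, entity: u32, value: T) {
        let e = entity as usize;
        if e >= self.sparse.len() {
            self.sparse.resize(e + 1, None);
        }
        let slot = self.sparse[e];
        match slot {
            Some(i) => self.dense[i] = value, // overwrite in place
            None => {
                self.sparse[e] = Some(self.dense.len());
                self.entities.push(entity);
                self.dense.push(value);
            }
        }
    }

    /// Removes the value for `entity` in O(1) using swap-remove.
    fn remove(&mut self, entity: u32) -> Option<T> {
        let i = self.sparse.get_mut(entity as usize)?.take()?;
        let value = self.dense.swap_remove(i);
        self.entities.swap_remove(i);
        // Re-point the entity that was swapped into slot `i`, if any.
        if let Some(&moved) = self.entities.get(i) {
            self.sparse[moved as usize] = Some(i);
        }
        Some(value)
    }

    /// Looks up the value for `entity` through one extra indirection.
    fn get(&self, entity: u32) -> Option<&T> {
        let i = (*self.sparse.get(entity as usize)?)?;
        self.dense.get(i)
    }
}
```

The trade-off is visible in get: every access goes through the sparse array first, so iterating several sparse components together is less cache-friendly than walking table columns in lockstep.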

Practical advice

If you are using Bevy or Flecs, the main thing to watch for is unnecessary archetype fragmentation. If every entity has a slightly different component set, you end up with thousands of archetypes holding a few entities each. Queries still work, but the cache benefits of long contiguous arrays fade because each query now hops between many short columns.

Common causes of fragmentation:

Using many boolean marker components where an enum or bitflag component would fit
Splitting per-entity configuration across several tiny components
Adding debug/editor-only components in production builds
Creating many unique entity relationships, which is a more niche case but can over-fragment the archetypes that queries have to visit

The fix is usually to merge related small components into a single larger one, or to use sparse set storage for components that create excessive fragmentation.
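As a rough illustration of the first fix, the sketch below packs three boolean states into one flags component and compares how many distinct component sets each modeling choice produces. Status, its flag constants, and the two counting functions are hypothetical and exist only for this example.

```rust
use std::collections::HashSet;

// Hypothetical status flags packed into one component instead of three
// separate marker components.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
struct Status(u8);

impl Status {
    const DAMAGED: u8 = 1 << 0;
    const SELECTED: u8 = 1 << 1;
    const STUNNED: u8 = 1 << 2;

    fn set(&mut self, flag: u8) { self.0 |= flag; }
    fn clear(&mut self, flag: u8) { self.0 &= !flag; }
    fn has(self, flag: u8) -> bool { self.0 & flag != 0 }
}

/// Archetype count if each flag were its own marker component: every
/// distinct flag combination in use is a distinct component set.
fn archetypes_with_markers(states: &[Status]) -> usize {
    states.iter().map(|s| s.0).collect::<HashSet<u8>>().len()
}

/// Archetype count with a single `Status` component: all entities share
/// one component set regardless of which flags are set.
fn archetypes_with_flags(states: &[Status]) -> usize {
    if states.is_empty() { 0 } else { 1 }
}
```

With n independent markers the worst case is 2^n archetypes, while the flags version stays at one. The cost is that a query can no longer select "only Damaged entities" structurally and has to test the flag per entity instead.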
