The Partridge Family were neither partridges nor a family. Discuss.
Post Reply
Posts: 27
Joined: February 12th, 2017, 1:50 pm


Post by Florian3321 » June 12th, 2019, 8:13 am

Hi there!
I am trying to implement an ECS (Entity Component System). Maybe someone has experience with that. And here comes the part I do not really get:
You have some kind of manager to store systems and components "ordered" by their type. And in the systems, you go through all components that the system needs to get its job done and it manipulates the components. If the system only uses one component type its fine in terms of cache, because the compoents that belong to that type are all stored "together". But if you use more then one, you constantly have to switch between component vectors because you get the first one of the first component type, the first one of the second and so one. Thats bad of the cache, isnt it? Another approach is to store the components for a system packed, in an AOS fashion. If a system uses for example a pos and move component, the vector would like this: pos0, move0, pos1, move1, ... and not like pos0, pos1, ... and move0, move1. But now the problem is that if two systems use the same component, this falls apart, because you cannot pack them together like that, because the same component is needed in different arrays and that does not work, because the data has to be the same to all systems. So I do not really get it. Maybe someone can clarify!


User avatar
Posts: 190
Joined: November 14th, 2014, 2:03 am


Post by cyboryxmen » June 12th, 2019, 9:32 am

The CPU is able to handle multiple streams at once. Streaming through 2 arrays is a piece of cake. Quad streaming should be perfectly fine too. Eventually though, you will hit a limit as to how many streams you can go through at once.

But yeah, traversing through 1 array will ultimately be more performant than traversing through 2 arrays. If the Physics system needs to traverse through the Position and Physics components but the Rendering system only needs to traverse through the Position components, you'll still get better performance by packing the two components into one. The overhead of having an additional array traversal is so great, the "wasted" cache space the Rendering system incurs by needlessly fetching the Physics component is negated. Not to mention that eventually, you'll want to create/destroy entities and erasing elements from one array is so much faster than erasing elements from multiple arrays. Imagine if an entity has 5 components. That's 5 arrays you'll have to go through.

Ultimately, it's a balancing act that you need to be aware of when optimising these programs. Always be ready to benchmark your program to see what works and what doesn't.

Posts: 4262
Joined: February 28th, 2013, 3:23 am
Location: Oklahoma, United States


Post by albinopapa » June 12th, 2019, 4:53 pm

The cpu reads system memory in groups of 64 bytes at a time. If you are only extracting a Vec3f ( 12 bytes ) then you should still fine switching arrays since the next 48 bytes will still be in L1 cache. In the AoS example, this works out for physics stuffs as you need to touch a position ( 12 bytes ), a velocity ( 12 bytes ) and a bounding volume ( 24 bytes ). That's 48 bytes for 3D and even if you need to include orientation in that, you'll only be at 60 bytes. Assuming you pack your data in this manner ( BoundingBox, Vec3f, Vec3f, Vec3f ), all that should be included in the cache line read.

One approach some take is to pass by value or copy the data to a struct before passing it to the function and return a struct so you only pay for system memory access at the beginning of the function. This way all the necessary data is fresh in CPU cache and the function should run as quick as possible.

As cyboryxmen states, if you have 5 arrays, searching each array for the element from which you want to erase is a bit of a waste when you can search one array and erase all components at once. If you can use a "swap and pop" technique, then you will be all the better off. Swap and pop puts your elements out of order by swapping the end with the element needing to be erased then popping the last one from the vector/array. This way all the other elements after the one being erased doesn't need to be moved to lower positions in the array. If you need the elements to remain in order, then this won't help any.
If you think paging some data from disk into RAM is slow, try paging it into a simian cerebrum over a pair of optical nerves. -

Post Reply