Map operations with AVX

cyboryxmen
Posts: 190
Joined: November 14th, 2014, 2:03 am

Post by cyboryxmen » November 23rd, 2018, 11:05 am

A map operation is an array operation that applies a transformation to each individual element of the array.

Code:

for(auto& object : objects)
{
    object.update();
}
I benchmarked several implementations of a map operation that updates each game object's acceleration, velocity and position, using both scalar and vectorized operations. Each test updates 8'000'000 game objects.
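
For reference, a scalar update of this kind might look like the sketch below. The post does not show the actual update code, so the integration scheme (semi-implicit Euler), the fixed timestep `dt`, the constant acceleration, and the names `update` and `update_all` are all assumptions made for illustration.

```cpp
#include <vector>

struct vector3
{
    float x;
    float y;
    float z;
};

struct object
{
    vector3 position;
    vector3 velocity;
    vector3 acceleration;
};

// Hypothetical update step (not taken from the repository):
// velocity advances by acceleration, then position by the new velocity.
// Acceleration is left unchanged here for simplicity.
inline void update(object& o, float dt)
{
    o.velocity.x += o.acceleration.x * dt;
    o.velocity.y += o.acceleration.y * dt;
    o.velocity.z += o.acceleration.z * dt;

    o.position.x += o.velocity.x * dt;
    o.position.y += o.velocity.y * dt;
    o.position.z += o.velocity.z * dt;
}

// The map operation itself: apply the same transformation to every element.
inline void update_all(std::vector<object>& objects, float dt)
{
    for(auto& o : objects)
    {
        update(o, dt);
    }
}
```

Calling `update_all(objects, dt)` once per frame on a `std::vector<object>` holding the 8'000'000 objects would correspond to one pass of the scalar benchmark under these assumptions.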

Here is a link to the repository containing all of these implementations. Each implementation has its own clearly labeled branch, so you can try them yourself and see how they perform. Be sure to run them from a command prompt; otherwise the application will start, print the results, and exit immediately. Also, your CPU needs to be able to do AVX and FMA operations for the vectorized benchmarks to be meaningful.
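
If you are unsure whether your CPU qualifies, a quick check could look like the sketch below. It is not part of the repository: the function name `cpu_has_avx_and_fma` is made up, and it assumes GCC or Clang on x86, where the `__builtin_cpu_supports` builtin is available. On any other compiler or target it simply reports `false`.

```cpp
// Returns true when the running CPU reports both AVX and FMA.
// Assumption: GCC or Clang targeting x86; everything else falls back to false.
inline bool cpu_has_avx_and_fma()
{
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx") && __builtin_cpu_supports("fma");
#else
    return false; // unknown here; an MSVC build would have to query __cpuid instead
#endif
}
```

A benchmark could call this once at startup and skip the vectorized runs when it returns `false`.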

tl;dr: this

Code:

struct vector3
{
    float x;
    float y;
    float z;
};

struct object
{
    vector3 position;
    vector3 velocity;
    vector3 acceleration;
};
is faster than this

Code:

auto position_x = array<float>{};
auto position_y = array<float>{};
auto position_z = array<float>{};

auto velocity_x = array<float>{};
auto velocity_y = array<float>{};
auto velocity_z = array<float>{};

auto acceleration_x = array<float>{};
auto acceleration_y = array<float>{};
auto acceleration_z = array<float>{};
and when using vectorized operations, this

Code:

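#include <immintrin.h> // declares __m256

struct simd_vector
{
    using underlying_type = __m256; // one 256-bit AVX register
    using element_type = float;

    // Number of floats per register: 32 bytes / 4 bytes = 8.
    static constexpr auto size = sizeof(underlying_type) / sizeof(element_type);

    // Aligned to 32 bytes so aligned AVX loads and stores can be used.
    alignas(sizeof(underlying_type)) element_type elements[size];
};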

struct vector3
{
    simd_vector x;
    simd_vector y;
    simd_vector z;
};

struct object
{
    vector3 acceleration;
    vector3 velocity;
    vector3 position;
};
is still faster than this

Code:

// _aligned_malloc is MSVC-specific (<malloc.h>); release each buffer with _aligned_free.
float* acceleration_x = static_cast<float*>(_aligned_malloc(total_gameobjects * sizeof(float), sizeof(simd_vector)));
float* acceleration_y = static_cast<float*>(_aligned_malloc(total_gameobjects * sizeof(float), sizeof(simd_vector)));
float* acceleration_z = static_cast<float*>(_aligned_malloc(total_gameobjects * sizeof(float), sizeof(simd_vector)));

float* velocity_x = static_cast<float*>(_aligned_malloc(total_gameobjects * sizeof(float), sizeof(simd_vector)));
float* velocity_y = static_cast<float*>(_aligned_malloc(total_gameobjects * sizeof(float), sizeof(simd_vector)));
float* velocity_z = static_cast<float*>(_aligned_malloc(total_gameobjects * sizeof(float), sizeof(simd_vector)));

float* position_x = static_cast<float*>(_aligned_malloc(total_gameobjects * sizeof(float), sizeof(simd_vector)));
float* position_y = static_cast<float*>(_aligned_malloc(total_gameobjects * sizeof(float), sizeof(simd_vector)));
float* position_z = static_cast<float*>(_aligned_malloc(total_gameobjects * sizeof(float), sizeof(simd_vector)));
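
To make the blocked layout above concrete, here is a rough sketch of how such blocks could be updated and indexed. It uses plain per-lane loops rather than intrinsics so that it compiles without AVX flags. The names `object_block`, `multiply_add`, `update_all` and `position_x`, the hard-coded lane count of 8, and the Euler-style update are assumptions for illustration, not the code from the repository. Storing the over-aligned blocks in a `std::vector` assumes C++17.

```cpp
#include <cstddef>
#include <vector>

// Eight floats, sized and aligned like one 256-bit AVX register.
struct simd_vector
{
    static constexpr std::size_t size = 8;
    alignas(32) float elements[size];
};

struct vector3
{
    simd_vector x;
    simd_vector y;
    simd_vector z;
};

// One block holds every field of eight game objects back to back.
struct object_block
{
    vector3 acceleration;
    vector3 velocity;
    vector3 position;
};

static_assert(sizeof(object_block) == 9 * 32, "nine 32-byte vectors per block");

// Hypothetical helper: a += b * dt for each of the eight lanes.
inline void multiply_add(simd_vector& a, const simd_vector& b, float dt)
{
    for(std::size_t lane = 0; lane < simd_vector::size; ++lane)
    {
        a.elements[lane] += b.elements[lane] * dt;
    }
}

// Updates eight objects per iteration, one block at a time.
inline void update_all(std::vector<object_block>& blocks, float dt)
{
    for(auto& b : blocks)
    {
        multiply_add(b.velocity.x, b.acceleration.x, dt);
        multiply_add(b.velocity.y, b.acceleration.y, dt);
        multiply_add(b.velocity.z, b.acceleration.z, dt);

        multiply_add(b.position.x, b.velocity.x, dt);
        multiply_add(b.position.y, b.velocity.y, dt);
        multiply_add(b.position.z, b.velocity.z, dt);
    }
}

// Object i lives in block i / 8, lane i % 8.
inline float& position_x(std::vector<object_block>& blocks, std::size_t i)
{
    return blocks[i / simd_vector::size].position.x.elements[i % simd_vector::size];
}
```

In an AVX and FMA build, each `multiply_add` call would map to a single `_mm256_fmadd_ps` on registers loaded with `_mm256_load_ps`. In this layout, all nine vectors a block needs sit in one contiguous 288-byte region, whereas the nine separate arrays are read as nine independent streams.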
Zekilk

chili
Site Admin
Posts: 3948
Joined: December 31st, 2011, 4:53 pm
Location: Japan

Re: Map operations with AVX

Post by chili » November 23rd, 2018, 1:55 pm

get rekt DOD XD
Chili

albinopapa
Posts: 4373
Joined: February 28th, 2013, 3:23 am
Location: Oklahoma, United States

Re: Map operations with AVX

Post by albinopapa » November 23rd, 2018, 10:18 pm

I had made a similar claim in the Efficiency ( Not a Tutorial ) thread.
If you think paging some data from disk into RAM is slow, try paging it into a simian cerebrum over a pair of optical nerves. - gameprogrammingpatterns.com
