thesmallcreeper wrote: No, it doesn't. I use a Piledriver over here :p
In my case, every time the program creates a new thread for std::async I get a 0.7 ms hit :/
In my code I just emplace_back the std::async calls into a vector of std::future.
I didn't use emplace_back(); I presized the vector and just looped over the new elements.
Code:
// One future per job; each job returns a std::pair<sse, sse>.
std::vector<std::future<std::pair<sse, sse>>> futures( 8 );
const int workload = int( input.size() );
// Elements per job, rounded down to a multiple of 8.
const int job_size = ( workload / 8 ) & ~7;
// { begin, end } offsets into input for each job.
// Note: elements past 8 * job_size are not covered by these ranges.
std::pair<int, int> job_sizes[ 8 ] =
{
    { 0 * job_size, ( 0 * job_size ) + job_size },
    { 1 * job_size, ( 1 * job_size ) + job_size },
    { 2 * job_size, ( 2 * job_size ) + job_size },
    { 3 * job_size, ( 3 * job_size ) + job_size },
    { 4 * job_size, ( 4 * job_size ) + job_size },
    { 5 * job_size, ( 5 * job_size ) + job_size },
    { 6 * job_size, ( 6 * job_size ) + job_size },
    { 7 * job_size, ( 7 * job_size ) + job_size }
};
for( int i = 0; i < 8; ++i )
{
    const sse* beg_iter =
        reinterpret_cast< const sse* >( &input[ 0 ] + job_sizes[ i ].first );
    const sse* end_iter =
        reinterpret_cast< const sse* >( &input[ 0 ] + job_sizes[ i ].second );
    // Assign into the presized slot instead of calling emplace_back().
    futures[ i ] = std::async( run_threaded, beg_iter, end_iter );
}
In case you're wondering, sse is just an alias for __m128i, and the pair<int,int> array is leftover debug code. Now that I look at it, since I know the thread count up front, I should have just used a C array for the futures instead of a vector, like I did with the pair<int,int> array.
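For what it's worth, here is a rough sketch of what the fixed-size version might look like. It is not my actual code: sum_chunk, parallel_sum and kJobs are made-up names, plain int stands in for sse so it compiles without SSE headers, and I've assumed you want std::launch::async and want the last job to pick up the remainder that the rounding drops.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for run_threaded: sums one half-open chunk [beg, end).
long long sum_chunk(const int* beg, const int* end)
{
    return std::accumulate(beg, end, 0LL);
}

// Job count known at compile time, so a fixed-size array is enough.
constexpr std::size_t kJobs = 8;

long long parallel_sum(const std::vector<int>& input)
{
    const std::size_t workload = input.size();
    // Elements per job, rounded down to a multiple of 8 as in the post.
    const std::size_t job_size = (workload / kJobs) & ~std::size_t{7};
    assert(job_size * kJobs <= workload);

    std::array<std::future<long long>, kJobs> futures;
    const int* base = input.data();

    for (std::size_t i = 0; i < kJobs; ++i)
    {
        const int* beg = base + i * job_size;
        // Assumption: the last job also takes the remainder lost to rounding.
        const int* end = (i + 1 == kJobs) ? base + workload : beg + job_size;
        // std::launch::async requests asynchronous execution (no deferral).
        futures[i] = std::async(std::launch::async, sum_chunk, beg, end);
    }

    // get() blocks until each job has finished.
    long long total = 0;
    for (auto& f : futures)
        total += f.get();
    return total;
}
```

Calling parallel_sum on a vector runs the eight jobs and adds up their results. The offsets are computed inside the loop, so the separate pair<int,int> table isn't needed.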
If you think paging some data from disk into RAM is slow, try paging it into a simian cerebrum over a pair of optical nerves. - gameprogrammingpatterns.com