Multithreading: Atomics and you!
Posted: April 15th, 2018, 8:21 am
"Atomicity" refers to the property of a series of changes that, once made in a particular order, will be observed by anyone watching in that exact same order. These changes as a whole are referred to as an "atomic operation". It seems like a really redundant term at first, since that's already what happens naturally on a single thread. However, in the crazy new world of parallelism, what actually happens will not always be what you actually see. Take this example:

main.cpp

static int x = 0;

Thread 1

// If thread 2 changed x first.
if ( x == 15 )
{
    std::cout << "Thread 2 executed first!\n";
}
x = 24;

Thread 2

// If thread 1 changed x first.
if ( x == 24 )
{
    std::cout << "Thread 1 executed first!\n";
}
x = 15;

It's entirely possible for thread 1 to see that x is 15 while thread 2 simultaneously sees that x is 24, resulting in the console outputting both of these messages. To the observer reading those messages, it's almost as if x has both the value of 24 and the value of 15! (Strictly speaking, two threads reading and writing a plain int without synchronisation is a data race, which is undefined behaviour in C++.) There are plenty of examples like this online, each more mind-bending than the previous one. Rather than trying to understand how this could happen, it's best to just accept the fact and find ways to work around it.

The changes you make in one thread will always be observed, by that same thread, in the order you made them. The problem comes when another thread tries to observe those changes. Not only is it not guaranteed to see the changes happen in the order the original thread made them, the changes it then makes may in turn be observed in a different order by the original thread too. Fortunately for us, C++ already has a construct that helps us maintain the atomicity of our changes across threads. Meet std::atomic! An atomic is a variable that guarantees that the changes made to it by any thread are part of a single atomic operation shared among all threads. In other words, the changes any thread makes to it will be observed to happen in the same order by all threads. Not only that, it can also be customised to give you several additional levels of guarantees. std::memory_order_relaxed is the most basic guarantee an atomic provides, and it is more than enough to fix our previous example:

main.cpp

static std::atomic<int> x { 0 };

Thread 1

// If thread 2 changed x first.
if ( x.load ( std::memory_order_relaxed ) == 15 )
{
    std::cout << "Thread 2 executed first!\n";
}
x.store ( 24, std::memory_order_relaxed );

Thread 2

// If thread 1 changed x first.
if ( x.load ( std::memory_order_relaxed ) == 24 )
{
    std::cout << "Thread 1 executed first!\n";
}
x.store ( 15, std::memory_order_relaxed );

Either one of these two messages will be output, or neither will, depending on how fast each thread is at setting x. No unreasonable outcome where both of them are printed! In other situations, though, std::memory_order_relaxed is often not a strong enough guarantee to work with. This is because std::memory_order_relaxed only guarantees that the changes made to the atomic variable itself are part of the same atomic operation. It does not extend that guarantee to other changes made around them.

main.cpp

static int value = 0;
static std::atomic<bool> written { false };

Writer Thread

value = 15;
// Tell the reader thread that we have written the value.
written.store ( true, std::memory_order_relaxed );

Reader Thread

// Wait until the value is written by the writer thread.
while ( written.load ( std::memory_order_relaxed ) == false )
{
    continue;
}
std::cout << value << '\n';

By right, the reader thread should only ever observe value as 15, but it's entirely possible for it to observe value as 0 instead. Since std::memory_order_relaxed does not maintain the order of changes made to variables other than the atomic, the reader thread could observe the change to written before observing the change to value. You might be wondering whether making value an atomic variable would fix this problem. It wouldn't, since changes made to one atomic are part of a different atomic operation than changes made to another atomic. In this case, we need a stronger guarantee than what std::memory_order_relaxed provides. Enter std::memory_order_acquire and std::memory_order_release. When a thread makes a change to an atomic with std::memory_order_release, the atomic "releases" all changes that thread made before it. A thread that then reads that same atomic with std::memory_order_acquire "acquires" all changes released through it by any thread. Knowing this, we can rewrite the code as such:

Writer Thread

value = 15;
// Release our changes to the reader thread.
written.store ( true, std::memory_order_release );

Reader Thread

// Acquire the changes made by the writer thread.
while ( written.load ( std::memory_order_acquire ) == false )
{
    continue;
}
std::cout << value << '\n';

This is a common synchronisation technique in multithreading: release-acquire pairs are used to "propagate" changes made in one thread to other threads through a single atomic. A "release operation" is useless without an accompanying "acquire operation" and vice versa. You usually start by acquiring the changes made by other threads and then releasing the changes you made yourself. If, however, you ever want to acquire and release in a single change, you can use std::memory_order_acq_rel as demonstrated:

main.cpp

std::vector<Object> objects;
// Set this to the number of worker threads before launching them.
static std::atomic<std::size_t> num_threads_left { 0 };

All threads

void update_objects ( const std::size_t begin, const std::size_t end )
{
    // Update the part of the object vector assigned to this thread.
    for ( auto i = begin; i != end; ++i )
    {
        objects [ i ].update ( );
    }
    // Acquire changes other threads have made. Then, release our changes.
    const auto old_value = num_threads_left.fetch_sub ( 1, std::memory_order_acq_rel );
    const auto current_num_threads_left = old_value - 1;
    // If we are the last thread to finish.
    if ( current_num_threads_left == 0 )
    {
        std::cout << "All threads have finished! Returning to main thread!\n";
    }
    // Otherwise, wait until all threads have finished.
    else
    {
        while ( num_threads_left.load ( std::memory_order_acquire ) != 0 )
        {
            continue;
        }
    }
}

There is one std::memory_order that gives a stronger guarantee than any other in C++. A change made with std::memory_order_seq_cst not only performs an acquire and a release, it also combines the atomic operations of every atomic variable that uses std::memory_order_seq_cst into a single global atomic operation: a single total order of changes that all threads agree on. This only works, however, if these atomic variables use std::memory_order_seq_cst for all reads and writes. You cannot make half the reads and writes with only std::memory_order_relaxed or std::memory_order_release and the other half with std::memory_order_seq_cst. Personally, I haven't encountered a situation where I ever needed such a strong guarantee. It is, however, used by people who synchronise threads with multiple atomics instead of one. Here is an example of its use from cppreference.com:

main.cpp

#include <thread>
#include <atomic>
#include <cassert>

std::atomic<bool> x = {false};
std::atomic<bool> y = {false};
std::atomic<int> z = {0};

void write_x()
{
    x.store(true, std::memory_order_seq_cst);
}

void write_y()
{
    y.store(true, std::memory_order_seq_cst);
}

void read_x_then_y()
{
    while (!x.load(std::memory_order_seq_cst))
        ;
    if (y.load(std::memory_order_seq_cst)) {
        ++z;
    }
}

void read_y_then_x()
{
    while (!y.load(std::memory_order_seq_cst))
        ;
    if (x.load(std::memory_order_seq_cst)) {
        ++z;
    }
}

int main()
{
    std::thread a(write_x);
    std::thread b(write_y);
    std::thread c(read_x_then_y);
    std::thread d(read_y_then_x);
    a.join(); b.join(); c.join(); d.join();
    assert(z.load() != 0); // will never happen
}

Then there's std::memory_order_consume... we don't talk about std::memory_order_consume...

If you made it this far, congrats! You are one of the top 5% of programmers that actually understand the niche that is lock-free programming. Most people would just use std::mutex to "block" threads, forcing only one thread to be able to read and write shared values at a time. There's no denying the power std::atomic provides by giving you the ability to make changes without having to block any thread. Used correctly, it can vastly improve the performance of your multithreaded code, since every thread can continue to do work without blocking the others from progressing. Used incorrectly, however, it can be much slower than simply blocking and, more importantly, may cause huge, untraceable bugs. Like all tools in C++, it must be used wisely.