Asynchronous Multi-Thread Design Pattern with C++
Xiahua Liu October 19, 2024 #Design Pattern #C++
This post does NOT talk about existing async frameworks in C++.
We will talk about the basic concept of the asynchronous multi-thread design pattern in C++ and provide a simple example program.
In C++ there is no built-in language support for something like the asynchronous channels in Rust, and C++ beginners often find themselves lost when planning a multithreaded program.
Asynchronous Design Pattern
In traditional synchronous programming, the parent and child threads are usually interlocked with some kind of synchronization mechanism, such as a state machine. The parent and the child threads share access to the same memory for communication.
The problem with the synchronous pattern is that you need to program the synchronization steps very carefully, otherwise there will be data races and deadlocks. Moreover, as the software grows, the synchronization complexity grows rapidly with every new step that is added. This causes serious scalability issues, so most modern systems have already ditched this design pattern.
The asynchronous pattern requires the parent not to interrupt the child thread at all and to interact with the child threads only through the given I/Os. The typical working process is:
- Parent thread sends a task data structure to one or a group of child threads.
- A child thread unblocks upon receiving task data and starts running the task.
- The child thread sends the result back to the parent thread and waits for the next task.
- After the parent thread finishes dealing with other tasks, it blocks and waits for the result it asked for a moment ago.
You may wonder where the synchronization steps are. Synchronization is done by the parent thread only. You can add states to the child thread as well, but this is generally bad practice because these steps add unnecessary complexity.
Typically parent and child threads communicate through FIFO queues so the data can be temporarily stored in the queue while the reader is busy doing something else.
Example C++ Program
Here is an example program: the main thread wants to distribute math calculations to a group of worker threads (assume the calculation is just the square() function).
We can first define the FIFO queue as a structure with push() and pop() functions. Both are protected by the mutex_, so the queue can be used by both parent and child threads.
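A minimal sketch of such a channel is shown below. Only the names WorkerChannel_T, push(), pop(), notify() and mutex_ appear in this post; the condition variable, the member names cv_ and queue_, and the exact signatures are assumptions made for illustration, not the original listing.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical reconstruction: everything except the names WorkerChannel_T,
// push(), pop(), notify() and mutex_ is assumed.
template <typename T>
class WorkerChannel_T {
public:
    // Store one item. The reader is only woken once notify() is called.
    void push(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::move(value));
    }

    // Wake one thread that is blocked in pop().
    void notify() { cv_.notify_one(); }

    // Block until an item is available, then remove and return it.
    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return !queue_.empty(); });
        assert(!queue_.empty());  // guaranteed by the wait predicate
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }

private:
    std::mutex mutex_;            // protects queue_
    std::condition_variable cv_;  // signals "queue_ is not empty"
    std::queue<T> queue_;         // FIFO storage
};
```

Because pop() waits on a predicate, an item that is pushed before the reader starts waiting is never lost: the reader sees a non-empty queue and returns immediately.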
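Now let's define the worker thread as a function returning void. It blocks until a task arrives on its input channel, runs the task, and pushes the result to its output channel.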
Now for our main thread, we can simply push the tasks into the input channel and later pop the results from the output channel.
We can share the same WorkerChannel_T<float> among more than one worker, since only one of them can get the lock and pop() a task from the queue. The output channel is shared between them as well.
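The whole flow, with several workers sharing one input and one output channel, might look like the sketch below. It includes a compact channel so the listing is self-contained. The function name square_all and the use of an empty std::optional as a stop signal are assumptions for illustration; the post itself uses WorkerChannel_T<float> and does not say how the workers are shut down.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Compact channel; only the names WorkerChannel_T, push(), notify(), pop()
// and mutex_ come from the post.
template <typename T>
class WorkerChannel_T {
public:
    void push(T v) { std::lock_guard<std::mutex> lk(mutex_); queue_.push(std::move(v)); }
    void notify() { cv_.notify_one(); }
    T pop() {
        std::unique_lock<std::mutex> lk(mutex_);
        cv_.wait(lk, [this] { return !queue_.empty(); });
        T v = std::move(queue_.front());
        queue_.pop();
        return v;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> queue_;
};

float square(float x) { return x * x; }

// Worker: block on the input channel, run the task, send the result back.
// An empty optional is an assumed stop signal.
void worker(WorkerChannel_T<std::optional<float>>& in_ch, WorkerChannel_T<float>& out_ch) {
    while (std::optional<float> task = in_ch.pop()) {
        out_ch.push(square(*task));
        out_ch.notify();
    }
}

// Parent: distribute the inputs, then block until every result is back.
std::vector<float> square_all(const std::vector<float>& inputs, unsigned n_workers) {
    assert(n_workers > 0);
    WorkerChannel_T<std::optional<float>> in_ch;
    WorkerChannel_T<float> out_ch;
    std::vector<std::thread> workers;
    for (unsigned i = 0; i < n_workers; ++i)
        workers.emplace_back(worker, std::ref(in_ch), std::ref(out_ch));

    for (float x : inputs) { in_ch.push(x); in_ch.notify(); }
    // The parent is free to do other work here.
    std::vector<float> results;
    for (std::size_t i = 0; i < inputs.size(); ++i) results.push_back(out_ch.pop());

    // One stop signal per worker, then wait for all of them to exit.
    for (unsigned i = 0; i < n_workers; ++i) { in_ch.push(std::nullopt); in_ch.notify(); }
    for (auto& t : workers) t.join();
    return results;
}
```

With more than one worker the results come back in whatever order the workers finish, so the caller should not assume that results[i] belongs to inputs[i].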
You can find the whole example C++ code on Godbolt.org.
If you are familiar with other async frameworks, you will notice that:
- The in_ch.push() then in_ch.notify() combination is essentially the same as creating a promise or future in an async framework.
- The out_ch.pop() call is essentially the same as await. The parent thread blocks in this call until the results become available.
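For comparison, here is roughly how the same two steps look with the standard library's std::async and std::future. This is only meant to illustrate the analogy; the function name square_via_future is hypothetical.

```cpp
#include <cassert>
#include <future>

float square(float x) { return x * x; }

float square_via_future(float x) {
    // Comparable to in_ch.push() followed by in_ch.notify():
    // hand the task off and keep a handle to the pending result.
    std::future<float> result = std::async(std::launch::async, square, x);
    assert(result.valid());  // std::async returns a future with shared state

    // The caller could do other work here.

    // Comparable to out_ch.pop(): block until the result is available.
    return result.get();
}
```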
We just created a tiny framework in a few lines with std::thread! However, modern async frameworks are usually more complicated.
The executor in the framework could be a thread pool or coroutines instead of a system thread.
One Step Forward
As shown in the example above, you can share a FIFO channel among multiple workers. Because we use a mutex to protect the queue, it is safe for all workers to block on one channel.
It is relatively easy to design the worker thread, since it typically only needs to work on a single I/O pair.
However, you may also notice that the parent thread can only block on a single channel. What if there are multiple channels that the parent thread needs to watch simultaneously?
Serialized Wait
This is the most straightforward answer: the parent just waits for the results one by one.
It may sound stupid, but it is actually very useful. In total, the parent waits at most as long as the slowest child takes to finish its task.
result1 = out1_ch.pop();
result2 = out2_ch.pop();
result3 = out3_ch.pop();
result4 = out4_ch.pop();
After a serialized wait like the one shown above finishes, the parent is guaranteed to have all the data ready for the next step.
This pattern is usually used when the parent thread needs to have a set of data ready before moving on.
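A serialized wait can be sketched as follows, with one output channel per child. The function name serialized_wait and the sleep that stands in for real work are assumptions for illustration. Each child reports its own delay as its result, which makes it easy to see that the parent receives the results in channel order even though the children finish in a different order.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Compact channel; only the names WorkerChannel_T, push(), notify(), pop()
// and mutex_ come from the post.
template <typename T>
class WorkerChannel_T {
public:
    void push(T v) { std::lock_guard<std::mutex> lk(mutex_); queue_.push(std::move(v)); }
    void notify() { cv_.notify_one(); }
    T pop() {
        std::unique_lock<std::mutex> lk(mutex_);
        cv_.wait(lk, [this] { return !queue_.empty(); });
        T v = std::move(queue_.front());
        queue_.pop();
        return v;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> queue_;
};

// Start one child per delay, then wait for the results channel by channel.
std::vector<int> serialized_wait(const std::vector<int>& delays_ms) {
    assert(!delays_ms.empty());
    std::vector<WorkerChannel_T<int>> out_chs(delays_ms.size());
    std::vector<std::thread> children;
    for (std::size_t i = 0; i < delays_ms.size(); ++i) {
        children.emplace_back([&out_chs, &delays_ms, i] {
            // The sleep stands in for the child's real work.
            std::this_thread::sleep_for(std::chrono::milliseconds(delays_ms[i]));
            out_chs[i].push(delays_ms[i]);
            out_chs[i].notify();
        });
    }

    // Serialized wait: pop the channels one by one, in a fixed order.
    std::vector<int> results;
    for (auto& ch : out_chs) results.push_back(ch.pop());

    for (auto& t : children) t.join();
    return results;
}
```

Since all children run concurrently, the loop of pop() calls takes about as long as the slowest child, not the sum of all delays.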
Wait Until the First Result Arrives
There is another scenario in which we want the parent thread to wait only until the first result is available.
The easiest solution is actually quite simple: we just need to use std::variant as the storage type of a channel and share that channel among the different threads. In the parent thread we can use std::visit to process the result further.
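A minimal sketch of this idea follows. The result types (int and std::string), the Describe visitor and the function name wait_first are all hypothetical and only serve to show one channel carrying results of different types.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <variant>

// Compact channel; only the names WorkerChannel_T, push(), notify(), pop()
// and mutex_ come from the post.
template <typename T>
class WorkerChannel_T {
public:
    void push(T v) { std::lock_guard<std::mutex> lk(mutex_); queue_.push(std::move(v)); }
    void notify() { cv_.notify_one(); }
    T pop() {
        std::unique_lock<std::mutex> lk(mutex_);
        cv_.wait(lk, [this] { return !queue_.empty(); });
        T v = std::move(queue_.front());
        queue_.pop();
        return v;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> queue_;
};

// Assumed result types: one child reports an int, the other a string.
using Result = std::variant<int, std::string>;

// Visitor with one overload per alternative stored in the variant.
struct Describe {
    std::string operator()(int v) const { return "int: " + std::to_string(v); }
    std::string operator()(const std::string& s) const { return "text: " + s; }
};

// Both children share one output channel; the parent handles whichever
// result arrives first.
std::string wait_first() {
    WorkerChannel_T<Result> out_ch;
    std::thread a([&out_ch] { out_ch.push(Result{42}); out_ch.notify(); });
    std::thread b([&out_ch] { out_ch.push(Result{std::string("done")}); out_ch.notify(); });

    Result first = out_ch.pop();  // unblocks as soon as either child reports
    assert(!first.valueless_by_exception());

    a.join();
    b.join();  // the second result stays in the queue and is discarded
    return std::visit(Describe{}, first);
}
```

Which child wins is decided by the scheduler, so wait_first() may return either description.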
However, this easiest solution is static: the set of result types in the std::variant is fixed at compile time, so you cannot change the channel at runtime.
Alternatively, you can have a dedicated channel. When a child pushes a new result, it uses this dedicated channel to report which channel number is ready for the parent thread to read.
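The dedicated channel could be sketched like this. The names ready_ch and wait_any are hypothetical, and each child simply reports the square of its channel number as its result.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Compact channel; only the names WorkerChannel_T, push(), notify(), pop()
// and mutex_ come from the post.
template <typename T>
class WorkerChannel_T {
public:
    void push(T v) { std::lock_guard<std::mutex> lk(mutex_); queue_.push(std::move(v)); }
    void notify() { cv_.notify_one(); }
    T pop() {
        std::unique_lock<std::mutex> lk(mutex_);
        cv_.wait(lk, [this] { return !queue_.empty(); });
        T v = std::move(queue_.front());
        queue_.pop();
        return v;
    }
private:
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<T> queue_;
};

// Returns (channel number, result) pairs in the order the results arrived.
std::vector<std::pair<std::size_t, int>> wait_any(std::size_t n_children) {
    assert(n_children > 0);
    std::vector<WorkerChannel_T<int>> out_chs(n_children);
    WorkerChannel_T<std::size_t> ready_ch;  // carries channel numbers only
    std::vector<std::thread> children;
    for (std::size_t i = 0; i < n_children; ++i) {
        children.emplace_back([&out_chs, &ready_ch, i] {
            out_chs[i].push(static_cast<int>(i * i));  // publish the result first
            out_chs[i].notify();
            ready_ch.push(i);                          // then announce the channel
            ready_ch.notify();
        });
    }

    std::vector<std::pair<std::size_t, int>> arrived;
    for (std::size_t k = 0; k < n_children; ++k) {
        std::size_t id = ready_ch.pop();  // blocks until any child reports
        // The result was pushed before the announcement, so this pop() returns at once.
        arrived.emplace_back(id, out_chs[id].pop());
    }
    for (auto& t : children) t.join();
    return arrived;
}
```

The order matters on the child side: the result has to be in its own channel before the channel number is announced, otherwise the parent could block on a channel that is still empty.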