# Barrier Synchronization

Barrier synchronization is a problem that comes up in high-performance parallel computing (the Harmony model checker itself uses it). A barrier is almost the opposite of a critical section: the intention is to get a group of threads to run some code at the same time, instead of having them execute it one at a time. More precisely, with barrier synchronization the threads execute in rounds. Between rounds there is a so-called barrier, where each thread waits until all threads have completed the previous round and reached the barrier before any of them starts the next round. For example, in an iterative matrix algorithm the matrix may be cut up into fragments, with one thread per fragment. During a round the threads run concurrently, and the next round is not allowed to start until all threads have finished processing their fragments.

A barrier is used as follows:

• b = Barrier(n): initialize a barrier b for a collection of n threads;

• bwait(?b): wait until all threads have reached the barrier.

barriertest.hny

    import barrier

    const NROUNDS = 4
    const NTHREADS = 3

    round = [0,] * NTHREADS
    barr = barrier.Barrier(NTHREADS)

    invariant (max(round) - min(round)) <= 1

    def thread(self):
        for r in {0..NROUNDS-1}:
            barrier.bwait(?barr)
            round[self] += 1

    for i in {0..NTHREADS-1}:
        spawn thread(i)


Figure 21.1 is a test program for barriers. It uses an integer array round with one entry per thread. Each thread, in a loop, waits for all threads to get to the barrier before incrementing its round number. If the barrier works as advertised, two threads should never be more than one round apart.
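The same test can be mirrored in ordinary Python (a sketch, not from the text, with Python's `threading.Barrier` standing in for the Harmony barrier module):

```python
import threading

NTHREADS, NROUNDS = 3, 4
round_of = [0] * NTHREADS                  # one round counter per thread
barrier = threading.Barrier(NTHREADS)      # stands in for barrier.Barrier(n)
violations = []                            # invariant violations, if any

def worker(self_id):
    for _ in range(NROUNDS):
        barrier.wait()                     # stands in for bwait(?barr)
        round_of[self_id] += 1
        # the invariant: threads are never more than one round apart
        if max(round_of) - min(round_of) > 1:
            violations.append(tuple(round_of))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NTHREADS)]
for t in threads: t.start()
for t in threads: t.join()
assert violations == [] and round_of == [NROUNDS] * NTHREADS
```

Unlike Harmony, a single Python run only samples a few interleavings, so passing this test is weaker evidence than a model checker exploring all of them.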

When implementing a barrier, a complication to worry about is that a barrier can be used over and over again. If this were not the case, then a solution based on a lock, a condition variable, and a counter initialized to the number of threads could be used. The threads would decrement the counter and wait on the condition variable until the counter reaches 0.
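A sketch of that single-use solution in Python (the class name `OneShotBarrier` and its methods are illustrative, not from the text; Python's `Condition` bundles the lock and the condition variable):

```python
import threading

class OneShotBarrier:
    """A barrier that works exactly once: lock + condvar + counter."""
    def __init__(self, n):
        self.cond = threading.Condition()   # bundles the lock and the condvar
        self.left = n                       # threads that still must arrive

    def wait(self):
        with self.cond:
            self.left -= 1
            if self.left == 0:
                self.cond.notify_all()      # last arrival releases everyone
            else:
                while self.left > 0:        # re-check: spurious wakeups
                    self.cond.wait()

# usage: three threads meet at the barrier once
b = OneShotBarrier(3)
arrived = []
def party(i):
    b.wait()
    arrived.append(i)
ts = [threading.Thread(target=party, args=(i,)) for i in range(3)]
for t in ts: t.start()
for t in ts: t.join()
assert sorted(arrived) == [0, 1, 2]
```

Because `.left` never returns to `n`, a second round of `wait` calls would fall straight through; making the barrier reusable is exactly what the `.cycle` counter in Figure 21.2 is for.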

barrier.hny

    from synch import *

    def Barrier(required):
        result = {
            .mutex: Lock(), .cond: Condition(),
            .required: required, .left: required, .cycle: 0
        }

    def bwait(b):
        acquire(?b->mutex)
        b->left -= 1
        if b->left == 0:
            b->cycle = (b->cycle + 1) % 2
            b->left = b->required
            notifyAll(?b->cond)
        else:
            let cycle = b->cycle:
                while b->cycle == cycle:
                    wait(?b->cond, ?b->mutex)
        release(?b->mutex)


Figure 21.2 shows how one might implement a reusable barrier. Besides a counter .left that counts how many threads still have to reach the barrier, it uses a counter .cycle that is incremented after each use of the barrier---to deal with the complication above. The last thread that reaches the barrier restores .left to the number of threads (.required) and increments the cycle counter. The other threads are waiting for the cycle counter to be incremented. The cycle counter is allowed to wrap around---in fact, a single bit suffices for the counter.
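For comparison, here is a Python analogue of the same scheme (illustrative, not the book's code; Python's `Condition` plays the roles of both `.mutex` and `.cond`):

```python
import threading

class CyclicBarrier:
    """Reusable barrier in the style of Figure 21.2 (a Python analogue)."""
    def __init__(self, required):
        self.cond = threading.Condition()   # acts as .mutex plus .cond
        self.required = required            # number of participating threads
        self.left = required                # threads yet to reach the barrier
        self.cycle = 0                      # flips 0/1 on each use

    def bwait(self):
        with self.cond:
            self.left -= 1
            if self.left == 0:
                self.cycle = (self.cycle + 1) % 2   # a single bit suffices
                self.left = self.required           # reset for the next use
                self.cond.notify_all()
            else:
                cycle = self.cycle
                while self.cycle == cycle:          # wait for the bit to flip
                    self.cond.wait()
```

Capturing the current `cycle` before sleeping is the crucial step: a woken thread cannot confuse the release of this use of the barrier with the release of the next one.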

barriertest2.hny

    import barrier

    const NROUNDS = 4
    const NTHREADS = 3

    round = [0,] * NTHREADS
    barr = barrier.Barrier(NTHREADS)
    phase = 0

    invariant (max(round) - min(round)) <= 1

    def thread(self):
        for r in {0..NROUNDS-1}:
            if self == 0:                # coordinator prepares
                phase += 1
            barrier.bwait(?barr)         # enter parallel work
            round[self] += 1
            assert round[self] == phase
            barrier.bwait(?barr)         # exit parallel work

    for i in {0..NTHREADS-1}:
        spawn thread(i)
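The second test program (barriertest2.hny) uses two bwait calls per round, a common pattern: the first ensures the coordinator (thread 0) has updated phase before any thread reads it, and the second keeps threads from leaving the round early, so the coordinator cannot change phase while others are still checking it. A Python sketch of the same pattern (illustrative, not from the text):

```python
import threading

NTHREADS, NROUNDS = 3, 4
round_of = [0] * NTHREADS
phase = 0                                   # written only by thread 0
barr = threading.Barrier(NTHREADS)          # reusable, like Figure 21.2
failures = []

def worker(self_id):
    global phase
    for _ in range(NROUNDS):
        if self_id == 0:                    # coordinator prepares the round
            phase += 1
        barr.wait()                         # enter parallel work
        round_of[self_id] += 1
        if round_of[self_id] != phase:      # mirrors the Harmony assert
            failures.append((self_id, round_of[self_id], phase))
        barr.wait()                         # exit parallel work

threads = [threading.Thread(target=worker, args=(i,)) for i in range(NTHREADS)]
for t in threads: t.start()
for t in threads: t.join()
assert failures == []
```

Without the second barrier, thread 0 could loop around and bump phase while a slow thread was still comparing its round counter against the old value.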

## Exercises

21.1 Implement barrier synchronization for N threads with just three binary semaphores. Busy waiting is not allowed. Can you implement barrier synchronization with two binary semaphores? (As always, the Little Book of Semaphores is a good resource for solving synchronization problems with semaphores. Look for the double turnstile solution.)
21.2 Imagine a pool hall with N tables. A table is full from the time there are two players until both players have left. When someone arrives, they can join a table that is not full, preferably one that has a player ready to start playing. Implement a simulation of such a pool hall.