- 6.1 About Ethernet
- 6.2 About Hubs, Switches, and Routers
- 6.3 About TCP/IP
- 6.4 About Packets
- 6.5 About Remote Procedure Calls (RPCs)
- 6.6 Slop
- 6.7 Observing Network Traffic
- 6.8 Sample RPC Message Definition
- 6.9 Sample Logging Design
- 6.10 Sample Client-Server System Using RPCs
- 6.11 Sample Server Program
- 6.12 Spinlocks
- 6.13 Sample Client Program
- 6.14 Measuring One Sample Client-Server RPC
- 6.15 Postprocessing RPC Logs
- 6.16 Observations
- 6.17 Summary
- Exercises
6.12 Spinlocks
The spinlock protecting a software critical section is our fifth shared resource, alongside the four hardware resources: CPU, memory, disk, and network. There are many forms of software locks; the spinlock is the simplest. Whenever a thread cannot acquire the lock for a critical section of code, it simply loops (spins), trying over and over to acquire the lock, until the thread holding the lock eventually frees it and the waiting thread successfully grabs it. Chapter 27 discusses locks in more detail.
The sample server code defines a C++ SpinLock class, which acquires a spinlock in its constructor and frees the lock in its destructor. Thus the code pattern
    LockAndHist some_lock_name;
    ...
    {
      SpinLock sp(some_lock_name);
      <critical section code here>
      ...
    }
makes the inner block a critical section that can be executed by only one thread at a time. The C++ constructor/destructor mechanism for SpinLock guarantees that the lock some_lock_name is acquired upon entry to the block and released upon exit from the block, even on an unexpected or exception exit. This design completely removes one source of programming error: code that sometimes fails to release a lock.
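As a concrete illustration, here is a minimal sketch of this constructor/destructor pattern, assuming a simplified LockAndHist that holds only an atomic lock word (the book's real class also records the acquisition-time histogram); the member names and memory orders here are illustrative, not the book's actual implementation:

```cpp
#include <atomic>

// Simplified stand-in for the book's LockAndHist: just the lock word,
// with no histogram (an assumption for this sketch).
struct LockAndHist {
  std::atomic<bool> held{false};
};

class SpinLock {
 public:
  explicit SpinLock(LockAndHist& lock) : lock_(lock) {
    // Spin: exchange() returns the previous value, so the loop exits
    // only when this thread flips held from false to true.
    while (lock_.held.exchange(true, std::memory_order_acquire)) {
      // busy-wait
    }
  }
  // The destructor runs on every exit from the enclosing block,
  // including early returns and exceptions, so the lock is always
  // released.
  ~SpinLock() { lock_.held.store(false, std::memory_order_release); }

  // Copying a lock guard would release the lock twice; forbid it.
  SpinLock(const SpinLock&) = delete;
  SpinLock& operator=(const SpinLock&) = delete;

 private:
  LockAndHist& lock_;
};
```

With this sketch, a block such as `{ SpinLock sp(some_lock_name); ... }` acquires the lock on entry and releases it on any exit, exactly as the pattern above requires.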
The spinlock implementation also defines a small histogram to record lock-acquisition times. This is another piece of designed-in observability. A common issue with software locks is that under some circumstances a thread has to wait much too long to acquire a lock, producing long transaction latency on one thread whenever another thread holds the lock too long. A small histogram of lock-acquire times for each lock tells you the normal time taken to acquire a contended lock and also shows how many much-longer acquisitions occur. If there are no long acquisition times for a given lock, then lock-waiting is not the cause of long transaction latency, and you can look elsewhere.
The locking pseudocode looks like this:
    start = __rdtsc()
    test-and-set loop to get lock
    stop = __rdtsc()
    elapsed_usec = (stop - start) / cyclesperusec
    hist[Floorlg(elapsed_usec)]++
where __rdtsc reads the x86 cycle counter and Floorlg(x) computes floor(log2(x)), returning 0..31 for a 32-bit unsigned int x. The variable hist is a small array of counts. Logarithm base 2 gives sufficient resolution to put long and short acquisition delays into different count buckets. The stats command mentioned earlier returns this histogram array.
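The bucketing step above can be sketched as follows. Floorlg, the 32-entry hist array, and RecordAcquire are illustrative reconstructions from the description, not the book's source:

```cpp
#include <cstdint>

// Assumed implementation of Floorlg: floor(log2(x)) for x > 0,
// and 0 for x == 0, so the result fits the buckets 0..31 for a
// 32-bit unsigned value.
int Floorlg(uint32_t x) {
  int lg = 0;
  while (x >>= 1) { ++lg; }
  return lg;
}

// Per-lock histogram of acquisition times: bucket i counts
// acquisitions that took [2^i, 2^(i+1)) microseconds.
uint32_t hist[32] = {0};

// Called after the timed test-and-set loop in the pseudocode above.
void RecordAcquire(uint32_t elapsed_usec) {
  hist[Floorlg(elapsed_usec)]++;
}
```

Because each bucket spans a power-of-two range, a 32-entry array covers everything from sub-microsecond acquisitions to multi-second stalls, while uncontended and badly delayed acquisitions land in clearly separated buckets.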