TSO-CC: Consistency directed cache coherence for TSO

HPCA 2014 Paper

Abstract — Traditional directory coherence protocols are designed for the strictest consistency model, sequential consistency (SC). When they are used for chip multiprocessors (CMPs) that support relaxed memory consistency models, such protocols turn out to be unnecessarily strict. Usually this comes at the cost of scalability (in terms of per core storage), which poses a problem with increasing number of cores in today’s CMPs, most of which no longer are sequentially consistent.

Because of the wide adoption of Total Store Order (TSO) and its variants in x86 and SPARC processors, and existing parallel programs written for these architectures, we propose TSO-CC, a cache coherence protocol for the TSO memory consistency model. TSO-CC does not track sharers, and instead relies on self-invalidation and detection of potential acquires using timestamps to satisfy the TSO memory consistency model lazily. Our results show that TSO-CC achieves average performance comparable to a MESI directory protocol, while TSO-CC’s storage overhead per cache line scales logarithmically with increasing core count.

State Transition Table

A document containing the detailed state transition table of the full TSO-CC protocol (with all optimizations) can be found here.