A Decade of TTH: Its Selection and Uncertain Future
February 28, 2013
NMDC and ADC rely on the Tiger Tree Hash (TTH) to identify files. DC requires a cryptographic hash function to avoid its earlier morass of pervasive similar, but not identical, files. A bare cryptographic hash primitive such as SHA-1 did not suffice, because files needed identification not only as a whole but also in separate parts, which allows reliable resuming, multi-source downloading, and per-segment integrity verification. (RevConnect's attempt at reliable multi-source downloading failed precisely because it could not rely on cryptographic hashes.)
Looking for inspiration from other P2P software, I found that BitTorrent used (and uses) piecewise SHA-1 with per-torrent segment sizes. Since the DC share model asks that the same hash function work across entire shares, this does not work. eDonkey2000 and eMule, with per-user shares similar to those of DC, resolved this with fixed, 9MB piecewise MD4, but this segment size scaled poorly, meant that fixing any corruption demanded at least 9MB of retransmission, and relied on the weak and soon-broken MD4. Gnutella, though, had found an elegant, scalable solution in TTH.
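To make the cost of a fixed segment size concrete, here is a minimal Python sketch of piecewise hashing. It is only an illustration, not code from any of the clients mentioned: the function names are hypothetical, and SHA-1 is used simply because it ships with Python's standard library.

```python
import hashlib


def piecewise_hashes(data: bytes, segment_size: int) -> list[bytes]:
    """Hash each fixed-size segment independently (SHA-1 chosen for illustration)."""
    return [
        hashlib.sha1(data[i:i + segment_size]).digest()
        for i in range(0, len(data), segment_size)
    ]


def corrupt_segments(received: bytes, expected: list[bytes], segment_size: int) -> list[int]:
    """Return the indices of segments whose hash differs from the expected one."""
    actual = piecewise_hashes(received, segment_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

A single flipped bit marks its whole segment as bad, so the segment size is also the smallest unit one can re-download. With a fixed 9MB segment, that is 9MB per repair regardless of how small the file or the damage is.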
This Tiger Tree Hash, which I thus copied from Gnutella, scales to both large and small files while depending on what was at the time a secure-looking Tiger hash function. It smoothly and adaptively sizes a hash tree while retaining interoperability between files of all sizes on a hub. By 2003, I had released BCDC++, which used TTH. However, the initial version of hash trees implemented by Gnutella and DC hashed leaf and internal tree nodes in the same way. This left it open to collisions, which were fixed by hashing leaf and internal nodes differently so that neither can be mistaken for the other. Both Gnutella and DC quickly adopted this fix, and DC has followed this second version of THEX to specify TTH for the last decade.
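The tree construction can be sketched in a few lines of Python. This is a rough illustration under stated assumptions rather than a conforming TTH implementation: Python's hashlib has no Tiger, so SHA-256 stands in for it, which means the output is not a real TTH value. The 1024-byte leaf segments and the 0x00 and 0x01 prefix bytes are how I understand the second version of THEX, and the function names are hypothetical.

```python
import hashlib

LEAF_PREFIX = b"\x00"      # prepended to raw file data before hashing a leaf
INTERNAL_PREFIX = b"\x01"  # prepended to two child hashes before hashing a parent
SEGMENT_SIZE = 1024        # leaf segment size in bytes (assumed from THEX)


def _h(data: bytes) -> bytes:
    # Stand-in primitive: real TTH uses Tiger, which hashlib does not provide.
    return hashlib.sha256(data).digest()


def tree_levels(data: bytes) -> list[list[bytes]]:
    """Return every level of the hash tree, from the leaves up to the root."""
    segments = [data[i:i + SEGMENT_SIZE] for i in range(0, len(data), SEGMENT_SIZE)]
    if not segments:
        segments = [b""]  # an empty file still has one (empty) leaf
    level = [_h(LEAF_PREFIX + s) for s in segments]
    levels = [level]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                parents.append(_h(INTERNAL_PREFIX + level[i] + level[i + 1]))
            else:
                parents.append(level[i])  # an unpaired node is promoted unchanged
        level = parents
        levels.append(level)
    return levels


def tree_root(data: bytes) -> bytes:
    """The root hash identifies the whole file."""
    return tree_levels(data)[-1][0]
```

Because every level hashes up to the same root, a client can keep whichever level suits the file size, coarse for large files and fine for small ones, and still interoperate with peers that kept a different level. The distinct prefixes ensure that a file whose contents happen to be two concatenated child hashes does not hash to the same value as their parent node.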
Though it has served DC well, TTH might soon need a replacement. The Tiger hash primitive underlying it is by now considered broken, owing to a combination of a practical 1-bit pseudocollision attack on all rounds, a similarly feasible full collision on all but 5 of its 24 rounds, and full, albeit theoretical, 24-round preimages (“Advanced Meet-in-the-Middle Preimage Attacks”, 2010, Guo et al.). If one can collide or find preimages of Tiger, one can also trivially collide or find preimages of TTH. We are therefore investigating alternative cryptographic hash primitives, focusing on SHA-2 and SHA-3, to which we might transition as Tiger looks increasingly insecure and collision-prone.