DC++ is 20 years old today

In the beginning there was NMDC, as its name says (Neo-Modus) a new way of file sharing. It was a quite good, if not revolutionary idea of its time but a bit clumsy and low-quality implementation of a business model that wanted to get revenue through displaying ads in its client software. NMDC could be used for sharing of files using a community hub capable of controlling direct file transfers between its online users and also relaying searches and instant messages. This system of direct file sharing built around online communities has quickly become a success at the end of the 90’s, despite its clumsiness and annoying limitations.

The early years

In the fall of 2001 one DC user, a secondary school teenager, thought he could easily make a much better, ad-free client for this network and that it would would be a fun project for him to improve his skills in C++ programming. So DC++, an open source replacement of the original Neo-Modus client has born, exactly 20 years ago this day. And the rest is history…

DC++ had rapidly become a success. Many users switched to it and enjoyed the new intuitive interface, the slick and fast look-and-feel, the new thoughtful functions like the ability of connecting multiple hubs in parallel. Neo-Modus had put out a new versions of its client as an answer, trying to amend the limitations of the original one but the effort was completely futile – by that time DC++ had already become the go-to client for the DC network.

As it happens with most open source development, with time, contributors appeared and helped to add their ideas and fix bugs in DC++. Many of them just came and went but some remained, giving more and more input and help for the original author to make DC++ better and better. Somehow, the changelog of DC++ preserved some of what that early development was like, it is a fun to read from the distance of so many years, especially for those who hadn’t been around DC that time.

But not all of those outside ideas and directions were accepted to DC++. Many people wanted to go to different ways and this can be easily done in open source; soon, there was no shortage of various forks of DC++, some existing just for the sake of a few additional functions while others went much further, to different directions adding complete set of new features and optimizations. But, with the exception of the few examples, most of them were still built around the code provided in DC++ as a base. Many forks were short-lived, having been abandoned within months or years but a few ones are still remained being developed or at least maintained these days.

These were the years when DC as a file sharing network flourished; public hubs with overall usercount in the hundred thousand magnitude and also a lot of smaller private communities.

On the pinnacle of file sharing

Once DC++ achieved the initial target of being a fast, full-featured, easy-to-use NMDC replacement, it was time to improve the initial system created by Neo-Modus. The protocol (1), (2), connections, file transfers were insecure, especially the latter; file identification and corruption problems were an everyday thing in DC. For example, files were identified by their names and sizes only so searches for other sources for the same file many times came up with another file of the same size, resulting a corrupted download.

This needed to be fixed and the fix came in the form of Tiger Tree Hashes that allowed the files to be properly identified, searched and verified after download so no corrupted or arbitrary content would arrive anymore to your computer. It’s still the same today; it comes with the need of hashing files before sharing, but it provides the ultimate safety and integrity. Some users and forks hated hashing and stayed behind – eventually, DC++ has become incompatible with these old clients and their stubborn users.

Interesting part of the story is that before the old ways of transfers without hash check is finally removed in 2006, the team has released DC++ v0.674, a version that’s become quite popular among large group of DC users – so much that even today it is still the most widely used old version of DC++ among those stubborn people mentioned above. Yes, this version was moderately stable at the time, an end result of an era in the development of DC++, still compatible with the old hashless ways. And since big changes were coming in the forthcoming releases, this one remained known as “the best” and “working” DC client for many. Nevertheless, DC++ 0.674 has soon become less and less secure and by today plenthora of vulnerabilities has been discovered in it. Also, being developed on a different era with the tools of the time, it isn’t that stable running on modern Windows versions, either. Our favorite support requests are when people demand to fix these instability issues on a 10+ year old version of the program when even most of the tools that used to build DC++ back then aren’t working anymore on operating systems of today. Of course the fix is available long time ago, only a version upgrade away.

Still leading the way to be secure

In the meantime, DC’s decline started to happen as in the middle of the 2000’s torrents became popular. The development of the Internet as a whole and the way torrents work fitted better for many file sharing users. In torrents, related group of files were bundled and client software were easier to set up and use, community members not needed to be online with a client software anymore to communicate with each other as messages were persistent on the web. IRC could be set up and used for those who missed instant messaging so this was a suitable replacement of earlier file sharing methods for many.

Yet the author of DC++ had his next big thing to realize. A complete change of the old commmunication protocol of DC, inherited from Neo-Modus, to a brand new one that is professionally designed, defined and documented; a standard protocol that is secure, aims to fix the design issues of the old one and is extensible with features, most notably with support of secure encrypted connections. The new protocol was named Advanced Direct Connect (ADC) and the first draft has been released in 2006. In parallel, with the help of many contributors, elements of the new protocol had been started to built into DC++ and also into its forks.

Thanks to ADC, by the end of the first decade of the new millenium Direct Connect was ready for the change to become a fully standardized file sharing system with safe and secure encrypted communications. Yet ADC has never taken off, really. Partly because it has came too late and the focus of file sharing has already moved elsewhere, partly because the reluctance of members of the DC network: key hub software developers, hub owners and hub list server maintainers. Many new ADC hubsoftware started to appear, written from scratch, some were just hobby projects while others showed promise and were high quality software. Since the DC network was reluctant to adapt to ADC, most of the new hub software were abandoned soon, and by now only a few that are still maintained. ADC has become popular only within small private DC communities due to its security and advanced integrity.

From development to maintenance

By 2008, DC++ had completely switched to free, open source build tools and libraries, not to rely on closed products of big tech companies. Meanwhile, inputs from the original author of DC++ started to phase out and eventually completely stopped. Under the control of a new leading developer DC++ had started to catch up with other DC clients in user-friendliness: new graphical UI elements, modern look-and-feel, easier setup and complete documentation of UI elements and functions, plenty of new functions like automated connectivity setup, secure encrypted private messages between users and so on.

And then, after a few years, the constant development that had characterized DC++ in its first 12 years of existence, just ended abruptly. In the following years DC++ had been slowly switched into maintenance mode, with mostly essential security and small bug fixes added to each release. Some other DC clients are still improving – changing and adding features to DC in their own ways but, at least to this point, remaining mostly compatible with DC++.

And this is where we are at today, 20 years after the start.

These above just semi-randomly picked important parts of the whole story. There were ups and downs, problems and solutions, you can find many more piece of the puzzle (mostly the technical aspects) throughout this blog. But the things mentioned here today are enough to show that key people created and worked on DC++ had been the most influential ones on the development of the DC network, at least in the best part of the last two decades. And while by now others shaping DC, almost everything is still based on the work of the people who have been in and around DC++ in these years.

And all the contributors to DC++, both ones who realized plenty of big ideas and ones with just small additions, they’ve done it mostly for having fun and to learn new things, improve themselves. They were many – you can find all the names preserved in the About box of DC++.

DC++ is still somewhat popular these days, around 10k people still interested on it in a course of a month. The program is still maintained, albeit in a slower speed and no ambitious feature updates in the plans. People remained with the project want to provide the safety, stability and compatibility and want to make sure that DC++ at least remains viable for some use cases. Hopefully, this will help users to keep having fun using DC++ for many more years.

Happy birthday DC++ and keep on sharing!

Mixed-hash DC hubs

They work fine if clients and hubs support both TTH and its successor adequately long.

While transitioning to a TTH successor, currently interoperable clients and hubs all supporting only TTH will diverge. In examining the consequences of such diversity, one can partition concerns into client-hub communication irrelevant to other clients; hub-mediated communication between two clients; and direct client-client communication. In each case, one can look at scenarios with complete, partial, and no supported hash function overlap. Complete overlap defines the all-TTH status quo and, clearly, works without complication for all forms of DC communication, so this post focuses on the remaining situations. In general,

Almost as straightforwardly, ADC but not NMDC client-hub communication irrelevant to other clients requires partial but not complete hash function overlap but only between each individual client/hub pair, and don’t create specific mixed-hash hub problems; otherwise, an ADC hub indicates STA error code 47. For ADC, This category consists of GPA, PAS, PID/CID negotiation (with length caveats as relate to other clients interpreting the resulting CID), and the establishment of a session hash function; NMDC does not depend on hashing at all for analogous functionality. Thus, for NMDC, no problems occur here. ADC’s greater usage of hashing requires correspondingly more care.

Specifically, GPA and PAS require that SUP had established some shared hash function between the client logging in and the hub, but otherwise have no bearing on mixed-hash-function DC hubs. Deriving the CID from the PID involves the session hash algorithm, which as with GPA/PAS merely requires partial hash function support overlap between each separate client and a hub. Length concerns do exist here, but become relevant only with hub-mediated communication between two clients.

Indeed, clients communicating via a hub comprise the bulk of DC client-hub communication. Of these, INF, SCH, and RES directly involve hashed content or CIDs. SCH ($Search) allows one to search by TTH and would also allow one to search by TTH’s successor. Such searches can only return results from clients which support the hash in question, so as before, partial overlap between clients works adequately. However, to avoid incentivizing clients which support both TTH and its successor to broadcast both searches and double auto-search bandwidth, a combined search method containing both hashes might prove useful. Similarly, RES specifies that clients must provide the session hash of their file, but also “are encouraged to supply additional fields if available”, which might include non-session hash functions they happen to support, such that as with the first client-hub communication category, partial hash function support overlap between any pair of clients suffices, but no overlap does not.

A more subtle and ADC-specific issue issue arises via RES’s U-type message header and INF’s ID field whereby ADC software commonly checks for exactly 39-byte CIDs. While clients need not support whatever specific hash algorithm produced a CID, the ADC specification requires that they support variable-length CIDs. Example of other hash function output lengths which, minimally, should be supported include:

Bits Bytes Bytes (base32) Supporting Hashes
192 24 39 Tiger
224 28 45 Skein, Keccak, other SHA-3 finalists, SHA-2
256 32 52 Skein, Keccak, other SHA-3 finalists, SHA-2
384 48 77 Skein, Keccak, other SHA-3 finalists, SHA-2
512 64 103 Skein, Keccak, other SHA-3 finalists, SHA-2

Finally, direct client-client communications introduces CSUP ($Supports), GET/GFI/SND ($Get/$Send) via the TTH/ share root or its successor, and filelists, all of which work if and only if partial hash function support overlap exists. CSUP otherwise fails with error code 54 and some subset of hash roots and hash trees regarding some filelist must be mutually understood, so as with the other cases, partial but not complete hash function support overlap between any given pair of clients is required.

Encouragingly, since together client-hub communication irrelevant to other clients; hub-mediated communication between two clients; and direct client-client communication cover all DC communication, partial hash function support overlap between any given pair of DC clients or servers suffices to ensure that all clients might fully functionally interact with each other. This results in a smooth, usable transition period for both NMDC and ADC so long as clients and hubs only drop TTH support once its successor becomes sufficiently ubiquitous. Further, relative to ADC, poy has observed that “all the hash function changes on NMDC is the file list (already a new, amendable format) and searches (an extension) so a protocol freeze shouldn’t matter there”, which creates an even easier transition than ADC in NMDC.

In service of such an outcome, I suggest two parallel sets of recommendations, one whenever convenient and the other closer to a decision on a TTH replacement. More short-term:

  • Ensure ADC software obeys “Clients must be prepared to handle CIDs of varying lengths.”
  • Create an ADC mechanism by which clients supporting both TTH and its successor can search via both without doubling (broadcast) search traffic. Otherwise, malincentives propagate.
  • Ensure BLOM scales to multiple hash functions.
  • Update phrasing in ADC specification to clarify that all known hashes for a file should be included in RES, not just session hash.

As the  choice of TTH’s successor approaches:

  • Disallow new hash function from being 192 bits to avoid ambiguity with Tiger or TTH hashes. I suggest 224 or 256-bit output; SHA-2 and all SHA-3 finalists (including Keccak and Skein) offer both sizes.
  • Pick either a single filelist with all supported hashes or multiple filelists, each of which only supports one hash. I favor the former; it especially helps during a transition period for even a client downloading via TTH’s successor to be able to autosearch and otherwise interact with clients which don’t yet support the new hash function, without needing to download an entire new filelist.
  • Barring a more dramatic break in Tiger than thus far seen, clients should retain TIGR support until the majority of ADC hubs and NMDC or ADC clients offer support for the successor hash function’s extension.

By doing so, clients both supporting only TTH and both TTH and new hash function should be capable of interacting without problems, transparently to end-users, while over time creating a critical mass of new hash function-supporting clients such that eventually client and hub software might outright drop Tiger and TTH support.

A Decade of TTH: Its Selection and Uncertain Future

NMDC and ADC rely on the Tiger Tree Hash to identify files. DC requires a cryptographic hash function to avoid the previous morass of pervasive similar, but not identical, files. A bare cryptographic hash primitive such as SHA-1 did not suffice because not only did the files need identification as a whole but in separate parts, allowing reliable resuming and multi-source downloading, and per-segment integrity verification (RevConnect unsuccessfully attempted to reliably use multi-source downloading precisely because it could not rely on cryptographic hashes).

Looking for inspiration from other P2P software, I found that BitTorrent used (and uses) piecewise SHA-1 with per-torrent segment sizes. Since the DC share model asks that same hash function work across entire shares, this does not work. eDonkey2000 and eMule, with per-user shares similar to those of DC, resolved this with fixed, 9MB piecewise MD4, but this segment size scaled poorly, ensured that fixing corruption demanded at least 9MB of retransmission, and used the weak and soon-broken MD4. Gnutella, though, had found an elegant, scalable solution in TTH.

This Tiger Tree hash, which I thus copied from Gnutella, scales to both large and small files while depending on what was at the time a secure-looking Tiger hash function. It smoothly, adaptively sizes a hash tree while retaining interoperability between all such sizes of files files on a hub. By 2003, I had released BCDC++ which used TTH. However, the initial version of hash trees implemented by Gnutella and DC used the same hash primitive for leaf and internal tree nodes. This left it open to collisions, fixed by using different leaf and internal hash primitives. Both Gnutella and DC quickly adopted this fix and DC has followed this second version of THEX to specify TTH for the last decade.

Though it has served DC well, TTH might soon need a replacement. The Tiger hash primitive underlying it by now lists as broken due to a combination of a practical 1-bit pseudocollision attack on all rounds, a similarly feasible full collision on all but 5 of its 24 rounds, and full, albeit theoretical, 24-round pre-images (“Advanced Meet-in-the-Middle Preimage Attacks”, 2010, Guo et al). If one can collide or find preimages of Tiger, one can also trivially collide or find preimages of TTH. We are therefore investigating alternative cryptographic hash primitives to which we might transition as Tiger looks increasingly insecure and collision-prone, focusing on SHA-2 and SHA-3.

How to crash DC++ 0.674

$ADCGET list //// 0 -1 ZL1|

A previous blog post mentions this, but apparently isn’t sufficiently explicit about what to send.

I aim to fix that.

Enjoy, all. This apparently works on DC++ clients older than 0.707 which still support $ADCGET.

DC++ 0.75 and older vulnerable to bzip2 filelist bomb

DC++ 0.75 and earlier can be remotely crashed via either bzip2 filelists (example filelist) or hublists (themselves compressed with bzip2). Such list downloads can be automatically triggered by automatic searches for alternate sources, so explicit user action is unnecessary [1]. Not every client seems to crash; the precise dependence on operating system or other factors remains unclear. However, crashes have been observed using both Windows XP and Windows 7.

As before, updating network-facing software remains important. Equally importantly, DC++ mod authors should attempt to update in a timely manner such as to avoid exposing their users to this bug.

[1] To catch a large number of clients in a hub in a relatively short period of time with no manual intervention, listen for searches (especially TTH searches) and always respond positively, such that clients try autosearching for alternate sources. Other tricks are possible as well, of course.

DC++ 0.75+1 Removes 35 Character Nick Limit

This commit by ullner, coming in the next DC++ release, causes some popular hubs perhaps unexpected problems.

Hub administrators should probably look into updating their hub software.

Enjoy, all.

DC++ Remote Crash/Exploit Disclosure

In the spirit of public disclosure to encourage users to move to recent versions and to encourage mod developers to fix their code, this post announces a well-known but not publicly announced somewhat recent remote DC++ exploit.

The DC++ NULL Pointer Remote Denial of Service Vulnerability involves sending an $ADCGET command such as “$ADCGET (%S) //+ 0 %-1 ZL1” to the other client along a client-client connection, which will promptly and reliably crash the latter client. This affects all recent versions until 0.707, so unless you’re running one of 0.707, 0.7091, 0.750, or a more recent development-snapshot, you’re probably vulnerable to this remote crash.

Furthermore, one doesn’t have to manually connect to another client for this crash to occur; a connection triggered by autosearch/add-queue is sufficient. Alternatively, one doesn’t even have to rely on that but can instead just send $(Rev)ConnectToMe commands to other clients to create client-client connections in the manner of some client-detection mods and systematically crash an entire hub of DC++-pre-0.707 users by connecting to them and sending them the example poison command.

This exploit is one example of why as with all network-facing software, one should keep DC++ updated.

Securing the unsecure ?!

Seasons greetings everyone from us at DCDev

Wanted to rant about private only scripts and the idiocy usage of em since a script really cant determine if its public or private.

Private only scripts are mainly used in Ptokax / Verlihub or any soft using lua scripting interface.

So what does the tag stand for in DC++

1/1/1

first digit is unregistered usually determined as public hub but all it really is is unregistered.

second digit is registered here is where most people thing it means private and hidden away in the deep corners of the direct connect network (it kinda amuses me) since it could be a public vip or just an account on hub that you registered yourself at.

Third and last is of course the operator digit this really doesn’t show if its private or public either.

and using hublist to check if same user is online at other hubs on NMDC is pretty dumb too since, It might be an shared connection.

What if the person has a brother or sister that uses DC++ and likes to be  on a public hub to talk about nothing and everything.

plus it really cracks me up that we are talking about securing private hubs that are communicating via cleartext protocol :D using NO ENCRYPTION thats really funny how secure is the hub in that regard let me tell you’ll.

Its really just a waste so in the holiday seasons i leave you with the words of wisdom your only as secure as your own box (hub).

P.S for all those people making these scripts for god sake do it right and put in a 60-90 second delay before kicking the poor sap since there can be a slight delay in updating myinfo on NMDC and for that matter ADC since i know the idiocy is gonna continue.


I help those who help themselves! – Zoidberg (Futurama)

DC++ pointing out the corrupted

One of the latest enhancements in DC++ is the hub referral on client-client connections, proposed by Jan Vidar Krey. The current bazaar trunk implements this mechanism and the next DC++ version that will be released soon will also have it. The purpose of this extension is to point out the corrupted hub that is sending the current client to a non DC client, with obvious malevolent purpose. This implies that the hub is either using exploitable software, or that it’s intentionally abusing the clients. Either way, the hubowners are solely responsible.

On connecting to the other party, DC++ will also send the hub URL that it used to connect to the hub sending out the CTM message. By packet inspection, an attacked party can figure out which is the corrupted hub (only a pointer is required, such that they have a point of reference ) . Another good part about this extension is that it works on both ADC and NMDC ( some workaround was found for NMDC: adding the url to the PK string since NMDC is not extensible nor flexible in this matter ) , with the least effort from the clients and it does not bother them in any way. A normal client should ignore the specific message ( I don’t find any particular usage for it ).

We strongly recommend all mods to inherit this extension and other clients out there to implement it so the CTM attacks impact on DC software will stop being so great.

DC++ CTM Proof

In a previous post I was wondering if the whole concept of centralization is obsolete or has major flaws. The problem that is bothering everybody in the last years is that clients can be used ( unwillingly ) as tools in distributed denial of service attacks. Jan Vidar Krey is proposing a hub refferal on c-c connections that can point to a source of CTM attacks via the messages that the client sends on first connection attempt. In this case an attacked entity can see the hub with problems/intentional flooding that is causing the attacks.

As a first step to prevent this kind of abuses in DC++, poy added a static IP protection for the major hublists that were attacked via the client. This kind of measure is just temporary since hublists can change IP anytime and it protects only them, not everybody else that can be attacked ( Also the fun part is that the hublist server is actually running a DC client and wants to download from other users, it can’t ! ) .  A second step was to dynamically resolve the hublist ip’s and block them for c-c connections.

The main idea that I considered is to practically check all the users on a specific hub to see if they actually are real. On CTM receive the client should not connect but send another CTM to see if that IP actually connects to them . This will make sure that the user is the actual owner of that specific IP address. Of course the biggest problem is if the user is passive, in which case it can’t send a CTM back. This could be against the protocol principles but it’s a solution to see if the other peer really exists. I don’t know if a RCM would do something good in this situation but it’s a start.

Another thing that should be done ( if not implemented already ) is that on c-c connections if the first attempt was unsuccessful then no further attempts should be done until the user at least reconnects or changes state ( passive/active ). Also the hubs should be trustworthy. In a previous post I suggested a way to make hubs trustful via a CA authority system, but most people were quite reticent about it. Perhaps this could be the only way to make hubs trustful. Warning messages will not help too much ( Strong DC implements such messages ) since most of the users either don’t read them or don’t care. We shouldn’t let users question this problem, but solve it for them. Continuous problems from the Direct Connect network might be a cause to mark DC software ( and DC++ ) as badware, which will definitely take down the network. It’s time to do something about it.

I’m hoping for more ideas how to make DC++ proof against CTM abuses and I’m waiting for opinions from you as well.