XML parsing of file lists

Many DC clients (and other software) have their own XML parser for parsing XML files and content. This means the parsers can be heavily specialized for performance (in the case of large file lists for instance) compared to just using a “standard parser” (i.e. one that has been used in multiple projects). However, a home-grown parser is also far more likely to contain errors, thereby increasing the risk that a malicious party (e.g. the one sending the file list) may try to remotely crash the receiver by sending malformed files. Beyond the obvious concern for network security, clients may incorrectly allow files to be read or read incorrect data within those files.

I have compiled a list of potential errors that a file list may have, and generated file lists for each of those occurrences. These file lists were then opened in DC++ (0.851) to observe what happened. This test should likely be repeated with all clients that don’t derive their XML parsing from DC++’s (i.e., all DC++ mods will likely follow the below pattern).

I created multiple file lists (downloadable here), based on this file, generated with this C# snippet. Here are the results (Microsoft Office Excel file).
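For readers who want to build a similar test set without the original snippet, here is a minimal Python sketch of the general idea: start from one well-formed file list and derive one broken variant per error type. This is not the C# snippet referred to above. The base list, the mutation names and the element/attribute names (`FileListing`, `Directory`, `File`, `Name`, `Size`, `TTH`) are assumptions modeled on the usual DC++ file list layout, and the six mutations are examples only, not the full list of errors that was tested.

```python
# Sketch only: BASE and MUTATIONS are illustrative assumptions,
# not the test set behind the Excel sheet.
BASE = (
    '<FileListing Version="1" CID="EXAMPLECID" Base="/" Generator="example">\n'
    '  <Directory Name="docs">\n'
    '    <File Name="a.txt" Size="12" TTH="EXAMPLETTH"/>\n'
    '  </Directory>\n'
    '</FileListing>\n'
)

# Each mutation takes the well-formed text and returns a broken version.
MUTATIONS = {
    # Invalid data inside an otherwise well-formed document
    "missing-size": lambda s: s.replace(' Size="12"', ''),
    "negative-size": lambda s: s.replace('Size="12"', 'Size="-12"'),
    "non-numeric-size": lambda s: s.replace('Size="12"', 'Size="abc"'),
    "missing-file-name": lambda s: s.replace(' Name="a.txt"', ''),
    # Documents that are no longer well-formed XML
    "unclosed-directory": lambda s: s.replace('  </Directory>\n', ''),
    "unclosed-root": lambda s: s.replace('</FileListing>\n', ''),
}


def make_variants(base=BASE):
    """Return a dict mapping mutation name -> broken file list text."""
    variants = {}
    for name, mutate in MUTATIONS.items():
        broken = mutate(base)
        if broken == base:
            # Guard against a mutation that silently did nothing.
            raise ValueError(f"mutation {name!r} did not change the input")
        variants[name] = broken
    return variants
```

Each returned string can be written to its own file (compressed and named the way the client under test expects) and then opened in that client.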

A summary of the results:

  • DC++ will parse invalid data (e.g. omission of data) and sometimes replace the faulty data with “something sensible”, although the replacement is wrong in almost all cases.
  • In most cases where the XML document itself is invalid, DC++ will ignore the affected sections or ignore the file altogether (this is good).
  • DC++ will not crash on invalid data.

Most of the issues found can be solved by performing an XML sanitization check before reading the document, by validating it against the XSD. DC++’s XML parser does not have any XSD validation, so this cannot be done at this point anyway. Should such validation be implemented, it will cause a performance hit, small or large depending on the source file list.
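To make the idea of a pre-read check concrete, here is a small Python sketch. It is not XSD validation (the Python standard library has no XSD validator) and it is not how DC++ works. It only checks well-formedness and then applies a few hand-written rules that a schema would normally express. The element and attribute names (`FileListing`, `Directory`, `File`, `Name`, `Size`, `TTH`) and the rules themselves are assumptions about the file list layout.

```python
import xml.etree.ElementTree as ET


def check_file_list(xml_text):
    """Return a list of problems; an empty list means all checks passed.

    Assumed layout: a <FileListing> root containing nested
    <Directory Name=...> elements and <File Name=... Size=... TTH=...> leaves.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Not well-formed at all: reject the whole document.
        return [f"not well-formed XML: {exc}"]

    if root.tag != "FileListing":
        return [f"unexpected root element <{root.tag}>"]

    problems = []

    def walk(node, path):
        for child in node:
            name = child.get("Name")
            if child.tag == "Directory":
                if not name:
                    problems.append(f"{path}: directory without a Name")
                walk(child, f"{path}{name or '?'}/")
            elif child.tag == "File":
                if not name:
                    problems.append(f"{path}: file without a Name")
                size = child.get("Size", "")
                # Size must be a non-negative decimal integer.
                if not (size.isascii() and size.isdigit()):
                    problems.append(f"{path}{name or '?'}: bad Size {size!r}")
                if not child.get("TTH"):
                    problems.append(f"{path}{name or '?'}: missing TTH")
            else:
                problems.append(f"{path}: unexpected element <{child.tag}>")

    walk(root, "/")
    return problems
```

A caller would reject (or at least flag) the list when the returned list is non-empty, instead of substituting “something sensible”. Note that the extra pass over the tree is exactly the kind of performance cost mentioned above, and that the Python documentation warns its XML modules are not secure against maliciously constructed data, so a real client would need a hardened parser underneath.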

While I didn’t test it, the XML parsing of version.xml and of any hublists will likely have the same issue(s) as mentioned above. At least we won’t crash DC++.

If someone has other software that they can test this with, please feel free to do so and let me know so I can update the Excel sheet. It’s also possible that the resulting files are named incorrectly (e.g. by not requiring a CID in the file name), so just run the snippet code.

(Note: The files in this post may have a file name such as “foo-zip.pdf”, and it is because the file is actually a zip file but this blog software couldn’t handle that, so just change the file-extension to the appropriate one.)

Addressing DC++’s service provider, SourceForge

There has been a lot of discussion regarding changes to SourceForge’s hosting practices [1][2][3]. SourceForge has done two things: created an opt-in “revenue program” and begun taking over old or non-updating (or even non-existent) projects.

The opt-in program is called DevShare and allows developers (project administrators) to receive revenue based on modified installers. FileZilla is one of the major projects that have done so. The modified installers embed additional programs, thereby acting as ad services. The developers can choose which type of ads/programs are suggested, although they cannot say exactly which may or may not show up. The developers do nothing extra to accommodate this feature. The difference, as noted by Ghacks.net, is that SourceForge will change the appearance of the download page to highlight the ad-specific installer whilst still having a link to the other one (albeit not as easy to see).

The DC++ administrators were sent an e-mail from SourceForge asking whether DC++ would opt in to the DevShare program. The DC++ administrators declined this offer, as the additional revenue was not needed for any basic operation and they felt it might violate the integrity of the installers. This was just as the DevShare program had been announced. No further action on this has been taken and no additional requests from SourceForge have been made.

The second part of SourceForge’s changes is the modification of old projects, or the complete takeover of such projects (or even creating them in the first place). This can be seen with e.g. GIMP. As long as DC++ does not become stale or otherwise inactive, this will not affect DC++.

All of this has caused us (the developers of DC++) to review our stance on SourceForge. Some facts before I continue:

  • SourceForge has hosted DC++ (and other DC related software) since its inception (i.e. for several years) without any problems in this area.
  • SourceForge provides stable code repositories and website resources. Although the speed of the SourceForge network may be questionable, it is able to withstand heavy DDoS attacks.
  • DC++ hosts the source code repository, file downloads and website resources on SourceForge.
  • There are other DC related projects that are also hosted on SourceForge.
  • DC++ is considered a “valued project” in that it has been featured as SourceForge’s project of the month and has received the DevShare offer. DC++ is also among the high-download projects at SourceForge.
  • DC++ will not be directly affected by DevShare as we have not accepted such an offer. (I must stress it is an opt-in offer.)
  • DC++ will not be directly affected by the abandoned-project changes as DC++ continues to be updated and will not qualify for such a change.
  • At least one browser plugin, uBlock, has started to block SourceForge as a whole, thereby potentially restricting users from accessing DC(++) resources.

So, in light of all of this, we have begun to look into other project repositories:

  • Launchpad – Already hosts other features for DC++, such as the bug tracker, but does not provide a sufficient code repository (Bazaar is near-dead), has somewhat cumbersome download capabilities and offers no true website support.
  • Github – No real website support. This is more suited for just the code repository than a full-on project repository. We are more likely to host the source code on Github and proxy that through another service.
  • Bitbucket – Restricts number of contributors, no website support. poy suggests strongly that we do not move to Bitbucket.
  • Google Code – Recently closed registration of new projects. (It lacked certain features anyway.)

There are other project repositories available, although none of us has experience with most of them.

It is important for us to move forward with this, so here is our plan forward:

  • Move the source code repositories, websites and download facilities (or at least parts of them) to our own hosting facilities. E.g., Rhodecode is being set up to address this for source code.
  • DC++ will continue to use SourceForge, at a minimum as its backup service provider. It is important to note that we have had a relatively pleasant experience with SourceForge – as project administrators.
  • We will continue to monitor any further development in SourceForge management and changes.

We welcome suggestions, both from SourceForge and others, on how we can move forward.

DC++ 0.851

A new security & stability update of DC++ has been released today.

There are no user-visible new features this time. Besides the latest OpenSSL security fixes and the further hardening of secure connections by disallowing weak cipher suites, this DC++ version largely focuses on mitigating malicious situations where DC++ can be used for distributed denial of service (DDoS) attacks while being logged in to certain malevolent NMDC hubs.

Please note that most, if not all, previous DC++ versions are affected by this problem; therefore this release is highly recommended for everyone still using any older DC++ version. Once all maintained NMDC hub software implements the mitigation for this problem, it is highly probable that many existing hubs will require this DC++ release as the minimum version to use.

If no critical issues are found, DC++ 0.851 should be marked as the new stable DC++ release within a short period of time.

For the complete list of changes in version 0.851, please explore the changelog.

DC++ 0.850

The first new DC++ release in the last nine months, version 0.850 further fixes and hardens security-related functions, notably to avoid all popular TLS exploits that have emerged since last April.

This release also contains stability and performance updates of various 3rd party libraries and improvements of the latest version of the compiler.

For the complete list of fixes and upgraded libraries, please explore the changelog items and the linked bug discussions.

DC++ 0.842

The first stable release of the 0.840 series of DC++ is out. Besides a few SSL encryption related and stability fixes this version largely focuses on implementing various features asked for or recommended by the user community through our feature tracker.

The changelog shows all the implemented new features and fixes.

DC++ 0.842 also provides protection against the infamous “Heartbleed” OpenSSL vulnerability. This security hole has existed in DC++ since version 0.799.

There’s a high chance that version 0.842 is the last mainstream DC++ release that supports Windows XP. Due to the still large userbase of the already unsupported operating system, security and major stability fixes are possible for a few more months using a separate branch targeting XP only. The update reminder system has been modified so that when a forthcoming version targeting Vista and later is released, XP users won’t see the notification dialog anymore.

From that time on people running Windows XP will see the update nag dialog only if there’s an update targeting their old OS. However, starting with version 0.840 every XP user gets a special reminder at startup about the EOS of DC++ in their operating system.
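The notification rules described above can be summarized in a few lines. The following Python sketch only restates that logic and is not DC++’s actual code. The `Update` type, the `supports_xp` flag and both function names are hypothetical, and the behavior for non-XP users (always notify) is an assumption, since the post only describes the XP case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Update:
    version: str
    supports_xp: bool  # hypothetical flag: does this release still run on XP?


def should_show_update_dialog(running_xp: bool, update: Optional[Update]) -> bool:
    """Decide whether the update notification dialog is shown."""
    if update is None:
        return False  # nothing newer is available
    if running_xp:
        # XP users are only notified about releases that target their OS.
        return update.supports_xp
    # Assumption: users on Vista and later are notified of any update.
    return True


def should_show_eos_reminder(running_xp: bool) -> bool:
    """From 0.840 on, every XP user gets the end-of-support reminder at startup."""
    return running_xp
```

For example, an XP user is not notified about a release with `supports_xp=False`, but still sees the startup reminder.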

Due to the nice new features and security fixes the upgrade is highly recommended.

DC++ 0.831

A new bug-fix service release of DC++ is out today, fixing the following problems introduced with version 0.830:

  • One of the bugs, marked as critical, prevents DC++ from responding to TTH searches on NMDC hubs.
  • Protocol command size limits that are too small can cause problems for hubs sending large user commands.
  • The newly introduced direct encrypted private message channels are getting disconnected after some idle time.

All the fixed problems exist in version 0.830 only; older versions are not affected. For users running DC++ 0.830 the upgrade is highly recommended.

DC++ 0.830

Today we marked the first version of the 0.83x series of DC++ as stable. The new release brings plenty of stability updates and introduces a new ADC feature to improve privacy.

The privacy improvement is actually an implementation of an ADC protocol extension called CCPM. Basically, it allows two peers to initiate an SSL encrypted direct connection channel for sending and receiving private messages.

Until now, all private messages in the DC network have gone through a hub where both users were logged in. While this method is great for controlling unwanted messages (spamming), it also makes it possible for the hub owner to spy on any private communications.

Enter CCPM, a feature that still needs a hub to initiate the direct encrypted connection, but only for the start. After the direct channel has been established, the messages go directly between the peers in an encrypted way. The channel initiation requires the two users to be logged on to a secure ADC hub (ADCS).

The whole discussion of the protocol features and CCPM implementation can be found here (the implementation details with screenshots start at this position of the thread). The built-in help of DC++ also describes the feature in the Private message window page and the available controlling options in the Certificates’ settings page (once updated, links will be added to the web version of the DC++ help, too).

The list of other fixes in version 0.830 speaks for itself yet again this time; explore the changelog items and the linked bug discussions in them for more information.