DC++ 0.862

DC++ 0.862 was released today. Apart from some library updates, it notably fixes an issue with the default automatic connectivity setup: automatic detection did not work in certain cases where no automatic port mapper could be found, in which case the final choice should be to settle on Passive mode.

For those who make use of the automatic connectivity setup (which should be the vast majority of users), the upgrade is highly recommended.

DC++ 0.861

The first new DC++ release in more than a year, version 0.861, brings plenty of enhancements and security updates. The following is a list of key fixes and improvements over version 0.851:

  • As in the previous major release, version 0.850, there are new functions that have been requested by users through the bug tracker. These include an option to autostart DC++ when Windows starts, quick-checking of hubs with encrypted connections in the Search frame, search capability in the Notepad window, a hub connectivity status icon in the public hub list, and a text encoding setting for favorite NMDC hubs.
  • We’ve improved Windows 10 compatibility by fixing a visual bug in the chat and updating the UPnP mapper. The latter may fix reported issues with automatic connectivity setup under Windows 10.
  • Added an icon toolbar to the Download queue to make the control of the downloads and priorities easier.
  • Fixed security issues related to OpenSSL and also problems with keyprint validation and secure transfers.
  • As any program that displays clickable links from outside sources should, DC++ now introduces a whitelist of URIs that it allows to be opened directly without a user prompt. This means that a confirmation dialog will appear before opening any type of link that is not whitelisted. This prevents accidental launching of any 3rd party software that is registered to certain URIs in the system and might be used to exploit existing vulnerabilities or execute arbitrary code. The URI whitelist is freely configurable in the settings dialog. We’d like to thank Kacper Rybczynski for pointing out this issue and for working with us to help protect DC++ users.
  • There’s a new structure for manual connectivity settings and lots of new options available to fine-tune IPv6 connectivity. The automatic connectivity setup now enables IPv6 connectivity if the bound network interface is assigned a public IPv6 address. Note that all parts of the IPv6 connectivity are in an early beta stage and prone to failures, and that v6 connections are only supported to ADC hubs and between ADC hub users.
  • With version 0.860, DC++ has ended Windows XP support and requires Windows Vista as a minimum Windows version to run. This has enabled a lot of cleanup in the code, which also results in performance improvements.
  • Version 0.861 introduces more significant performance improvements by being compiled with the latest MinGW technology as well as by requiring SSE2 CPU support. The latter brings an extra performance boost to 32-bit builds of DC++ in several areas, notably in the speed of hashing, download queue matching and responding to searches. This also means that DC++ now requires an Intel Pentium 4 / AMD Athlon 64 or newer processor.
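
The URI whitelist described in the list above can be sketched as follows. This is a minimal Python illustration, not DC++’s implementation (DC++ itself is written in C++); the function name, the default whitelist contents, and the choice to match on the URI scheme alone are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical default whitelist. DC++'s real defaults are configurable in
# its settings dialog and are not reproduced here.
DEFAULT_WHITELIST = frozenset({"http", "https", "magnet", "dchub", "adc", "adcs"})


def needs_confirmation(uri, whitelist=DEFAULT_WHITELIST):
    """Return True when opening `uri` should first prompt the user."""
    scheme = urlsplit(uri).scheme.lower()
    # A missing scheme is treated as untrusted as well.
    return scheme == "" or scheme not in whitelist
```

A link such as steam://run/1 would trigger the confirmation dialog under this sketch, while an https link would open directly.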

The complete list of changes, with links to the discussions in the bug tracker, is available here. Due to the nature of the fixes, an immediate upgrade from earlier versions of DC++ is highly recommended.

DC++ Will Require SSE2

The next version of DC++ will require SSE2 CPU support.

This represents no change for the 64-bit builds since x86-64 includes SSE2. The last widely used CPUs affected, those lacking SSE2 support, are Athlon XPs, the last of which were released in 2004. As such, not just DC++ but Firefox 49, Chrome on both Windows and Linux since 2014, IE 11 since 2013, and Windows 8 since 2012 all require SSE2. Empirically, Firefox developers found that just 0.4% of their users lacked SSE2 as of this May, and Chrome developers measured 0.33% of their Windows stable population lacking SSE2 in 2014. This suggests that, to the extent that not requiring SSE2 imposes non-negligible development or runtime cost, one might find increasingly thin support for avoiding it.

A straightforward advantage SSE2 provides derives from non-SIMD 32-bit x86 supporting only arguably between 6 and 8 general-purpose 32-bit registers. SSE2 in 32-bit environments adds 8 additional registers, substantially increasing x86’s architecturally named registers.

Furthermore, these additional registers in 32-bit x86 are 128-bit, allowing 64-bit and 128-bit memory moves in single instructions, rather than multiple 32-bit mov instructions, which also enables each reg/mem move to more efficiently align on larger boundaries. Similarly, access to 64-bit arithmetic and comparisons on x86 allows native handling of all those 64-bit arithmetic, logic, and comparison operations which show up both in the Tiger hash code (designed for 64-bit CPUs, and it shows) and in the 64-bit file position handling pervasive in DC++.

Finally, there’s substantial use of 2-wide SIMD, especially when common patterns such as

foo += bar;
baz += foobar;

appear, which can be handled via SSE2 packed integer addition (e.g., paddq), or when

foo -= bar;
baz -= foobar;

appear, which can use packed integer subtraction (e.g., psubq).

Putting all this together in one of the more dramatic improvements in generated code quality as a result of this change, one can watch as enabling SSE2 automatically transforms part of TigerHash::update(…) from:

193:dcpp/TigerHash.cpp **** 	}
movl	168(%esp), %edi	 # %sfp, x7
movl	172(%esp), %ebp	 # %sfp, x7
movl	440(%esp), %ebx	 # %sfp, x1
movl	444(%esp), %esi	 # %sfp, x1
movl	%edi, %eax	 # x7, tmp2058
movl	412(%esp), %edx	 # %sfp, x0
xorl	$-1515870811, %eax	 #, tmp2058
movl	%eax, 488(%esp)	 # tmp2058, %sfp
movl	%ebp, %eax	 # x7, tmp2059
movl	%ebx, %ecx	 # x1, tmp2062
xorl	$-1515870811, %eax	 #, tmp2059
movl	%esi, %ebx	 # x1, tmp2063
movl	156(%esp), %esi	 # %sfp, x2
movl	%eax, 492(%esp)	 # tmp2059, %sfp
movl	408(%esp), %eax	 # %sfp, x0
subl	488(%esp), %eax	 # %sfp, x0
sbbl	492(%esp), %edx	 # %sfp, x0
xorl	%eax, %ecx	 # x0, tmp2062
movl	%ecx, 384(%esp)	 # tmp2062, %sfp
xorl	%edx, %ebx	 # x0, tmp2063
movl	384(%esp), %edi	 # %sfp, x1
movl	%ebx, 388(%esp)	 # tmp2063, %sfp
movl	152(%esp), %ebx	 # %sfp, x2
movl	388(%esp), %ebp	 # %sfp, x1
movl	%edi, %ecx	 # x1, tmp2066
notl	%ecx	 # tmp2066
addl	%edi, %ebx	 # x1, x2
movl	%ecx, 496(%esp)	 # tmp2066, %sfp
movl	%ebp, %ecx	 # x1, tmp2067
adcl	%ebp, %esi	 # x1, x2
notl	%ecx	 # tmp2067
movl	%ebx, (%esp)	 # x2, %sfp
movl	%ecx, 500(%esp)	 # tmp2067, %sfp
movl	496(%esp), %ecx	 # %sfp, tmp1093
movl	%esi, 4(%esp)	 # x2, %sfp
movl	500(%esp), %ebx	 # %sfp,
movl	(%esp), %esi	 # %sfp, x2
movl	4(%esp), %edi	 # %sfp,
shldl	$19, %ecx, %ebx	 #, tmp1093,
movl	%esi, %ebp	 # x2, tmp2069
movl	460(%esp), %esi	 # %sfp, x3
sall	$19, %ecx	 #, tmp1093
xorl	%edi, %ebx	 #, tmp2070
xorl	%ecx, %ebp	 # tmp1093, tmp2069
movl	%ebp, 504(%esp)	 # tmp2069, %sfp
movl	%ebx, 508(%esp)	 # tmp2070, %sfp
movl	456(%esp), %ebx	 # %sfp, x3
subl	504(%esp), %ebx	 # %sfp, x3
sbbl	508(%esp), %esi	 # %sfp, x3
movl	%ebx, %edi	 # x3, x3

to something of comparative beauty:

193:dcpp/TigerHash.cpp **** 	}
movl	80(%esp), %eax	 # %sfp, tmp1091
movl	84(%esp), %edx	 # %sfp,
xorl	$-1515870811, %eax	 #, tmp1091
xorl	$-1515870811, %edx	 #,
movd	%eax, %xmm0	 # tmp1091, tmp1885
movd	%edx, %xmm1	 #, tmp1886
punpckldq	%xmm1, %xmm0	 # tmp1886, tmp1885
psubq	%xmm0, %xmm7	 # tmp1885, x0
movdqa	96(%esp), %xmm1	 # %sfp, tmp2253
pxor	%xmm7, %xmm1	 # x0, tmp2253
movdqa	%xmm1, %xmm0	 # x1, tmp1843
psrlq	$32, %xmm0	 #, tmp1843
movd	%xmm1, %edx	 # tmp21, tmp2105
notl	%edx	 # tmp2105
movd	%xmm0, %eax	 #, tmp2106
notl	%eax	 # tmp2106
paddq	%xmm1, %xmm6	 # x1, x2
movl	%edx, 192(%esp)	 # tmp2105, %sfp
movdqa	%xmm1, %xmm3	 # tmp2253, x1
movl	%eax, 196(%esp)	 # tmp2106, %sfp
movl	192(%esp), %eax	 # %sfp, tmp1093
movl	196(%esp), %edx	 # %sfp,
shldl	$19, %eax, %edx	 #, tmp1093,
sall	$19, %eax	 #, tmp1093
movd	%edx, %xmm1	 #, tmp1888
movd	%eax, %xmm0	 # tmp1093, tmp1887
punpckldq	%xmm1, %xmm0	 # tmp1888, tmp1887
pxor	%xmm6, %xmm0	 # x2, tmp1094
psubq	%xmm0, %xmm5	 # tmp1094, tmp2630

The non-SSE version is littered with register spills and fills: from %eax to 492(%esp) and back to %edx three instructions later, to enable %eax to be reused; from %ecx to 500(%esp) and back to %ebx in another three instructions, to enable 496(%esp) to be left-shifted a few instructions later; and between %edi, %ecx, and that same 496(%esp) because, evidently, there’s not enough space to store both %ecx and notl %ecx simultaneously with a half-dozen GPRs.

With SSE2, virtually no spills/fills remain because there are now ample registers; the movdqa from 96(%esp) to %xmm1 replaces multiple 32-bit movl instructions; the ugly addl/adcl and subl/sbbl pairs emulating 64-bit addition and subtraction using 32-bit arithmetic give way to native 64-bit arithmetic; and pairs of 32-bit xorl instructions can become a single pxor.
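
The addl/adcl and subl/sbbl pairs can be illustrated at the source level. The following Python sketch (the function names are made up for this example) splits 64-bit values into 32-bit halves and propagates the carry or borrow by hand, which is roughly the work the non-SSE2 code has to do for every 64-bit addition or subtraction. Inputs are assumed to be unsigned 64-bit values.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def add64_via_32(a, b):
    """64-bit wrapping add built from two 32-bit adds (addl, then adcl)."""
    lo = (a & MASK32) + (b & MASK32)
    carry = lo >> 32                      # carry out of the low half
    hi = (a >> 32) + (b >> 32) + carry
    return ((hi & MASK32) << 32) | (lo & MASK32)


def sub64_via_32(a, b):
    """64-bit wrapping subtract built from two 32-bit subtracts (subl, then sbbl)."""
    lo = (a & MASK32) - (b & MASK32)
    borrow = 1 if lo < 0 else 0           # borrow out of the low half
    hi = (a >> 32) - (b >> 32) - borrow
    return ((hi & MASK32) << 32) | (lo & MASK32)


# Both helpers agree with plain 64-bit wraparound arithmetic.
a, b = 0x0000000100000000, 0x00000000FFFFFFFF
assert add64_via_32(a, b) == (a + b) & MASK64
assert sub64_via_32(a, b) == (a - b) & MASK64
```

With paddq and psubq, each of these two-step sequences collapses into one instruction.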

While TigerHash.cpp especially shows off SSE2’s advantage over i686-generation 32-bit x86, each of these improvements appears sprinkled in thousands of places around DC++: in function prologues, every time certain Boost template functions show up, every time __builtin_memcpy is called, and in dozens of other mundane yet common situations.
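
For orientation: the constant -1515870811 (0xA5A5A5A5 read as a signed 32-bit value), the bitwise NOT, and the shift by 19 in the listings above match the opening steps of the key schedule in the Tiger reference implementation. The sketch below writes those four steps in Python, masking to 64 bits to mimic unsigned wraparound. It is a reading aid rather than a copy of dcpp/TigerHash.cpp, and the function name is made up.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def key_schedule_head(x):
    """Apply the first four Tiger key-schedule steps to an 8-word block.

    `x` is a sequence of eight unsigned 64-bit integers; a new list is returned.
    """
    x = list(x)
    x[0] = (x[0] - (x[7] ^ 0xA5A5A5A5A5A5A5A5)) & MASK64   # subl/sbbl or psubq
    x[1] ^= x[0]                                           # xorl pair or pxor
    x[2] = (x[2] + x[1]) & MASK64                          # addl/adcl or paddq
    not_x1 = ~x[1] & MASK64                                # notl on each half
    shifted = (not_x1 << 19) & MASK64                      # shldl/sall by 19
    x[3] = (x[3] - (x[2] ^ shifted)) & MASK64              # subl/sbbl or psubq
    return x
```

Each line is a single 64-bit operation in the source, which is why the 32-bit-only build needs two instructions plus carry handling for nearly every step.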

Setting up multiple-subdomain HTTPS with nginx, acme-tiny, and Let’s Encrypt

This guide briefly describes aspects of setting up nginx and acme-tiny to automatically register and renew multiple subdomains.

acme-tiny (Debian, Ubuntu, Arch, OpenBSD, FreeBSD, and Python Package Index) provides a more verifiable and more easily customizable alternative to the default Let’s Encrypt client. This proves especially useful in less mainstream contexts where the main client either works magically or fails magically, but tends to offer little between those two outcomes.

The first step is to create a multidomain CSR which informs Let’s Encrypt of which domains it should provide certificates for. When adding or removing subdomains, this needs to be altered:
# OpenSSL configuration to generate a new key with signing request for an x509v3
# multidomain certificate
# openssl req -config bla.cnf -new | tee csr.pem
# or
# openssl req -config bla.cnf -new -out csr.pem
[ req ]
default_bits = 4096
default_md = sha512
default_keyfile = key.pem
prompt = no
encrypt_key = no

# base request
distinguished_name = req_distinguished_name

# extensions
req_extensions = v3_req

# distinguished_name
[ req_distinguished_name ]
countryName = "SE"
stateOrProvinceName = "Sollentuna"
organizationName = "Direct Connect Network Foundation"
commonName = "dcbase.org"

# req_extensions
[ v3_req ]
# https://www.openssl.org/docs/apps/x509v3_config.html
subjectAltName = DNS:dcbase.org,DNS:www.dcbase.org

Then, when one is satisfied with one’s changes, run:
openssl req -new -key domain.key -config ~/dcbase_openssl.cnf > domain.csr
in the appropriate directory to regenerate a CSR based on this configuration. One does not have to change this CSR unless the set of subdomains or other information contained within also changes. Simply renewing certificates does not require regenerating domain.csr.
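
Because the subjectAltName line is the part that changes whenever subdomains are added or removed, it can be generated instead of edited by hand. A minimal Python sketch, with a hypothetical helper name and no claim that the dcbase.org setup works this way:

```python
def san_line(domains):
    """Build the subjectAltName line for the [ v3_req ] section."""
    if not domains:
        raise ValueError("at least one domain is required")
    # Preserve the given order but drop duplicate names.
    unique = list(dict.fromkeys(domains))
    return "subjectAltName = " + ",".join("DNS:" + d for d in unique)


print(san_line(["dcbase.org", "www.dcbase.org"]))
# prints: subjectAltName = DNS:dcbase.org,DNS:www.dcbase.org
```

The CSR still has to be regenerated with openssl after the configuration file is updated.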

Having created a CSR, one then needs to ensure Let’s Encrypt can find the challenge files that prove control of each domain. The ACME protocol Let’s Encrypt uses specifies that these should be served under /.well-known/acme-challenge/ and, per acme-tiny’s documentation:
# https://github.com/diafygi/acme-tiny#step-3-make-your-website-host-challenge-files
location /.well-known/acme-challenge/ {
    alias $appropriate_challenge_location;

    allow all;
    log_not_found off;
    access_log off;

    try_files $uri =404;
}

This location needs to be accessible via ordinary HTTP on port 80 to work most conveniently, even if the entire rest of the site is HTTPS-only. Furthermore, this needs to hold even for otherwise dynamically generated sites: for example, http://build.dcbase.org/.well-known/acme-challenge/, http://builds.dcbase.org/.well-known/acme-challenge/, http://archive.dcbase.org/.well-known/acme-challenge/, and http://forum.dcbase.org/.well-known/acme-challenge/ would all need to point to that same challenge location, even if disparate PHP CMSes generate each or they ordinarily redirect to other sites (such as Google Drive).
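
One way to check this before involving Let’s Encrypt is to drop a probe file into the challenge directory and fetch it through every subdomain over plain HTTP. The following Python sketch uses only the standard library. The helper names and the probe file name are assumptions, and a subdomain that redirects elsewhere will most likely fail to return the probe contents.

```python
import urllib.error
import urllib.request

CHALLENGE_PATH = "/.well-known/acme-challenge/"


def challenge_url(host, token):
    """URL under which `token` should be reachable on `host`."""
    return "http://" + host + CHALLENGE_PATH + token


def serves_probe(host, token, expected, timeout=10):
    """Return True if `host` serves exactly `expected` bytes for the probe."""
    try:
        url = challenge_url(host, token)
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read() == expected
    except (urllib.error.URLError, OSError):
        return False


# Example use, after writing b"ok\n" to a file named probe.txt (hypothetical)
# in the challenge directory:
#
#   for host in ["build.dcbase.org", "forum.dcbase.org"]:
#       print(host, serves_probe(host, "probe.txt", b"ok\n"))
```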

If this works, then running acme-tiny prints:
Parsing account key...
Parsing CSR...
Registering account...
Already registered!
Verifying dcbase.org...
dcbase.org verified!
Verifying www.dcbase.org...
www.dcbase.org verified!
Signing certificate...
Certificate signed!

Once this works reliably, the whole process should be run automatically as a cron job often enough to stay ahead of Let’s Encrypt’s 90-day cycle. However, one cannot renew too often:

The main limit is Certificates per Registered Domain (20 per week). A registered domain is, generally speaking, the part of the domain you purchased from your domain name registrar. For instance, in the name www.example.com, the registered domain is example.com. In new.blog.example.co.uk, the registered domain is example.co.uk. We use the Public Suffix List to calculate the registered domain.

If you have a lot of subdomains, you may want to combine them into a single certificate, up to a limit of 100 Names per Certificate. Combined with the above limit, that means you can issue certificates containing up to 2,000 unique subdomains per week. A certificate with multiple names is often called a SAN certificate, or sometimes a UCC certificate.
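
The two constraints, renewing before the 90-day expiry while staying well under the weekly limit, can be reconciled by running the cron job frequently but only renewing when the certificate is close to expiry. A minimal Python sketch of that decision follows; the 30-day margin and the function name are assumptions, not something acme-tiny provides.

```python
from datetime import datetime, timedelta, timezone


def should_renew(not_after, now=None, margin=timedelta(days=30)):
    """Return True once the certificate is within `margin` of expiring."""
    if now is None:
        now = datetime.now(timezone.utc)
    return not_after - now <= margin


issued = datetime(2016, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(days=90)
print(should_renew(expires, now=issued + timedelta(days=45)))  # False
print(should_renew(expires, now=issued + timedelta(days=61)))  # True
```

With a weekly cron run, this yields roughly one issuance every nine weeks per certificate, far below the quoted limit of 20 per week.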

Once Let’s Encrypt certificate renewal is configured, Strong Ciphers for Apache, nginx and Lighttpd and BetterCrypto provide reasonable recommendations, while BetterCrypto’s Crypto Hardening guide discusses the rationales behind these choices in more depth.

Finally, SSL Server Test and Analyse your HTTP response headers offer sanity checks for multiple successfully secured subdomains served by nginx over HTTPS using Let’s Encrypt certificates.

XML parsing of file lists

Many DC clients (and other software) have their own XML parser for parsing XML files and content. This means the parsers can be heavily specialized for performance (in the case of large file lists for instance) compared to just using a “standard parser” (i.e. one that has been used in multiple projects). However, building one’s own parser also means that the parser may be incorrect to a far greater extent, thereby increasing the risk that a malicious party (e.g. the one sending the file list) may try to remotely crash the receiver by sending incorrect files. Beyond the obvious concern for network security, clients may incorrectly allow files to be read or read incorrect data within those files.

I have compiled a list of potential errors that a file list may have, and generated file lists for each of those occurrences. These file lists were then opened in DC++ (0.851) to see what happened. This test should likely be done with all clients that don’t derive their XML parsing from DC++’s (i.e., all DC++ mods will likely follow the pattern below).

I created multiple file lists (downloadable here), based on this file, generated with this C# snippet. Here are the results (Microsoft Office Excel file).

A summary of the results:

  • DC++ will parse invalid data (e.g. omission of data) and sometimes replace the faulty data with “something sensible”, although this is wrong in almost all cases.
  • In most cases where it is an invalid XML document, DC++ will ignore those sections or ignore the file altogether (this is good).
  • DC++ will not crash on invalid data.

Most of the issues found can be solved by performing an XML sanitization check before reading the document, by validating against the XSD. DC++’s XML parser does not have any XSD validation, so it couldn’t be done at this point anyway, but should such validation be implemented, it will cause a performance hit (small or large, depending on the source file list).
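
XSD validation aside, a stricter reader could simply reject entries whose attributes are missing or malformed instead of substituting defaults. The sketch below does this in Python with the standard library’s ElementTree. It assumes the usual file list layout (File elements carrying Name, Size, and TTH attributes, where TTH is a 39-character base32 string) and is not how DC++’s own parser works; the function name is made up.

```python
import re
import xml.etree.ElementTree as ET

SIZE_RE = re.compile(r"[0-9]+")
TTH_RE = re.compile(r"[A-Z2-7]{39}")


def invalid_files(xml_text):
    """Return a list of (file name, problem) pairs; an empty list means OK.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed XML.
    """
    problems = []
    root = ET.fromstring(xml_text)
    for entry in root.iter("File"):
        name = entry.get("Name")
        size = entry.get("Size")
        tth = entry.get("TTH")
        if not name:
            problems.append((name, "missing Name"))
        if size is None or not SIZE_RE.fullmatch(size):
            problems.append((name, "Size is not a non-negative integer"))
        if tth is None or not TTH_RE.fullmatch(tth):
            problems.append((name, "TTH is not 39 base32 characters"))
    return problems
```

A caller would treat a ParseError or a non-empty result as a reason to discard the list rather than guess at the intended values.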

While I didn’t test it, parsing of the XML list for version.xml and any hublists will likely have the same issue(s) as mentioned above. At least we won’t crash DC++.

If someone has other software that they can test this with, please feel free to do so and let me know so I can update the Excel sheet. It’s also possible that the resulting files are named incorrectly (e.g. by not requiring a CID in the file name), so just run the snippet code.

(Note: The files in this post may have a file name such as “foo-zip.pdf”, and it is because the file is actually a zip file but this blog software couldn’t handle that, so just change the file-extension to the appropriate one.)

Organization meeting scheduled for 2016-01-10

An upcoming meeting for the Direct Connect Network Foundation is scheduled for 2016-01-10, at 19.00 CET. In the meeting, we will go over items from the past year and what we shall do in the coming year. The agenda will be according to the By-laws (https://www.dcbase.org/bylaws/) § 10.

You can see the previous meeting(s) here: https://www.dcbase.org/meetings/ so you can get a feel for the structure etc of the meeting. Feel free to suggest ways to improve the meeting process.

If you have additional items you wish the meeting to address (that people should think about beforehand), please post in this forum thread.

DCNF Resources

A new resources page at the dcbase.org site has now been created, where you can see all articles etc that relate to DC. Also, the page contains court cases where DC is involved in some way.