The case of a missing tree

October 24, 2008

Recently, while downloading some files from a user, I came across a situation that seemed a bit confusing at first sight. The user (I knew him from before) is from another country, and while he has a fast connection with nice upload speeds, the connection to him is usually not trouble free: my downloads from him usually get disconnected with a timeout every 10-20 minutes. This is a small problem, I know, but it will be a key part of this story, as you’ll see later…

So I queued up some folders with larger files (there were no other sources available) and left DC++ to download in the background. Some time later, when I came back, I realized that only a couple of smaller files had been downloaded so far, even though the transfer speed was still as nice as usual. When I checked the current transfer in the Connections tab I found a very surprising fact: the file was being downloaded in one chunk and the chunk size was equal to the file size! Checking the Finished Downloads window made the problem even more mysterious: it said that 150% (!) of the current file had already been transferred…

Since the segmented download method was introduced, DC++ does not create one large chunk for a download unless segmented downloads are disabled. The chunk size is adjusted automatically depending on how fast the transfer is and what percentage of the download is left. Faster transfers result in bigger chunks, but a chunk never reaches the overall file size unless the file is very small. In this case (as always) I had segmented downloads enabled.

As I thought this could be a bug, I tried to disconnect the download manually to see if the chunk size went back to normal after reconnecting. Then came another surprise: the download wasn’t resumed at all, it started from the beginning, still with that huge chunk size! At this point I understood why this download never finished. As I mentioned before, downloads from this user get disconnected a lot, so the actual file (which was a pretty large one) was never able to finish. But… why?!

The user had a fairly new DC client, so incompatibility was ruled out. I checked downloads from other users and they worked as they should: normal chunk sizes and successful resume on reconnect. Then I thought I’d try to get more files from this problematic user and, believe it or not, some of the files worked well while others still didn’t! However, this last oddity finally rang a bell…

I asked the user to rebuild his share and… voilà, things started to work normally right away. The problem was with his hashdata file, which had become partially corrupted. The hash trees of the shared (and queued) files are stored in the hashdata, so his client failed to provide the correct tree information.

Now we’ve found the problem, but you may ask: why is the hash tree needed to resume an unfinished download? Or: why is it needed to get smaller parts of a download from several sources at the same time?

Before segmented downloading there were two methods in DC++ for resuming a download. Both became more or less obsolete when chunks arrived because, since then, the downloaded part of an unfinished file is no longer contiguous data. There’s no single point to resume the download from, as there was before. Now the unfinished temporary file is fully allocated and it contains non-contiguous segments of finished and unfinished data. The size of each segment is equal to, or an exact multiple of, the TTH leaf size (or block size), and as soon as a segment is finished its integrity is checked using the hash(es) of its block(s). The offset and length of the already finished segments are stored in the download queue.
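To make the block-aligned segments a bit more tangible, here is a minimal Python sketch of the idea. It is not DC++’s actual code: the function names are made up for this post, SHA-256 stands in for the Tiger hash (which Python’s standard library doesn’t ship), and a “leaf” here is simply the hash of one block, which is simpler than how a real Tiger tree builds its leaves.

```python
import hashlib


def block_hash(block):
    # Stand-in for the Tiger-based leaf hash; SHA-256 is used only
    # because it is available in Python's standard library.
    return hashlib.sha256(block).digest()


def build_leaves(data, block_size):
    # One hash per block; the last block may be shorter than block_size.
    return [block_hash(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def segment_length(offset, wanted, block_size, file_size):
    # Round the wanted length up to a whole number of blocks,
    # clipped so the segment never runs past the end of the file.
    assert offset % block_size == 0, "segments start on a block boundary"
    blocks = max(1, -(-wanted // block_size))  # ceiling division
    return min(blocks * block_size, file_size - offset)


def verify_segment(segment, offset, block_size, leaves):
    # Check every block of a finished segment against its leaf hash.
    first = offset // block_size
    for n, i in enumerate(range(0, len(segment), block_size)):
        if block_hash(segment[i:i + block_size]) != leaves[first + n]:
            return False
    return True
```

Because every segment starts and ends on a block boundary (except at the very end of the file), each finished segment can be checked on its own, and only the offset and length of verified segments need to be remembered for a later resume.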

Now it’s clear that, to be able to check the integrity of the finished segments, DC++ needs the full Tiger tree of the download (faithful readers of this blog are already familiar with Tiger hashes and hash trees from a very explanatory earlier post). Since compatibility with pre-TTH era DC clients was dropped, DC++ checks whether the peer supports hashes and gets the full hash tree just before the actual download starts. (The only exception is when a file is smaller than the minimum leaf size: these files are downloaded in one go and checked by their TTH.) One could think that if DC++ is unable to get the full tree then it won’t start downloading the file. But actually this isn’t the case…

Instead, DC++ will start the download in the hope that it will find more sources, so that it can grab the full tree later from another source. Until then it uses the full file size as the block size and the TTH for checking integrity; of course, that check is possible only when the whole download has finished. This was a good strategy as long as DC++ was able to resume a download without having the hash tree. Since the good old rollback function was removed in 0.699, resuming without the full tree has become impossible.
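The decision described above can be summed up in a few lines of Python. Again, this is only an illustrative sketch and not taken from the DC++ source: `plan_download` and its return values are hypothetical, and the 64 KiB minimum leaf size is an assumed value used just for the example.

```python
MIN_LEAF_SIZE = 64 * 1024  # assumed value, for illustration only


def plan_download(file_size, leaf_size=None, min_leaf_size=MIN_LEAF_SIZE):
    """Pick a block size for a download.

    leaf_size is the leaf size of the hash tree we managed to fetch,
    or None if no source could provide the tree.
    """
    if file_size < min_leaf_size:
        # Tiny file: one go, checked against the root hash (TTH) at the end.
        return {"block_size": file_size, "blocks": 1,
                "resumable": False, "reason": "small file"}
    if leaf_size is None:
        # No tree: the whole file is treated as a single block, so nothing
        # can be verified (or kept for a resume) before the very end.
        return {"block_size": file_size, "blocks": 1,
                "resumable": False, "reason": "no tree"}
    # Normal case: block-sized pieces that can be verified one by one.
    blocks = -(-file_size // leaf_size)  # ceiling division
    return {"block_size": leaf_size, "blocks": blocks,
            "resumable": True, "reason": "tree"}
```

For a 700 MiB file, `plan_download(700 * 2**20, 2**20)` gives 700 separately verifiable blocks, while `plan_download(700 * 2**20)` falls back to a single block as large as the file itself, which is exactly the odd chunk size I saw in the Connections tab.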

The possibility of downloading without the tree is a nice feature for small or medium-sized files. They usually download in a short time, they can be checked against the TTH at the end, and that’s it. However, it is a problem for huge files, especially if there’s no additional source to get the tree from. Even if such a file downloads all day long, if it cannot finish in one go then it’s just a waste of time and bandwidth. It will start all over again and again… And even if some other sources with free slots come around later (and the hash tree is successfully grabbed from them), these new sources won’t be used while the full-size segment is running…
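A rough back-of-the-envelope simulation shows why this hurts so much with a flaky connection. The numbers below are hypothetical (a 4096 MiB file, a steady 1 MiB/s, a disconnect every 15 minutes), and the model is deliberately simplified: constant speed, fixed connection lifetime, and only whole blocks surviving a disconnect.

```python
def sessions_needed(file_size, speed, uptime, block_size, max_sessions=50):
    """Return (sessions, units_transferred), or None if the download
    still hasn't finished after max_sessions connections.

    Sizes are in MiB, speed in MiB/s, uptime in seconds per connection.
    """
    done = 0          # verified data that survives a disconnect
    transferred = 0   # everything that went over the wire
    for session in range(1, max_sessions + 1):
        received = min(speed * uptime, file_size - done)
        transferred += received
        done += received
        if done >= file_size:
            return session, transferred
        # On disconnect, only complete (verifiable) blocks are kept.
        done = (done // block_size) * block_size
    return None


FILE_SIZE = 4096   # MiB, hypothetical
SPEED = 1          # MiB/s, hypothetical
UPTIME = 15 * 60   # seconds until the connection times out

with_tree = sessions_needed(FILE_SIZE, SPEED, UPTIME, block_size=1)
without_tree = sessions_needed(FILE_SIZE, SPEED, UPTIME, block_size=FILE_SIZE)
```

With 1 MiB blocks the download finishes on the fifth connection, having transferred exactly the 4096 MiB it needed. With the block size equal to the file size, every disconnect throws away all 900 MiB received so far, so `without_tree` ends up as `None`: the file never completes, no matter how many times it reconnects.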