Old DC++ forums restored

DC++ used to have a forum where people would receive help, give suggestions on improvements and discuss protocol features. This forum migrated from SourceForge to the domain dcpp.net (now defunct, don’t use it). The entire site was then attacked and the forum was put offline. This was in 2007, and no forum has yet replaced the old DC++ forum as a whole.

The DCBase.org project was put in place to harmonize different content for Direct Connect. As such, the project hosts the DCBase forum (previously ADCPortal), where today’s discussions of (primarily but not exclusively) ADC development take place. However, it is also important to look at the past: what was done and the discussions that were held then. For that reason, the old DC++ forum is now restored. The forum is set up similarly to the old one, and the database has been migrated accordingly. The entire forum is locked down (until someone wants DC++ to regain it as a forum), so you can’t post anything.

I will probably create posts in the future where the old forum is referenced (in particular NMDC and ADC development and protocol discussions).

If anyone else has a forum, wiki or site that is now defunct, let me know. It is important that the content that we once produced isn’t completely lost.

ADC 1.0.2 released

A new version of the base ADC protocol is now released, version 1.0.2.

The document may look slightly different, especially with the addition of commands in the table of contents. The document itself (its content) is not that much modified (except for state management, see below).

An important new addition to the document is a terminology section where difficult words or phrases are defined. This list is obviously meant to hold much more than a mere four items, but it’s at least a start.

The STA previously didn’t specify who had the responsibility for action when a STA is sent with the severity Fatal (2). This has always been the originator of the message, and this is now explicit.
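
As a minimal sketch of how an implementation might read the severity out of an STA code, assuming the three-digit layout where the first digit is the severity (0 success, 1 recoverable, 2 fatal) and the remaining two digits are the error code. The function names are hypothetical and not taken from the specification.

```python
from enum import IntEnum


class Severity(IntEnum):
    SUCCESS = 0
    RECOVERABLE = 1
    FATAL = 2


def parse_sta_code(code: str) -> tuple[Severity, int]:
    """Split a three-digit STA code into (severity, error code)."""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"not a three-digit STA code: {code!r}")
    # Severity() raises ValueError for digits other than 0, 1 or 2.
    return Severity(int(code[0])), int(code[1:])


def originator_must_act(code: str) -> bool:
    """Hypothetical helper: True when the party sending this STA is the one
    expected to act, which the post says is the case for Fatal."""
    severity, _ = parse_sta_code(code)
    return severity is Severity.FATAL
```

With this sketch, `originator_must_act("240")` is true while `originator_must_act("140")` is not.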

The state management is re-worded and restructured. All information about state has now been moved to its own section, giving an implementer a quick and comprehensive overview of the requirements for the state management. Previously, the state management was sprinkled all across the document, making it difficult for a person to properly implement a state machine in their software. As a result, state management information is now removed from each command (the only thing remaining is an explicit note about the state in which each command is used). Certain information is also clarified, such as what to call the parties in a client to client connection (“client party” and “server party”) and the state transitions.

Version 1.0.1 of ADC was also ambiguous in state management when it came to one important part: who shall send the first INF in a client to client connection. This matters because the ambiguity makes multi-share difficult. The current specification is no longer ambiguous and takes the following stance: the first party to send the INF is the connecting party (“client party”). No known implementation suffers from this explicit note, as all manage this scenario just fine. Basically, this change means that multiple shares (per hub) may not be too far off.
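
The ordering rule can be illustrated with a small, simplified check. This is only a sketch: the class and exception names are hypothetical, and it models nothing but the INF ordering, not the full state machine of the specification.

```python
class ProtocolError(Exception):
    """Raised when a party breaks the INF ordering rule."""


class ClientClientHandshake:
    """Tracks only who has sent INF in a client to client connection."""

    def __init__(self) -> None:
        self.inf_seen: list[str] = []

    def on_inf(self, party: str) -> None:
        if party not in ("client", "server"):
            raise ValueError(f"unknown party: {party!r}")
        # The connecting party ("client party") must send the first INF.
        if not self.inf_seen and party != "client":
            raise ProtocolError("the client party must send the first INF")
        if party in self.inf_seen:
            raise ProtocolError(f"{party} party already sent its INF")
        self.inf_seen.append(party)

    @property
    def complete(self) -> bool:
        return len(self.inf_seen) == 2
```

Because the server party only answers after it has seen the client party’s INF, it has the information it needs to decide which share to present, which is why the rule helps multi-share.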

The new version also brings in a new time where we can safely and appropriately update the base document. There was an announcement period when the document was going to be released which meant that developers have had time to adjust their software and give feedback in a timely manner.

DCDev archives published

I previously requested the DCDev archives, a repository of posts from DC developers. I was able to acquire the repository and it is now posted on DCBase.

There’s a lot of stuff in the posts, especially the initial parts with ADC. Enjoy.

Don’t forget that you can make topic suggestions for blog posts in our “Blog Topic Suggestion Box!”

Request for video about attacks spawned from DC

In a previous post, the security on DC was discussed as well as an attack on hublist.org, the then largest hublist for DC. This attack, as well as a general view of attacks on the web, was covered on a Finnish program (translated to English as “MOT: Invisible plane hijacking”). This was broadcast a few years ago (see the link) but wasn’t uploaded to their site.

My intention was to get this video and upload it to the Youtube page so everyone can view it. I have acquired a copy of it from a video recording, but I cannot simply (re)broadcast it because of obvious legal reasons.

This is a request for anyone who is able to get in contact with YLE.fi (the broadcasting company) and is able to persuade them to either upload the video or allow us to upload it.

I have previously made a request to YLE but was denied due to the fact that the video contains a trailer from the movie Die Hard. YLE claim they do not have the necessary resources to edit this film. (I was denied the ability to do the edits myself.)

YLE sales are in charge of the video and I can provide details of the person(s) I’ve been in contact with, if you think you have a better chance.

The road ahead: Post summary

This series I called “the road ahead” is simply my look into how we can improve and continue to develop the DC community and system. The posts should be readable as they are but can of course be read in conjunction with each other. There’s no priority in the suggestions I make. Instead, everyone should look at the posts and think about what small part they can do to help accomplish some of the suggestions (or if you have some of your own).

The posts:

  • Security and Integrity – About security and integrity in DC and how it affects us
  • Protocols – What the protocols should strive for and the tools the protocol community should provide
  • Competition – The different challenges that we face in Direct Connect in the battle of users
  • Software – What improvements we can make to software across the board
  • Widening the base – How the different information outlets we have can increase DC knowledge in the world
  • Infrastructure – How the infrastructure in DC can improve

The road ahead: Infrastructure

Direct Connect rests upon three parts: clients, hubs and hub lists. If there are no clients available then the community becomes stagnant and the appeal for new users diminishes.  Hubs must be available, else there’s no way for clients to connect to each other. The hubs provide the very community we have. The hublists provide clients with a sense of direction where there are other hubs with users.

These three parts are, in my mind, equally important and it is imperative that we have them in our infrastructure.

Direct Connect has had problems when hublists go down or become outdated, so the infrastructure we have should be able to cope with these types of problems.

The Direct Connect community is centered on the ability to provide a straightforward file sharing service while also offering a forum for talk and discussion. These two parts are why DC is so great: you can have a discussion while sharing content that you and your peers like.

The infrastructure of DC should allow users to interact while they are not necessarily using a normal client. A website can serve as a client where you simply have a tab for chat and one for downloading etc.

The infrastructure should give users the ability to browse any type of software or discussion topic around DC, and this is the intent with DCBase.org. The idea is to create a central source for DC content. If we can gather information about each client in one place, users don’t need to go to different sites that have different appearances. Users can go to a single site and, if they so choose, continue on to the main page of the mentioned software or item of interest. The central source needn’t own each client or have the source code of each client; it simply needs to be able to refer to them.

Basing content around one place will also help make information unambiguous and free of redundancy. This infrastructure idea should allow developers control over their own systems while the face of DC can be unified.

An interesting aspect may be to create, say, a real non-profit organization. The organization could be recognized by a government, allowing the validity of DC to increase and potentially draw some attention from new users. The organization can serve as an ‘umbrella’ for donations, for example by distributing its donations to developers and their own infrastructure. It could also open up the possibility of receiving government funds to stabilize the DC infrastructure.

The infrastructure can provide websites, build and download repositories. It can allow specialized hubs dedicated to support and development.

The road ahead

The future is to try and merge many sources of information that needn’t be separated. For instance, the NMDC and ADC projects can simply be a general “DC protocol project”, with minor branching. Whenever we can join resources, less time is spent managing the multitude of sources and more time is spent doing what we want: the further development of DC. If we can merge different functionality of software or if we can provide a clear interface for those interested in DC content, all the better.

The road ahead: Widening the base

We have some statistics on DC usage and while the interest has diminished somewhat over the years, there’s still a lot of people who use DC on a daily basis. Many believe that we should encourage users to spread word about DC in various places. The idea is then that other people who aren’t (regular) DC users become more interested.

A couple of these initiatives have already taken place with Twitter, Youtube and Reddit. No one has created a Facebook account for DC content and I frankly don’t think we need one at this time.

The intent with the Twitter account is that we should be able to get out information that isn’t worthy of a large blog post or article. The fact that one is restricted by 140 characters only makes it interesting.

The Youtube channel is there to provide new users of the software a simple yet effective way of going through a piece of software. The idea is that we can upload video versions of other software’s instruction manuals, and the videos don’t even have to be in English. If people want to provide translations and videos in another language, just let me know.

The subreddit DirectConnect serves as a point of interest for those who are already on Reddit for other reasons. Reddit provides a way of publishing information that needn’t be restricted to the confines of the forum or likewise. The subreddit can prove to be a great source to gather new users.

The road ahead

The idea is that more and more users should be made aware of Direct Connect and what it can offer. Users should be able to go to their own source of media and be able to pick up DC information if they choose to. Any time someone provides DC information in a new way or in a new place, then the base of DC widens.

The road ahead: Software

There’s a wide range of different applications that are for Direct Connect. There’s a bunch of clients, each with their own special niche or cool feature. The same applies for hubs.

Many applications promote the use of open-source, allowing multiple developers to add their thoughts and ideas to the product. Don’t like where the application is heading? Fork it and create your own. As it says in the DC++ documentation (paraphrasing): “eventually your modification may have a higher user count”.

The software that is produced is most often not code reviewed or tested very much. There are only so many people who can write code or documentation, review code or test the application. The fact that people do much of the Direct Connect work for free in their spare time means that there’s only so much time available to continue developing the software.

Most companies are structured around specific people writing code, specific people testing, specific people writing documentation, etc. This isn’t directly possible as people can’t be forced to do something. However, I believe that this means that people do what they like. By doing so, they don’t mind putting X hours into development of a feature if they feel that it was fun or interesting to do.

Goals

While it’s a novel thing to build the applications, it’s silly when people don’t envision the future. You don’t have to have a “business” sense or likewise to explain what you want with your product.

Each application should have a definite goal and specified stages where certain features are implemented. The goals can simply be “let us fix bugs X, Y and Z by the next version.” The goal doesn’t have to be “we must have X amount of users by the end of the year.” The goals should be sufficiently easy for someone else to do: that is, if you as a developer don’t have time to complete a feature, then someone else might.

Each application should strive for a certain type of freedom for its users. Lately, this freedom has come in the form of plugins for clients. While many clients have offered this ability in the past, adding this ability in DC++ will probably increase the diversity of plugins. The current DC++ plugin interface is C, so the goal should be to provide an implementation for C# (through C++/CLI and/or Mono) or Java users. Python and other languages could probably also be incorporated through middleware plugins. A clear goal would be to have wizards for Visual Studio, Eclipse or Netbeans, so that developers spend little time setting up their environment. Base classes that help with common operations could also be added for plugin developers.

If the software is provided with the source code, a clear goal should be to have clear and concise instructions for building your own version of the software. Project files and scripts for automatically downloading and building the software can greatly decrease the turnaround time for development. In fact, you can increase your own productivity if you don’t have to do fifty things each time you need to compile or pull down the source code.

Diversity

While it is great that we have a lot of applications for the Windows platform, applications are still missing for Linux and Macintosh based systems. Applications like WINE on Linux help, but only so far. It is important to have applications and developers for each platform.

An important part in today’s society is to provide applications for the mobile platform: phones and tablets. An interesting aspect would be to create native Blackberry, Windows Phone, iOS and Android interfaces. This would allow users to chat and share files through their mobile device. The network traffic cost could of course affect the number of users, but anyone with a flat rate plan would have no problem.

Besides mobile devices, another avenue may be adding support for Facebook, Spotify or other media directly in clients. Additionally, plugins for DC inside Facebook, for example, could open up a new world of users. Just imagine that you can download and share photos from your hub straight onto Facebook, even while not being at home.

The road ahead

As more and more features get supported in each application, it is important to regularly take a break and make sure that each feature is properly implemented. Any new user is a potential source of questions and requests. The important part is to not bury your head in the sand, but to provide ample support for users whilst trying to continue developing the product. Any time a product has a capability that allows the user to extend it (either through someone else’s plugin or by themselves directly), the user becomes much more interested in continuing to use the product.

Signing an application can be a great way to provide proof that the software is genuine and has not been tampered with. While this may cost money, it will increase users’ confidence in the authenticity of the software.

Videos, articles and other media that can be used for helping users (both new and long-time users) will always be considered useful.

Going through the list of feature requests on a regular basis may provide good insight into what users want.

The road ahead: Competition

Nearly always when you own a component, resource or item, there’s someone else who wants to compete with you and beat you with a better product.

In Direct Connect, we can clearly see this competition when it comes to client developers trying to get more users than other client developers. Hub developers vs other hub developers. Hub owners vs other hub owners. Protocol maintainers or proponents vs other protocol maintainers and proponents.

Competition comes from a desire to perform better than one’s counterpart. The end result is that users have more options (in software or elsewhere) to choose from, allowing for both niche and general-purpose software and content.

Competition in the present

For example, the initial NMDC client could only connect to one hub at a time. As such, if you wanted to be in multiple hubs, you had to have multiple instances of the application open. When other clients started popping up, they supplied the ability to be connected to multiple hubs at the same time. This provided a clear competition between the two sets of developers as one party could say “hey, users, choose us since we can do this better and you’ll avoid hassle”. The end result was that every client got updated to have this functionality, which meant that users benefited greatly from this exchange. Now it’s almost unthinkable to have a client that can only connect to one hub at a time.

Direct competition can even come from those who help your product: if you say “no” to a particular feature, they can create that feature themselves and distribute the changes. This is how most client modifications of DC++ arose: the developers of DC++ felt that a feature was too experimental or didn’t suit the “mainstream” user, which meant that the feature wouldn’t get included. If a developer wanted this feature themselves, they could simply add it to their own version (and, kind as they were, distribute it to others). For example, hashing was an experimental feature that first saw the light of day in the modification BCDC++ but was eventually merged back into DC++.

Competition from the past

An often overlooked problem for software developers is that you do not only compete against the current set of applications or fellow developers. You compete with yesterday’s products. That includes your own product. DC++ notoriously has this problem: when hashing was introduced and became mandatory, a lot of users simply didn’t upgrade as they felt the new feature(s) were not enough to outweigh the downside of hashing. For example, at LANs where you have no (or fewer) bandwidth problems, the need for hashing may (seem to) become meaningless as you can simply transfer the files so quickly. The problem for the current client then is that users don’t want to upgrade to the flashy new version, as the old versions are perceived as better. This means that while you can outsmart your current competitors, you may still be unable to beat your past self.

The competition from the past is also easy to spot when it comes to upgrading: upgrading is (considered) difficult and cumbersome. It takes time until people have moved from one (possibly insecure) version to another (hopefully better) version. Any form of automatic upgrade management can provide a good avenue, like how most browsers do today (they upgrade without notification and without any form of user interaction).

An interesting way of handling past implementations is to provide an “in-between” version that incorporates the good from the past and the present. For example, the current DC++ client requires that all files are hashed before they are shared. This is a perceived problem for LAN users (as explained above), but perhaps there are ways around it. What if you hashed, say, the path of each file and called that “the file’s temporary hash”? These files and their temporary hashes would then be included in the file list of that user. When a file is properly hashed, the temporary hash goes away. If another client connects and tries to get a file that only has a temporary hash, then that file moves up in the priority queue of files waiting to be hashed. When the file is hashed, the client can send back “hey, this is the new real hash”. That is, let all files be available for browsing and for responding to (text) searches, whilst not allowing people to actually download those files before they have been hashed. (I understand that this would be an undertaking in the software.) The outcome would be that you no longer compete against yourself.
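
A minimal sketch of the temporary-hash idea, under loud assumptions: the class and function names are hypothetical, SHA-256 of the path stands in for whatever temporary hash a client would pick, and the real content hash is supplied by the caller (DC clients use Tiger tree hashes, which the Python standard library does not provide).

```python
import hashlib
import heapq
from typing import Callable, Optional

TEMP_PREFIX = "TMP:"  # hypothetical marker for a temporary hash


def temporary_hash(path: str) -> str:
    """Cheap placeholder hash derived from the path, not the content."""
    return TEMP_PREFIX + hashlib.sha256(path.encode("utf-8")).hexdigest()


class ShareIndex:
    def __init__(self) -> None:
        self.entries: dict[str, str] = {}  # path -> current hash
        self._queue: list[tuple[int, int, str]] = []  # (priority, seq, path)
        self._seq = 0

    def _push(self, path: str, priority: int) -> None:
        heapq.heappush(self._queue, (priority, self._seq, path))
        self._seq += 1

    def add_file(self, path: str) -> None:
        """List the file immediately; queue it for real hashing."""
        self.entries[path] = temporary_hash(path)
        self._push(path, priority=1)

    def request(self, path: str) -> bool:
        """A peer asks for a file. Returns True if it may be downloaded now;
        otherwise bumps the file to the front of the hashing queue."""
        if not self.entries[path].startswith(TEMP_PREFIX):
            return True
        self._push(path, priority=0)
        return False

    def hash_next(self, real_hash: Callable[[str], str]) -> Optional[tuple[str, str]]:
        """Hash the most urgent pending file; returns (path, new hash)."""
        while self._queue:
            _, _, path = heapq.heappop(self._queue)
            if self.entries[path].startswith(TEMP_PREFIX):
                self.entries[path] = real_hash(path)
                return path, self.entries[path]
            # Stale queue entry: the file was already hashed, skip it.
        return None
```

The pair returned by `hash_next` is what a client could send back as “this is the new real hash”. Requested files jump the queue simply because they are pushed again with a lower priority number.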

Competition between users

In Direct Connect users also compete with other users when it comes to slots and hub bandwidth.

A slot is one potential upload channel. If a user has three slots open, only three other users can download from them at a time. In other words, each user is competing with other users for the availability of slots within the system. Users who have a fast connection are also able to get their content quickly, meaning that slots become available more frequently. The addition of ticket systems and the ability for uploaders to decide who they grant additional slots to also provides a new dimension in the hunt for the slot. There are not many other file-sharing systems that behave this way, and I believe this is one of the reasons Direct Connect remains popular: it promotes fast connections, small files and the ability (for the uploader) to manage their resources (bandwidth), all the while allowing the possibility of slow connections and large files.

The hub’s bandwidth is the primary reason that DC scales relatively poorly (compared to e.g. eDonkey); DC rests upon hubs broadcasting information to most or all clients. That means that the bandwidth of a hub is crucial. For example, if the hub has 1 Mbit/s upload capability, then it is limited in the number of users it can manage. Some hubs manage this resource by restricting how often you can perform certain actions. For example, the ability to search is often restricted to once or twice a minute. Sometimes only active users are allowed to search, etc. This means that as a simple user, you are competing against other users: if you can search, then that means another user might not, etc. There’s relatively little you can do as a user to fix this, beyond perhaps avoiding passive mode and encouraging the hub owner to get a better connection.
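
The “once or twice a minute” restriction can be sketched as a per-user sliding window. This is an illustration only: the class name and defaults are hypothetical and do not describe how any particular hub implements it.

```python
from collections import deque


class SearchLimiter:
    """Allows at most max_searches per user within a window of seconds."""

    def __init__(self, max_searches: int = 2, window: float = 60.0) -> None:
        self.max_searches = max_searches
        self.window = window
        self._history: dict[str, deque] = {}

    def allow(self, user: str, now: float) -> bool:
        """Return True if the user may search at time `now` (in seconds)."""
        recent = self._history.setdefault(user, deque())
        # Forget searches that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_searches:
            return False
        recent.append(now)
        return True
```

The caller passes in the current time (for example from `time.monotonic()`), which keeps the sketch easy to test. Each user has their own window, so the limiter caps the total broadcast load the hub has to carry rather than letting one user starve another.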

Competition from other systems

While the developers of the current system can discuss and argue about internal stuff, there’s an outside world as well. There’s a variety of different protocols and systems just waiting to (further) push down DC (sometimes, they can do so albeit unintentionally).

In the past months, we have seen more and more BitTorrent websites use magnet links. These have previously been an almost exclusively DC resource. As such, DC clients have owned the magnet link resource. As more and more BitTorrent sites require magnet links, so do the BitTorrent clients. That means that the DC clients now must compete against the BitTorrent clients for the ownership of magnet links. This is a battle I believe we cannot simply win, but I think there are ways we can still come out on top. DC and BitTorrent use different information in their magnet links, and it’s easy to spot the differences. The DC clients that own the link resource (or at least previously did) should prompt their users about unknown magnet information. If the user can then specify that this new magnet information is actually for the BitTorrent client, then the DC client can simply redirect those types of links to the BitTorrent client. That means that DC owns the resource, while those who want to use BitTorrent aren’t left in the dust. Likewise, I believe the same option should go in the BitTorrent clients; if they discover a DC magnet link, then they should try and send that to the installed DC client.
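
A minimal sketch of how a client could tell the two kinds of magnet link apart, assuming DC links carry a Tiger tree hash (`xt=urn:tree:tiger:...`) and BitTorrent links carry an info-hash (`xt=urn:btih:...`). The function name and the returned labels are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def classify_magnet(link: str) -> str:
    """Return "dc", "bittorrent" or "unknown" for a magnet link."""
    parts = urlsplit(link)
    if parts.scheme.lower() != "magnet":
        return "unknown"
    params = parse_qs(parts.query)
    for xt in params.get("xt", []):
        topic = xt.lower()
        if topic.startswith("urn:tree:tiger:"):
            return "dc"
        if topic.startswith("urn:btih:"):
            return "bittorrent"
    # Anything else is where a client would prompt the user.
    return "unknown"
```

A DC client registered as the magnet handler could handle `"dc"` links itself, pass `"bittorrent"` links on to whichever BitTorrent client the user has chosen, and ask the user about `"unknown"` ones.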

While some systems do not have an intent on diminishing the DC user count, it may be in the system’s nature to do so. If the user isn’t using DC, they’re using something else.

The road ahead

There are no clear-cut ways of steering clear of competition. The only way to stay ahead is to invent features and come up with ideas before your adversaries. When it comes to other systems, the key is to provide ways of attracting users while still giving that system its small part of control.

The road ahead: Protocols

Direct Connect was started with the protocol “Neo-Modus Direct Connect” (NMDC). This was named after the only client and hub available at the time. Over time, this protocol grew as more client and hub developers followed. The protocol was initially sniffed out by various people as the original system was closed-source. Over the years, the client and hub developers grew and discussions commenced that the protocol had become unmaintainable. The protocol was considered bad in various aspects and the request for a new protocol was underway.

The initial discussion was whether the “new” protocol should be binary or text based: a binary protocol is less resource intensive as you put much more care into what is being sent, while a text protocol (like NMDC) is easier to read and implement.

The discussion eventually came down to a “my client has the most users, so here’s my protocol and I’ll implement it in my client” call from Jacek Sieka, the DC++ author. The new protocol was called ADC, sometimes referred to as “Advanced Direct Connect”, although this has never been its official name. ADC took the same fundamental structure of Direct Connect as NMDC did, but with the intent of increased usability and extensibility. A few of ADC’s aspects can even be traced back to the protocol suggestion (made around the same time) called “Direct Connect The Next Generation” (DCTNG), as is noted on ADC’s website.

ADC eventually grew and there’s now “competition” between NMDC and ADC (although I believe NMDC is still ‘winning’ in user counts).

There are now various resources that people can use to implement and support their own component for the NMDC and ADC protocols, although not enough in my mind.

The tools

There are, as of writing, very few tools for supporting protocol developers.

The tool Wireshark can provide tremendous support in filtering what information is actually being sent on the network. Effectively, if you didn’t have the specification(s), you could create your own implementation by simply looking at the network traffic. Wireshark uses plugins that are protocol-specific sniffing implementations. However, no plugin has been fully implemented for either NMDC or ADC. The ADC Wireshark plugin attempts to do just this, but it isn’t complete (at the moment you have to compile things on your own etc). Having a plugin (for either protocol) could provide an excellent opportunity for developers to learn the protocol’s raw nature. There are probably other applications similar to Wireshark, but it is probably the most widely known and used tool for providing network information.

There are few ways of actually testing your NMDC and ADC implementation. As a client developer, you need to connect to a “normal” hub and see if that hub accepts your data and if other clients can see your data. As a hub developer, you need to take “normal” clients, connect them to the hub and see if you properly route traffic. This means that while we can point to the specifications as the reference, most often developers need some form of software to actually verify their application. The proper tool for doing this would be a full reference implementation of a client or hub. This implementation doesn’t have to be fast or be able to handle lots of data. It should provide the basic protocol implementation together with a list of things it should send out to test the application (like sending out faulty data to verify that the system handles it). Ideally, in my mind, a public hub should be set up to act as a reference implementation; that way you must manage connections that didn’t originate from your own computer or LAN.

While a reference implementation is the way to go, the next step is to have a so-called “stress tester”. This is an application that pushes the software to its limits. For a hub, the stress tester could simulate hundreds or thousands of users, seeing whether the hub can cope with the information. For a client, the stress tester could simulate lots of search results, lots of incoming searches and lots of connection attempts. The stress tester could also include faulty data, but the key purpose of the application is to test whether the underlying service is able to handle a huge amount of data.

While we can provide tools for those who decide to implement the protocols themselves, we should also strive to provide reference implementation code. That is, people shouldn’t have to re-write boring protocol implementations all the time: they should just be able to take an already created one and use that. The ADC code snippets are such an attempt, and my idea is to add further code in the future, such as hash or magnet implementations, in addition to the protocol specific implementation. The idea is to create a basic foundation for Direct Connect software, similar to how FlowLib is managed. Of course, having general code extends also to NMDC.

Discussion

The idea is to promote any type of venue where people can interact and discuss further protocol enhancements and issues. The DCBase forum (previously ADCPortal) intends to provide such functionality. There’s also the FlexHub forum, which also has NMDC and ADC sections.

I am not sure about the use of a wiki, as I think much of the content can be written elsewhere in a better way, but I see no problem in having wikis that explain various implementations in more detail or the use of “workgroups”.

Regardless of the venue, it is best if we can create a service that gathers protocol issues and content. This blog has served as such an information pool in the past for various parts, but I’d not be sad if we could have a better place that would be easier to manage.

The previous DCDev archives provide a good picture of the early discussions about NMDC and ADC, and I’m sure there are a couple of gems that should be discussed further or at least lifted. Not only is that old resource important, the resources we have today are also important: the developer hub can be an important source, and in the future we should perhaps post any protocol discussion to a forum or the like so others can read the discussions.

Any type of document that describes the DC protocols (or any other incarnation of DC) should be made public.

Documentation

The ADC protocol is (well?) documented, and the main document and its partner (“extensions”) should stand on their own merits. However, I believe it would be good if others provided suggestions on how we can make the specification easier to read and less ambiguous. The “recommendations” document is also intended to provide information that may not warrant inclusion in the main documents but can serve as good-to-know information as people are reviewing the protocol. There have also been suggestions for state diagrams of the protocol, as they should provide better insight into the flow of information.

The NMDC protocol was never officially documented. The documentation that exists was scraped together by various people based on the behaviour of the protocol implementations. There are today only a few resources left, but I would like everyone to acknowledge the NMDC project, as its intent is to provide the same level of information as the ADC project. While the specification for NMDC should describe how implementations should behave, I believe it should also specify how implementations have behaved in the past (whether they have deviated, and how).

Down the line, perhaps an official RFC document is the target for both NMDC and ADC (one for each, obviously).

The road ahead

The documentation for the NMDC and ADC protocols should grow, as should the support that can be provided for their implementers. The tools should provide better support for developing something new as well as for simply not having to do the implementation at all.
