Difference between revisions of "G2 specs part2"
|Line 1:||Line 1:|
Latest revision as of 03:49, 3 April 2005
NOTE: This is an old document and should not be used for new implementations. It is here because it may be useful reference material.
- 1 Background
- 2 What This Document Covers
- 3 The Gnutella1 Protocol
- 4 Gnutella2 Protocol Objectives
- 5 Proposed Gnutella2 Protocol
- 6 Samples
- 7 Higher Level Structures
There has been some interesting discussion about the merits of moving to a fresh new protocol vs. grafting things onto the original beta protocol through encoding extensions. I’m glad the issue is getting some consideration, because at times in the past it has almost seemed like a taboo subject. I firmly believe that a new protocol the right thing to do at this time, and have made a number of arguments for that case recently which I will summarise below. Of course there are also good arguments against it, so it’s a question of weighing everything up.
One thing I’m not questioning is that you don’t NEED a new protocol to do new things. The major changes that fall under the “Gnutella2” umbrella are fundamental architectural changes relating to search and other services, and the specifics of how you express them in bytes are of somewhat less importance. But when creating a whole new “high level” system, why would you want to express it using an old, limited format? Redesigning the “top” of Gnutella provides the perfect and logical opportunity to redesign the “bottom” too. And doing so allows us to break free from the strange idiosyncrasies we had to work with in the Gnutella1 beta protocol.
What This Document Covers
This document discusses the base-level packet structure upon which higher level services are built. It details the limitations of Gnutella1, the goals of a flexible new base protocol, the way Gnutella2 packets are structured and the way these structures are serialized in binary streams.
That’s the first half of the picture. The second half (which will follow soon) documents a set of base packet types using the new extensible tree structure. These are used to deliver distributed host discovery and object search services, and a few other interesting things. The general operation of the search algorithm itself has been set out in the previous document; however the new packet tree structures which implement it also contain a number of key optimisations and improvements over Gnutella1.
The Gnutella1 Protocol
The original Gnutella beta protocol is not the worst protocol in the world by any stretch, but it is far from optimal. It is, understandably, the type of protocol that software engineers create when they want to test an idea. The fact that it has been so widely adopted despite that fact reflects very well upon its many good points.
However, the purpose for which it was designed is quite different from the realities of Gnutella today. I’m referring not so much to its use of a broadcast architecture (as that’s a higher level construct), but more toward the changes in “Gnutella politics”. The original protocol was designed to communicate between distributed applications with a common origin. There was no need for interoperability between independent vendors doing slightly different things, only a theoretical need for backward compatibility within the one development stream.
When you have complete control over a protocol, things are obviously a lot simpler. You decide when and how new things are coded, you (should) know exactly how various changes or improvements you make will affect everyone else, because you coded the software everyone else is using. That’s the kind of scene the Gnutella protocol was designed for. But as we know, that’s not what happened – it became an open protocol. And problems arose.
The fact that these problems were largely overcome is a testament to the efforts of the GDF. Establishing reasonably standard ways of tacking new data onto old constructs without disturbing the existing parsing code of many different vendors in an environment lacking strong version control was a very impressive achievement. But the point is this – with an appropriately designed base protocol, these problems would never have arisen.
Granted, issues of incompatibility have been largely solved or avoided through the development of new extension encoding techniques. These new techniques are great, but they’re tacking their new stuff on to the old structures, sometimes going to unnecessary lengths to do so. Why don’t we just lose the old stuff and keep the new, well thought out stuff?
I believe that Gnutella is an ideal and a community, not a protocol. If the protocol should be better, we should change it. That doesn’t destroy what this community is; it just helps it get better.
Gnutella2 Protocol Objectives
So, if there was an ideal protocol for a distributed network of nodes running software developed by many different vendors, what would it be like? I drew up a list of key objectives (please feel free to add to them!):
It Should Not Be Too Specific
G1 was designed to support a broadcast and narrow return architecture, which is a specific model we are now departing from. Every packet has a GUID, a TTL and a hop count, which are all useful in that model and useless ballast in most other models.
An ideal protocol should focus on a standard way of expressing information, rather than worrying about what kind of information. If a GUID is needed in a particular packet type, put it in that packet, not in the header of every packet. In other words, it should be minimal and compact.
It Should Involve a Namespace
In a closed protocol with one controlled implementer, numeric op-codes or function numbers are meaningful. In an open environment with many different implementers working independently, namespaces become necessary. This is one of the key innovations in things like GGEP and vendor messages.
It Should Be Hierarchical
Hierarchical document structures are extremely powerful (eg XML), and if you can make them compact enough, make ideal packets. Why? Because they allow any piece of information – anywhere – to be extended or duplicated without disturbing its value. That means that any implementer can extend the work of any other implementer without compromising it. Take the following example:
+ Query Hit Packet | |-+ Node ID (standard) | |-+ Server Status (standard) | \-+ Shareaza Server Status (private extension) | |-+ Hit Object | |-+ URN (standard) | |-+ Descriptive name (standard) | | \-+ Alternate name list (extension) | |-+ URL (standard) | |-+ Priority indicator (private extension) | | \-+ Digital signature (private) | |-+ Alternate source summary (standard) | \-+ Available ranges (standard) | . \-+ Estimated completion time (private extension) | |-+ Selective digital signature (private?) | \-+ Routing tags
I made that example up (clearly), but it shows how a hierarchical/tree model is perfectly suited to structured expressions of information, such as those we find in P2P (collections of query results, for example). You can approximate the same effect with fixed structures, but a true tree allows you to add anything anywhere anytime, enabling types of extensions that were not anticipated.
It Should Be Independently Extensible
If developer A creates a child packet which expresses support for peer to peer chat, developer B may want to support it. But developer B has a voice chat feature too, and wants to be able to detect it. Does developer B create a whole new packet to advertise their specific type of chat, even though it’s compatible? Why not re-use developer A’s, but modify it? Modifying someone else’s packet endangers compatibility. The solution is to create a “voice chat capability” child node and re-use A’s “chat capability” packet.
The tree structure allows the format of individual packets (or nodes in tree-speak) to be owned by developers, while allowing other developers to transparently extend them. Any party can invent new concepts and share them with other developers without (necessarily) losing control over them.
It Should Be International
It makes sense to design for international character set support from the beginning, even if not all clients currently support it. Shareaza 1.7 for example is still ANSI, but supports Unicode in the protocol and performs the necessary conversion. If Unicode protocol support is required from the word go, we’re free to implement it independently at leisure with no incompatibility problems.
It Should Probably Support IPv6
This is a new one based on the discussion the other day – I don’t know much about IPv6 but this is a good opportunity to do something about it.
Proposed Gnutella2 Protocol
The low-level Gnutella2 protocol used by the current Shareaza 1.7 betas is a cross between GGEP and vendor messages, without the (now unnecessary) overheads, and in a freeform tree structure.
Every Gnutella2 packet has a qualified name, length and other flags. It can contain a payload and/or one or more child packets. Child packets are identical to their parents in capability, i.e. you can nest packets to infinity. Multiple instances of a packet type can occur at any level (for example, lists of nodes, queries, hits, people, etc).
The concept can be compared to an XML document tree. The “packets” are elements, which can in turn contain zero or more child elements (packets). The payload of a packet is like the attributes of an XML element. However as we know, serializing XML has a lot of overhead due to all the naming, even in a compact binary form. The Gnutella2 packet structure makes a compromise: it names elements (packets), allowing them to be globally recognized and understood without knowledge of their format – and stores attributes as binary payloads, requiring knowledge of their content to parse them.
Thus the element (packet or child packet) is the finite unit of comprehension. This system provides an excellent tradeoff between format transparency and compactness.
Packets are encoded with a single leading control byte, followed by one or more bytes of packet length, followed by one or more bytes of packet name/ID, followed by zero or more child packets (framed the same way), followed by zero or more bytes of payload:
+---------+----/----+----/----+--- | Control | Length_ | Name___ | children and/or payload +---------+----/----+----/----+---
All packets can contain a payload only, children and a payload, children only, or nothing at all. The total length of the packet header cannot exceed 12 bytes and cannot be less than 3 bytes.
The Control Byte
The control byte is always non-zero. A zero control byte identifies the end of a stream of packets, and thus has special meaning. It should not be used in root packet streams (which do not end).
Control bytes have the following format:
Bit 7 < - > Bit 0 +----+----+----+----+----+----+----+----+ | Len_Len | Name_Len – 1 | // | // | CF | +----+----+----+----+----+----+----+----+
Len_Len is the number of bytes in the length field of the packet, which immediately follows the control byte. There are two bits here which means the length field can be up to 3 bytes long. Note that Len_Len cannot be zero, which protects the control byte from ever being zero.
Name_Len is the number of bytes in the packet name field MINUS ONE, which follows the packet length field. There are three bits here which means that packet names can be 1 to 8 bytes long inclusive.
The three least significant bits of the control byte are reserved for flags. The LSB is defined as the compound packet flag. If this bit is set, the packet contains one or more child packets. If not set, the packet contains only its own payload (or nothing).
I am considering using the next bit to flag packets included in the digital signature, allowing selective signing of some parts of a packet tree – i.e. one signature which covers certain parts only. This allows other parts to be changed, added and removed without disturbing the integrity of the signature, which allows for some neat routing tricks.
The Length Field
The length field immediately follows the control byte, and can be 1 to 3 bytes long. Length bytes are stored least significant first, which allows them to be read very easily on little-endian machines.
The length value includes the payload of this packet AND any child packets in their entirety. This is obviously needed so that the entire packet can be detected and acquired from a stream. The length does not include the header (control byte, length, and name). The length field precedes the name field to allow it to be read faster from a stream when acquiring packets.
The Name Field
The name field immediately follows the length bytes, and can be from 1 to 8 bytes long. It follows a similar convention to the GGEP naming standard – bytes can be any value, but are generally alphanumeric.
Public packet types which form the core of important services should be agreed upon, and can use short names consisting of capital letters. For example general-purpose keep-alive ping/pongs use the names “PI” and “PO” (note that these are very different to G1 ping/pongs).
Private packet types or extensions should be prefixed with a namespace equal to the four letter vendor code, followed by lowercase. For example “RAZAgrp4”.
Naming convention is most important in the root namespace, and at other common points in common public packet types.
Child packets are only present if the “compound packet bit” is set in the control byte. If set, there is one or more child packet immediately following the end of the header. These child packets are included in the total length of their parent (along with the payload, which follows the child packets).
Child packets are framed exactly the same way, with a control byte, length, name, children and/or payload. When the compound bit is set, the first child packet must exist. Existing child packets may also exist, and are read in sequentially in the same way that they are read from a root packet stream. The end of the child packet stream is signaled by the presence of a zero control byte, OR the end of the parent packet’s length (in which case there is no payload). Including a terminating zero control byte when there is no payload is still valid, but unnecessary.
Payload may exist whenever the length field is non-zero. However, if the compound bit is set, one or more child packets must be read before the payload is reached. If there is no packet left after the end of the last child, there is no payload.
VERY IMPORTANT: The compound packet bit MUST be checked on every packet. It should be done in low-level decoding code to avoid accidental omission. Do not assume that a packet will not have children – it might not now, but no packets are sterile. Anything could be augmented. If you are not interested in children, skip them (which is easy, you don’t even need to recurse through their children).
The format of the payload is defined by the packet type, and may include various types of binary data. Payloads may or may not be a fixed length for a given packet type.
Multi-byte integers should always be expressed little-endian first, as the vast majority of users are using little-endian systems, and that is unlikely to change.
Strings can be expressed in Gnutella1 style, i.e. with a terminating null character. Alternatively the string is said to terminate if it reaches the end of the payload, avoiding an unnecessary null.
ALL STRINGS may be ANSI or Unicode. All software supporting the protocol must be prepared to accept either, performing appropriate conversion if necessary. Unicode software everywhere would be nice, but that’s easier said than done. Note that if you convert Unicode back to local ANSI, don’t retransmit it. Let the Unicode pass through.
ANSI strings are stored as consecutive bytes followed by a null byte (or end of packet). Unicode strings are stored as a 0xFF (255) marker byte, followed by 16-bit Unicode characters in little-endian form. The string is terminated by 0x0000 or the end of the payload. ALWAYS check for 0xFF and convert if not using Unicode strings internally.
In generating strings it is acceptable to check for extended characters and fall back to ANSI if none are present, to conserve space.
A simple ping (“PI”) packet with no payload:
Packet Length: 0 Packet Name: PI Control Byte: 01 001 000 = 0x48 (lenlen=1,namelen=2,cmp=0) Packet: 48 00 ‘P’ ‘I’
A pong packet containing a ping child (nonsensical but good example):
Packet Length: 4 Packet Name: PO Control Byte: 01 001 001 = 0x49 (lenlen=1,namelen=2,cmp=1) Packet: 49 04 ‘P’ ‘O’ 48 00 ‘P’ ‘I’
Now, what if it contains two ping children:
Packet: 49 08 ‘P’ ‘O’ 48 00 ‘P’ ‘I’ 48 00 ‘P’ ‘I’
Add a payload of “test” to the pong:
Packet: 49 0D ‘P’ ‘O’ 48 00 ‘P’ ‘I’ 48 00 ‘P’ ‘I’ 00 ‘t’ ‘e’ ‘s’ ‘t’
(Note the 00 terminating control byte after the children but before the payload)
Simple Packet Decoder Logic
The following is a simple packet header decoder in C that will work on any root or child packet. It assumes that the packet is present or that its input functions will block. In a production situation there would of course need to be more failure cases (data not available, malformatted packet). In particular, child packet bounds must not exceed the extent of the parent.
BYTE nInput = ReadNextByte(); if ( nInput == 0 ) return S_NO_MORE_CHILDREN; BYTE nLenLen = ( nInput & 0xC0 ) >> 6; BYTE nTypeLen = ( nInput & 0x38 ) >> 3; BYTE nFlags = ( nInput & 0x07 ); DWORD nPacketLength = 0; ReadBytes( (BYTE*)&nPacketLength, nLenLen ); CHAR szType; ReadBytes( (BYTE*)szType, nTypeLen + 1 ); szType[ nTypeLen + 1 ] = 0; BOOL bIsCompound = ( nFlags & 0x01 ) ? TRUE : FALSE;
Higher Level Structures
This document has covered the basic packet framing and structuring stuff only, without going into the higher level constructs to support discovery, identification, searching, etc. I will document that soon for further discussion, as there is definitely some cool stuff in there which takes advantage of the tree packet structures to deliver flexible and extensible services.
November 20, 2002