Client for Downloading

From Gnutella2
This is not a core part of the Gnutella2 architecture, and is discussed in depth elsewhere.


File transfer is basically the same as in the original Gnutella protocol. See the documentation [http://www.the-gdf.org/wiki/index.php?title=File_Transfer_%28was:_Requesting_a_download%29 here].
Source - [[http://www.the-gdf.org/wiki/index.php?title=File_Transfer_%28was:_Requesting_a_download%29]] version of 2005/08/22
Original Source - [[http://rfc-gnutella.sourceforge.net/src/rfc-0_6-draft.html Latest draft]]
'''2.8.1 Normal File Transfer'''
Once a servent receives a QueryHit message, it may initiate the
direct download of one of the files described by the message's Result
Set. Files are downloaded out-of-network, i.e. a direct connection
between the source and target servent is established in order to
perform the data transfer. File data is never transferred over the
Gnutella network.
The file download protocol is HTTP. It is RECOMMENDED to use HTTP 1.1
(RFC 2616), but HTTP 1.0 (RFC 1945) can be used instead. The full
specifications are available in those RFCs; only the basics are
covered here. The following examples assume that HTTP 1.1 is used.
The servent initiating the download sends a request string of the
following form to the target server:
    GET /get/<File Index>/<File Name> HTTP/1.1<cr><lf>
    User-Agent: Gnutella<cr><lf>
    Host: 123.123.123.123:6346<cr><lf>
    Connection: Keep-Alive<cr><lf>
    Range: bytes=0-<cr><lf>
    <cr><lf>
where <File Index> and <File Name> are one of the File Index/File
Name pairs from a QueryHit message's Result Set. For example, if the
Result Set from a QueryHit message contained the entry
    File Index: 2468
    File Size: 4356789
    File Name: Foobar.mp3
then a download request for the file described by this entry would be
initiated as follows:
    GET /get/2468/Foobar.mp3 HTTP/1.1<cr><lf>
    User-Agent: Gnutella<cr><lf>
    Host: 123.123.123.123:6346<cr><lf>
    Connection: Keep-Alive<cr><lf>
    Range: bytes=0-<cr><lf>
    <cr><lf>
Servents MUST encode the filename in GET requests according to the
standard URL/URI encoding rules. Servents MUST accept URL-encoded GET
requests. Since some old servents do not support encoding, servents
SHOULD accept non-encoded requests and MAY retry with a non-encoded
request if a 404 Not Found error is returned for the initial request.
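The request format and the encoding rule above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `build_get_request`, its parameters, and the `ExampleServent/0.1` User-Agent value are assumptions made for this example.

```python
from urllib.parse import quote


def build_get_request(file_index, file_name, host, port, start=0, end=None):
    """Build an HTTP/1.1 download request for one QueryHit result entry."""
    # Percent-encode the file name; safe="" also encodes "/" inside a name.
    path = "/get/{}/{}".format(file_index, quote(file_name, safe=""))
    # An open-ended range ("bytes=0-") asks for everything from `start` on.
    byte_range = "bytes={}-{}".format(start, "" if end is None else end)
    lines = [
        "GET {} HTTP/1.1".format(path),
        "User-Agent: ExampleServent/0.1",  # hypothetical value
        "Host: {}:{}".format(host, port),
        "Connection: Keep-Alive",
        "Range: {}".format(byte_range),
        "",  # the two empty entries produce the final <cr><lf><cr><lf>
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

Calling `build_get_request(2468, "Foobar.mp3", "123.123.123.123", 6346)` yields the same request line, Host, Connection and Range headers as the example above; a name such as `Foo bar.mp3` would be sent as `Foo%20bar.mp3`.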
The Host header is required by HTTP 1.1 and specifies what address
you have connected to. It is usually not used by the receiving
servent, but its presence is required by the protocol.
The allowable values of the User-Agent string are defined by the HTTP
standard. Servent developers cannot make any assumptions about the
value here. The use of 'Gnutella' is for illustration purposes only.
The server receiving this download request responds with HTTP 1.1
compliant headers such as
    HTTP/1.1 200 OK<cr><lf>
    Server: Gnutella<cr><lf>
    Content-type: application/binary<cr><lf>
    Content-length: 4356789<cr><lf>
    <cr><lf>
The file data then follows. The downloader should read exactly the
number of bytes given in the Content-length header of the server's
HTTP response.
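A minimal sketch of the receiving side, assuming the connection is wrapped in a binary file-like object (for example one obtained from `socket.makefile("rb")`). The name `read_response` is hypothetical, and the sketch ignores chunked encoding and responses without a Content-length header.

```python
def read_response(stream):
    """Read the status line, the headers, and exactly Content-length bytes."""
    status_line = stream.readline().decode("latin-1").rstrip("\r\n")
    parts = status_line.split(" ", 2)  # e.g. ["HTTP/1.1", "200", "OK"]
    code = int(parts[1])
    headers = {}
    while True:
        line = stream.readline().decode("latin-1").rstrip("\r\n")
        if not line:  # an empty line ends the header block
            break
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    length = int(headers["content-length"])
    body = b""
    while len(body) < length:
        chunk = stream.read(length - len(body))
        if not chunk:
            raise EOFError("connection closed before the full body arrived")
        body += chunk
    return code, headers, body
```

Reading no more than `length` bytes matters on persistent connections, because anything after the body belongs to the next response.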
Note: Servents SHOULD use HTTP version 1.1 for file transfer, but
some support only HTTP version 1.0. Servents MUST accept incoming
HTTP/1.0 requests, and SHOULD retry with HTTP/1.0 if the remote host
is not HTTP/1.1 compliant.
Though full HTTP/1.1 support is strongly RECOMMENDED, some servents
lack it. The features most important for Gnutella, range requests and
Persistent Connections, MUST be supported, although some old servents
do not support them.
Range requests are of the form
    GET /get/2468/Foobar.mp3 HTTP/1.1<cr><lf>
    User-Agent: Gnutella<cr><lf>
    Host: 123.123.123.123:6346<cr><lf>
    Connection: Keep-Alive<cr><lf>
    Range: bytes=4932766-5066083<cr><lf>
    <cr><lf>
Note that the Range header does not have to specify both start and
end positions. The response is of the form
    HTTP/1.1 206 Partial Content<cr><lf>
    Server: Gnutella<cr><lf>
    Content-Type: audio/mpeg<cr><lf>
    Content-Length: 133318<cr><lf>
    Content-Range: bytes 4932766-5066083/5332732<cr><lf>
    <cr><lf>
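In the response above, Content-Length equals the last byte position minus the first, plus one (5066083 - 4932766 + 1 = 133318). A downloader can check this before accepting the data. The sketch below assumes the `bytes first-last/total` form shown above; the helper names are hypothetical.

```python
import re

# Matches e.g. "bytes 4932766-5066083/5332732"; the total may also be "*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value):
    """Return (first, last, total) from a Content-Range header value."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError("unsupported Content-Range: {!r}".format(value))
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    if last < first or (total is not None and last >= total):
        raise ValueError("inconsistent Content-Range: {!r}".format(value))
    return first, last, total


def range_matches_length(content_range, content_length):
    """Check that the byte range is inclusive and matches Content-Length."""
    first, last, _ = parse_content_range(content_range)
    return last - first + 1 == content_length
```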
The Connection header tells the remote host whether the connection
should be closed when the transfer is finished. "Connection: close"
means that the connection MUST be closed after the transfer.
"Connection: Keep-Alive" or no Connection header means the connection
MUST be kept open. The client MAY then issue another request for
another range or another file. The request MAY be sent before the
previous transfer is finished. Persistent Connections are described in
section 8.1 of RFC 2616.
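The rule above reduces to a small predicate. The sketch assumes the headers have already been parsed into a dictionary with lower-case names; the function name is hypothetical.

```python
def keep_connection_open(headers):
    """Apply the rule above: only "Connection: close" ends the connection."""
    # The header value may be a comma-separated list of tokens.
    tokens = [t.strip().lower() for t in headers.get("connection", "").split(",")]
    return "close" not in tokens
```

This follows the HTTP/1.1 default described in the text. A peer that only speaks HTTP/1.0 closes the connection by default, so an implementation that falls back to HTTP/1.0 would likely need a separate check.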
Headers unknown to the servent MUST be quietly ignored.
Servents SHOULD NOT attempt to download multiple files from the same
source at once. Files SHOULD be locally queued instead.
Servents are also RECOMMENDED to use and understand the HTTP extension
described in HUGE (see Appendix 1).
'''2.8.2 Firewalled servents'''
[TODO: rewrite this]
It is not always possible to establish a direct connection to a
Gnutella servent in an attempt to initiate a file download. The
servent may, for example, be behind a firewall that does not permit
incoming connections to its Gnutella port. If a direct connection
cannot be established, the servent attempting the file download may
request that the servent sharing the file "push" the file instead. A
servent can request a file push by routing a Push request back to the
servent that sent the QueryHit message describing the target file.
The servent that is the target of the Push request (identified by the
Servent Identifier field of the Push message) SHOULD, upon receipt
of the Push message, attempt to establish a new TCP/IP connection
to the requesting servent (identified by the IP Address and Port
fields of the Push message). If this direct connection cannot be
established, then it is likely that the servent that issued the Push
request is itself behind a firewall. In this case, file transfer
cannot take place by the means described in this document.
'''2.8.2.1 Usage of Push Messages'''
A servent may send a Push message if it receives a QueryHit
message from a servent that doesn't support incoming connections.
This might occur when the servent sending the QueryHit message is
behind a firewall (see the QueryHit EQHD in
2.4 Standard Message Architecture).  When a servent receives a Push
message, it SHOULD act upon the push request if and only if the
Servent_Identifier field contains the value of its servent identifier.
The Message_Id field in the Message Header of the Push message SHOULD
NOT contain the same value as that of the associated QueryHit message,
but SHOULD contain a new value generated by the servent's Message_Id
generation algorithm.
Push messages are forwarded back to the originator of the QueryHit
message using the Servent Identifier value.  This means multiple Push
messages can have the same Servent Identifier.  Push messages MUST
only be considered duplicates if the Message ID in the header is
the same.  Since Push messages are not broadcast, duplicate
messages should be very rare.
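A minimal sketch of the two checks described in this section: duplicate detection keyed on the Message ID only, and acting on a Push only when the Servent Identifier is the local one. The class and method names are hypothetical, and the unbounded set of seen IDs is a simplification; a real servent would expire old entries.

```python
class PushFilter:
    """Decide whether an incoming Push message should be acted upon."""

    def __init__(self, own_servent_id):
        self.own_servent_id = own_servent_id
        self.seen_message_ids = set()

    def should_act(self, message_id, servent_id):
        # Duplicates are recognised by Message ID alone; several distinct
        # Push messages may legitimately carry the same Servent Identifier.
        if message_id in self.seen_message_ids:
            return False
        self.seen_message_ids.add(message_id)
        # Act only if the Push is addressed to this servent.
        return servent_id == self.own_servent_id
```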
'''2.8.2.2 Pushing the file to the downloader'''
If a direct connection can be established from the firewalled servent
to the servent that initiated the Push request, the firewalled
servent should immediately send the following:
    GIV <File Index>:<Servent Identifier>/<File Name><lf><lf>
where <File Index> and <Servent Identifier> are the values of the
File Index and Servent Identifier fields respectively from the Push
request received, and <File Name> is the name of the file in the
local file table whose file index number is <File Index>. The File
Name MAY be URL/URI encoded. The servent that receives the GIV (the
servent that wants to receive a file) SHOULD ignore the File Index
and File Name, and request the file it wants to download. The
servent that sent the GIV MUST allow the client to request any
file, and not just the one specified in the Push message.  The GET
request and the remainder of the file download process are identical
to those described in section 2.8.1 (Normal File Transfer) above.
The <Servent Identifier> is formatted as hexadecimal, and must
be read case-insensitively.  For instance:
    GIV 36:809BC12168A1852CFF5D7A785833F600/Foo.txt<lf><lf>
    GIV 124:d51dff817f895598ff0065537c09d503/Bar.html<lf><lf>
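A GIV line of this shape can be parsed as sketched below. The function name `parse_giv` is hypothetical, and the 32-hex-digit (16-byte) length check is an assumption drawn from the two examples above.

```python
from urllib.parse import unquote


def parse_giv(line):
    """Split a GIV line into (file_index, servent_id_bytes, file_name)."""
    text = line.decode("latin-1") if isinstance(line, bytes) else line
    text = text.rstrip("\r\n")  # drop the trailing <lf><lf>
    if not text.startswith("GIV "):
        raise ValueError("not a GIV line")
    head, slash, file_name = text[4:].partition("/")
    index_text, colon, servent_hex = head.partition(":")
    if not slash or not colon or len(servent_hex) != 32:
        raise ValueError("malformed GIV line")
    # bytes.fromhex accepts upper- and lower-case digits alike, which gives
    # the case-insensitive reading required above.
    servent_id = bytes.fromhex(servent_hex)
    if len(servent_id) != 16:
        raise ValueError("malformed servent identifier")
    # The file name MAY be URL-encoded; decoding a plain name is harmless
    # unless it contains a literal "%" sequence.
    return int(index_text), servent_id, unquote(file_name)
```

Since the receiver SHOULD ignore the File Index and File Name, the field that matters in practice is the Servent Identifier, which lets the downloader match the incoming connection to the Push it sent.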
If the TCP connection is lost during a Push-initiated file transfer,
it is strongly RECOMMENDED that the servent that initiated the TCP
connection (the servent providing the file) attempt to reconnect.
This is important, since the servent receiving the file might not be
able to get another Push message to the servent providing the file.
'''2.8.3 Busy Servents'''
Servents whose upload bandwidth is already saturated with transfers
MAY reject a download request by returning the 503 response code.
Servents MAY simply have a fixed number of available upload slots,
but SHOULD use a system that utilizes upload bandwidth better.
Allowing new downloads as long as 20% of total upload bandwidth is
unused is one possibility.
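The 20% rule mentioned above could look like this. It is only a sketch of that one suggestion: the function name, the parameters, and the choice of bytes per second as the unit are assumptions.

```python
def can_accept_upload(used_bps, total_bps, free_percent=20):
    """Accept a new upload while at least free_percent of capacity is unused."""
    if total_bps <= 0:
        return False
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return (total_bps - used_bps) * 100 >= total_bps * free_percent
```

A servent that fails this check would answer the download request with the 503 response code.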
Busy servents receiving a Push message SHOULD connect to the host
requesting a push, and return the 503 Busy code when the remote host
has requested the file.
Servents MAY try requesting a download again when the servent
providing the file returns the busy code, but MUST NOT do so more
often than once per minute and file source. That means a servent
MUST NOT open new connections to a remote host more than once per
minute. Servents SHOULD prevent other servents that break the above
rule from increasing their chances to download a file. This can for
example be achieved by refusing any connection attempts from a
particular host if a download request was denied less than 50
seconds ago, or by adding hosts that request too often to a ban list.
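The uploader-side check described above (refusing a host whose request was denied less than 50 seconds ago) can be sketched as follows. The class name and the injectable `clock` parameter are assumptions for this example; the clock is injectable only to make the logic testable.

```python
import time


class BusyGate:
    """Refuse connections from hosts that retry too soon after a denial."""

    def __init__(self, min_interval=50.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.denied_at = {}  # host -> time of the most recent denial

    def record_denial(self, host):
        """Remember that `host` was just answered with the busy code."""
        self.denied_at[host] = self.clock()

    def allow_connection(self, host):
        """Return False if `host` was denied less than min_interval ago."""
        last = self.denied_at.get(host)
        return last is None or self.clock() - last >= self.min_interval
```

The same structure, with a 60-second interval and keyed by file source, could serve on the downloader side to enforce the once-per-minute retry limit.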
Servents MAY use queuing systems to allow downloaders to stand in
queue to download a file; see 4.5 Active Queuing Extension.
If a transfer is interrupted, the serving servent SHOULD keep the
allocated slot/bandwidth reserved for at least one minute. The
downloader would then be allowed to reconnect and resume the
transfer.
'''2.8.4 Sharing'''
Servents that are able to download files MUST also be able to share
files with others. Servents SHOULD encourage users to share files.
Servents SHOULD attempt to prevent programs that are not able to
share files from downloading files. This means that servents SHOULD
NOT allow uploads to web browsers and download accelerators. The
User-Agent HTTP header tells what program the remote host is running.
Many servents return an HTML page instead, telling the user how
Gnutella works, and where to get a servent.
Servents MUST NOT give precedence to other users using the same
servent. They MUST answer Query messages and accept file download
requests using the same rules for all servents. Servents MAY,
however, attempt to block servents that do not follow the rules in
this protocol in a way that seriously hurts others' experience of the
Gnutella network.
Servents SHOULD, by default, share the directory where downloaded
files are placed. Servents SHOULD also share newly downloaded files
without waiting for the servent to be restarted. Servents SHOULD
avoid changing the index numbers of shared files.
Servents MUST NOT share partially downloaded (incomplete) files as if
they were complete. This is often prevented by using a separate
directory for incomplete downloads. When the download finishes, the
file is moved to the downloads directory (that SHOULD be shared).
Partial files MAY be shared, but this SHOULD be done in a way that
makes it clear to other servents that the file is incomplete.
The transfer of partial files is described in the
[[Partial File Sharing Protocol]] (PFSP).

Latest revision as of 18:06, 7 September 2005
