The previous article covered the tasks needed to build an I2P router capable of participating in the network: interacting with other routers over the regular Internet, building tunnels of various types, and collecting information about other network nodes. Important as these tasks are, an I2P client that only performs the functions of a router is, from the user’s point of view, a “thing in itself”, since it does nothing the user would find interesting. This article is devoted to the application-level protocols designed for transmitting user data over the I2P network.
While the official documentation covers the workings of an I2P router more or less logically, the application protocols are a mishmash of different ideas, which boils down to the notion that everyone is free to implement their own protocol for their own application, using destinations as addresses. This does not bring us any closer to implementing our own client, however, since existing network resources already use certain protocols and any new implementation must be able to work with them. Take the description of “garlic” as an example: from that page you can only learn how the “garlic” should be packaged and encrypted. What is actually transmitted between Alice and Bob, and how Bob knows that the answer should go to Alice, remains completely unclear. The page also states that, to confirm delivery, one of the cloves of the “garlic” carries a DeliveryStatus message whose delivery instructions name the sending router, thereby revealing the router the sender is sitting on. Of course, this is not true. Unfortunately, the only way to find out how things really work was to analyze the traffic generated by the official Java client.
“Garlic” data transfer
Let’s look at this issue in more detail, slightly modifying the original example to make it more practical. Suppose Vasya Pupkin visits the Flibusta website, which is now accessible only via I2P, and at the same time runs his own website that offends someone’s feelings, which is why it is hosted in I2P. For this, Vasya runs a router that constantly maintains at least one outbound and one inbound tunnel. Vasya created a separate destination for his website and published its address everywhere. Vasya already knows Flibusta’s address, and his router knows its LeaseSet and can send a message there. The only problem is that Vasya needs to receive a response from Flibusta, and for that Flibusta needs to know Vasya’s address. For this purpose, another destination is created on Vasya’s router; it serves as the return address for all connections initiated by Vasya, not only to Flibusta but to all other sites as well. If necessary, several such return addresses can be created, but then LeaseSets with different, non-overlapping sets of tunnels have to be built as well.
To transfer data between destinations, the I2NP Garlic message is used. The messages are encrypted between routers with a separate pair of encryption keys, whose public key is published in the LeaseSet. This key does not coincide with the router’s public key and, unlike the latter, is generated anew each time the router starts. Inside, the message consists of cloves, each of which contains an I2NP message and instructions for its delivery (delivery instructions) of one of 4 types:
- Local. The message is intended for the router itself. Typically someone's LeaseSet.
- Destination. The message is intended for a destination hosted on the router. This is the only way to send data to a destination.
- Tunnel. The message is to be sent to the specified inbound tunnel starting at the specified router. Used to confirm delivery of the “garlic”.
- Router. The message is intended to be sent to another router. It is extremely dangerous from a security point of view if the specified router is different from your own or a trusted one. Never seen in practice.
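The four delivery types above can be modelled as a small enumeration. The sketch below is only an illustration: the numeric values and the position of the type in bits 6-5 of the delivery instructions flag byte are taken from my reading of the I2NP specification, not from this article, and the names `DeliveryType`, `REQUIRED_FIELDS` and `delivery_type_from_flag` are hypothetical.

```python
from enum import IntEnum


class DeliveryType(IntEnum):
    # Numeric values are an assumption (I2NP specification), not from the article.
    LOCAL = 0        # for the receiving router itself, e.g. a LeaseSet
    DESTINATION = 1  # for a destination hosted on the receiving router
    ROUTER = 2       # to be forwarded to another router
    TUNNEL = 3       # to be forwarded into an inbound tunnel at a given router


# Addressing fields each delivery type needs, following the list above.
REQUIRED_FIELDS = {
    DeliveryType.LOCAL: (),
    DeliveryType.DESTINATION: ("destination_hash",),
    DeliveryType.ROUTER: ("router_hash",),
    DeliveryType.TUNNEL: ("router_hash", "tunnel_id"),
}


def delivery_type_from_flag(flag: int) -> DeliveryType:
    """Extract the delivery type, assumed to occupy bits 6-5 of the flag byte."""
    return DeliveryType((flag >> 5) & 0x03)
```

A receiver would typically switch on the decoded type and refuse or ignore the Router type unless the target is itself or a trusted router.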
As a rule, a “garlic” consists of two cloves. The first is an I2NP DeliveryStatus message carrying the message number of the “garlic” itself, with delivery to one of the inbound tunnels of the sender’s router. It confirms that the whole “garlic” has reached its intended destination. The second is an I2NP Data message, which contains the transmitted data itself, with delivery to the destination. Sometimes there is a third clove: the LeaseSet of the sender’s destination (not of the router). In our example, this is the LeaseSet of Vasya’s return address. This clove is present in two cases: in the very first message and when the LeaseSet changes. It lets the receiving router know how to send a response to the sender; otherwise it would have to request the corresponding LeaseSet from the floodfill routers, which most likely do not have it, as is the case with Vasya’s return address. The LeaseSet of Vasya’s secret site, by contrast, will be present there.
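The rule for composing a typical “garlic” can be sketched as follows. This is a simplified model, not the wire format: `Clove` and `build_cloves` are hypothetical names, the payloads are placeholders, and the LeaseSet is given Local delivery on the assumption that it is meant for the receiving router itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clove:
    delivery: str      # "tunnel", "destination" or "local"
    message_type: str  # name of the I2NP message carried by the clove
    payload: bytes     # placeholder body, not the real wire encoding


def build_cloves(msg_id: int, data: bytes, lease_set: Optional[bytes],
                 first_message: bool, lease_set_changed: bool) -> List[Clove]:
    """Assemble the two mandatory cloves and, when needed, the LeaseSet clove."""
    cloves = [
        # Confirmation that travels back through the sender's inbound tunnel.
        Clove("tunnel", "DeliveryStatus", msg_id.to_bytes(4, "big")),
        # The user data, delivered to the target destination.
        Clove("destination", "Data", data),
    ]
    # The sender's LeaseSet goes along only with the very first message
    # or after the LeaseSet has changed.
    if lease_set is not None and (first_message or lease_set_changed):
        cloves.append(Clove("local", "LeaseSet", lease_set))
    return cloves
```

For example, `build_cloves(1, b"GET /", b"<leaseset>", True, False)` yields three cloves, while later messages on the same session yield two.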
The question is how Flibusta knows that the answer should be sent to Vasya, and not to Petya or anyone else contacting it at the same moment. Even if it were known that the “garlic” came from Vasya’s router (which, of course, is not the case), this would not help much, since Vasya’s router hosts, in addition to the return address, his website and possibly much more. It follows that this information must be contained in the data carried by the Data message, which points to a poor design of the entire system, since it prevents protocols of different levels from being isolated from each other. In other words, the destination address has to be present both in the data itself and in the protocol that transmits this data.
I2CP protocol data
Originally, the I2CP protocol was designed exclusively for exchange between applications and the router; its messages were not supposed to enter the I2P network itself. However, the contents of the SendMessageMessage and MessagePayloadMessage messages are transmitted over the network inside I2NP Data messages, as gzip-compressed data with a specially modified header of the following form:
- 3 bytes: gzip prefix 0x1F 0x8B 0x08
- 1 byte: gzip flags
- 2 bytes: TCP or UDP sender port
- 2 bytes: TCP or UDP destination port
- 1 byte: additional gzip flags
- 1 byte: protocol type (6 = streaming, 17 = datagram, 18 = raw)
Thus, every Data message transmitted via “garlic” always begins with 0x1F 0x8B 0x08; it is first decompressed with gzip and then, depending on the protocol type, processed by the appropriate implementation.
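A minimal sketch of that processing step is shown below. It assumes the 10-byte header layout listed above; the byte order of the two port fields is not stated in this article, so big-endian is an assumption here. `parse_payload`, `make_payload` and `I2cpPayload` are hypothetical names, and `make_payload` exists only to produce a sample buffer.

```python
import gzip
import struct
from typing import NamedTuple

GZIP_PREFIX = b"\x1f\x8b\x08"
PROTOCOLS = {6: "streaming", 17: "datagram", 18: "raw"}


class I2cpPayload(NamedTuple):
    source_port: int
    dest_port: int
    protocol: str
    data: bytes


def parse_payload(buf: bytes) -> I2cpPayload:
    """Read ports and protocol from the modified gzip header, then decompress."""
    if len(buf) < 10 or buf[:3] != GZIP_PREFIX:
        raise ValueError("not a gzip-wrapped payload")
    # Bytes 4-7 hold the two ports; big-endian is an assumption.
    source_port, dest_port = struct.unpack_from(">HH", buf, 4)
    protocol = PROTOCOLS.get(buf[9])
    if protocol is None:
        raise ValueError(f"unknown protocol type {buf[9]}")
    # A gzip decoder ignores the repurposed header bytes.
    return I2cpPayload(source_port, dest_port, protocol, gzip.decompress(buf))


def make_payload(data: bytes, source_port: int, dest_port: int,
                 protocol: int) -> bytes:
    """Hypothetical helper: compress data and patch the header fields."""
    buf = bytearray(gzip.compress(data))
    struct.pack_into(">HH", buf, 4, source_port, dest_port)
    buf[9] = protocol
    return bytes(buf)
```

The caller would then hand `data` to the streaming, datagram or raw handler according to `protocol`.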
Streaming protocol
The streaming protocol is similar to TCP and guarantees ordered delivery of data. Messages consist of a header and the actual data. The message type is determined by the flags field; as in TCP, there are SYN/FIN flags for establishing and terminating a connection. Unlike TCP, a message can also carry up to 255 NACKs, the numbers of missed messages that must be resent, which is more typical of packet protocols and requires a more complex implementation. The header also contains the usual TCP-like fields: 4-byte source and destination ports, called streams, as well as sequence and acknowledgment numbers.
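To make the header fields concrete, here is a sketch of a parser. The exact field order, the field widths other than the 4-byte stream identifiers, and the presence of a resend delay and an options block are assumptions based on my reading of the streaming specification, not something this article states; `StreamingHeader` and `parse_streaming_packet` are hypothetical names.

```python
import struct
from typing import NamedTuple, Tuple


class StreamingHeader(NamedTuple):
    send_stream_id: int
    receive_stream_id: int
    sequence_num: int
    ack_through: int
    nacks: Tuple[int, ...]
    resend_delay: int
    flags: int
    options: bytes
    payload: bytes


def parse_streaming_packet(buf: bytes) -> StreamingHeader:
    """Parse a streaming packet under the assumed big-endian layout."""
    send_id, recv_id, seq, ack, nack_count = struct.unpack_from(">IIIIB", buf, 0)
    offset = 17
    # The one-byte count limits the list to 255 NACKs of 4 bytes each.
    nacks = struct.unpack_from(f">{nack_count}I", buf, offset)
    offset += 4 * nack_count
    resend_delay, flags, option_size = struct.unpack_from(">BHH", buf, offset)
    offset += 5
    options = buf[offset:offset + option_size]
    if len(options) != option_size:
        raise ValueError("truncated options block")
    payload = buf[offset + option_size:]
    return StreamingHeader(send_id, recv_id, seq, ack, nacks,
                           resend_delay, flags, options, payload)
```

A truncated buffer raises `struct.error` or `ValueError`, so a caller can drop malformed packets in one place.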
When a connection is established, messages are exchanged with the SYNCHRONIZE flag set, and the FROM_INCLUDED and SIGNATURE_INCLUDED flags must be set as well. The first means that the full 387-byte I2P address is present in the header; the second means that the entire message is signed with the private key of the sender’s I2P address. In this way, the parties learn each other’s I2P addresses, and signature verification ensures that these addresses are genuine. In other words, once the connection is established, Vasya, connecting to Flibusta’s address, can be sure that it really is Flibusta, and Flibusta learns Vasya’s return address.
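The flag requirement for connection setup can be expressed as a simple check. The bit values below are assumptions taken from my reading of the streaming specification rather than from this article, and `is_acceptable_syn` is a hypothetical name. Verifying the signature itself is out of scope for this sketch.

```python
# Flag bit values are an assumption (streaming specification), not from the article.
FLAG_SYNCHRONIZE = 0x0001
FLAG_SIGNATURE_INCLUDED = 0x0008
FLAG_FROM_INCLUDED = 0x0020

REQUIRED_WITH_SYN = FLAG_FROM_INCLUDED | FLAG_SIGNATURE_INCLUDED


def is_acceptable_syn(flags: int) -> bool:
    """True if the packet opens a connection and carries both address and signature."""
    if not flags & FLAG_SYNCHRONIZE:
        return False
    return (flags & REQUIRED_WITH_SYN) == REQUIRED_WITH_SYN
```

A receiver would reject a SYNCHRONIZE packet for which this returns `False` before even looking at the sender’s address.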
Thus, the streaming protocol interface can be implemented as regular sockets, which allows I2P to be used in network applications with minimal changes.