Your browser doesn't support the features required by impress.js, so you are presented with a simplified version of this presentation.

For the best experience please use the latest Chrome, Safari or Firefox browser.

MTProto

Links

DOCS

MTProto Mobile Protocol

This page deals with the basic layer of MTProto encryption used for Cloud chats (server-client encryption)

General Description

General Description

The protocol is designed for access to a server API from applications running on mobile devices. It must be emphasized that a web browser is not such an application.

The protocol is subdivided into three virtually independent components:

  • High-level component (API query language): defines the method whereby API queries and responses are converted to binary messages
  • Cryptographic (authorization) layer: defines the method by which messages are encrypted prior to being transmitted through the transport protocol.
  • Transport component: defines the method for the client and the server to transmit messages over some other existing network protocol (such as, http, https, tcp, udp).

Note 1
Each plaintext message to be encrypted in MTProto always contains the following data to be checked upon decryption in order to make the system robust against known problems with the components:

  • server salt (64-Bit)
  • session id
  • message sequence number
  • message length
  • time

Brief Component Summary

High-Level Component (RPC Query Language/API)

High-Level Component (RPC Query Language/API)

From the standpoint of the high-level component, the client and the server exchange messages inside a session. The session is attached to the client device (the application, to be more exact) rather than a specific http/https/tcp connection. In addition, each session is attached to a user key ID by which authorization is actually accomplished.

Several connections to a server may be open; messages may be sent in either direction through any of the connections (a response to a query is not necessarily returned through the same connection that carried the original query, although most often, that is the case; however, in no case can a message be returned through a connection belonging to a different session). When the UDP protocol is used, a response might be returned by a different IP address than the one to which the query had been sent.

There are several types of messages:

  • RPC calls (client to server): calls to API methods
  • RPC responses (server to client): results of RPC calls
  • Message received acknowledgment (or rather, notification of status of a set of messages)
  • Message status query
  • Multipart message or container (a container that holds several messages; needed to send several RPC calls at once over an HTTP connection, for example; also, a container may support gzip).

From the standpoint of lower level protocols, a message is a binary data stream aligned along a 4 or 16-byte boundary. The first several fields in the message are fixed and are used by the cryptographic/authorization system.

Each message, either individual or inside a container, consists of a message identifier (64 bits, see below), a message sequence number within a session (32 bits), the length (of the message body in bytes; 32 bits), and a body (any size which is a multiple of 4 bytes). In addition, when a container or a single message is sent, an internal header is added at the top (see below), then the entire message is encrypted, and an external header is placed at the top of the message (a 64-bit key identifier and a 128-bit message key).

A message body normally consists of a 32-bit message type followed by type-dependent parameters. In particular, each RPC function has a corresponding message type. For more detail, see Binary Data Serialization, Mobile Protocol: Service Messages.

All numbers are written as little endian. However, very large numbers (2048-bit) used in RSA and DH are written in the big endian format because that is what the OpenSSL library does.

The protocol’s principal drawback is that an intruder passively intercepting messages and then somehow appropriating the authorization key (for example, by stealing a device) will be able to decrypt all the intercepted messages post factum. This probably is not too much of a problem (by stealing a device, one could also gain access to all the information cached on the device without decrypting anything); however, the following steps could be taken to overcome this weakness:

  • Session keys generated using the Diffie-Hellman protocol and used in conjunction with the authorization and the message keys to select AES parameters. To create these, the first thing a client must do after creating a new session is send a special RPC query to the server (“generate session key”) to which the server will respond, whereupon all subsequent messages within the session are encrypted using the session key as well.

  • Protecting the key stored on the client device with a (text) password; this password is never stored in memory and is entered by a user when starting the application or more frequently (depending on application settings).

  • Data stored (cached) on the user device can also be protected by encryption using an authorization key which, in turn, is to be password-protected. Then, a password will be required to gain access even to those data.

Authorization and Encryption

Authorization and Encryption

Prior to a message (or a multipart message) being transmitted over a network using a transport protocol, it is encrypted in a certain way, and an external header is added at the top of the message which is: a 64-bit key identifier (that uniquely identifies an authorization key for the server as well as the user) and a 128-bit message key. A user key together with the message key defines an actual 256-bit key which is what encrypts the message using AES-256 encryption. Note that the initial part of the message to be encrypted contains variable data (session, message ID, sequence number, server salt) that obviously influences the message key (and thus the AES key and iv). The message key is defined as the 128 lower-order bits of the SHA1 of the message body (including session, message ID, etc.). Multipart messages are encrypted as a single message.

For a technical specification, see Mobile Protocol: Detailed Description

The first thing a client application must do is create an authorization key which is normally generated when it is first run and almost never changes.

Time Synchronization

Time Synchronization

If client time diverges widely from server time, a server may start ignoring client messages, or vice versa, because of an invalid message identifier (which is closely related to creation time). Under these circumstances, the server will send the client a special message containing the correct time and a certain 128-bit salt (either explicitly provided by the client in a special RPC synchronization request or equal to the key of the latest message received from the client during the current session). This message could be the first one in a container that includes other messages (if the time discrepancy is significant but does not as yet result in the client’s messages being ignored).

Having received such a message or a container holding it, the client first performs a time synchronization (in effect, simply storing the difference between the server’s time and its own to be able to compute the “correct” time in the future) and then verifies that the message identifiers for correctness.

Where a correction has been neglected, the client will have to generate a new session to assure the monotonicity of message identifiers.

Transport

Enables the delivery of encrypted containers together with the external header (hereinafter, Payload) from client to server and back. There are three types of transport:

  • HTTP
  • TCP
  • UDP

HTTP Transport

HTTP Transport

Implemented over HTTP/1.1 (with keepalive) running over the traditional TCP Port 80. HTTPS is not used; the above encryption method is used instead.

An HTTP connection is attached to a session (or rather, to session + key identifier) specified in the most recent user query received; normally, the session is the same in all queries, but crafty HTTP proxies may corrupt that. A server may not return a message into an HTTP connection unless it belongs to the same session, and unless it is the server’s turn (an HTTP request had been received from the client to which a response has not been sent yet).

The overall arrangement is as follows. The client opens one or more keepalive HTTP connections to the server. If one or more messages need to be sent, they are made into a payload which is followed by a POST request to the URL/api to which the payload is transmitted as data. In addition, Content-Length, Keepalive, and Host are valid HTTP headers.

Having received the query, the server may either wait a little while (if the query requires a response following a short timeout) or immediately return a dummy response (only acknowledging the receipt of the container). In any case, the response may contain any number of messages. The server may at the same time send out any other messages it might be holding for the session.

In addition, there exists a special long poll RPC query (valid for HTTP connections only) which transmits maximum timeout T. If the server has messages for the session, they are returned immediately; otherwise, a wait state is entered until such time as the server has a message for the client or T seconds have elapsed. If no events occur in the span of T seconds, a dummy response is returned (special message).

If a server needs to send a message to a client, it checks for an HTTP connection that belongs to the required session and is in the “answering an HTTP request” state (including long poll) whereupon the message is added to the response container for the connection and sent to the user. In a typical case, there is some additional wait time (50 milliseconds) against the eventuality that the server will soon have more messages for the session.

If no suitable HTTP connection is available, the messages are placed in the current session’s send queue. However, they find their way there anyway until receipt is explicitly or indirectly confirmed by the client. For the HTTP protocol, sending the next query into the same HTTP connection is regarded as an implicit acknowledgment (not any more, the HTTP protocol also requires that explicit acknowledgments be sent); in other cases, the client must return an explicit acknowledgment within a reasonable time (it can be added to a container for the following request).

Important: if the acknowledgment fails to arrive on time, the message can be resent (possibly, in a different container). The parties must autonomously be ready for this and must store the identifiers of the most recent messages received (and ignore such duplicates rather than repeat actions). In order not to have the identifiers stored forever, there exist special garbage collection messages that take advantage of message identifier monotonicity.

If the send queue overflows or if messages stay in the queue for over 10 minutes, the server forgets them (or sends them to swap, no genius required). This may happen even faster, if the server is running out of buffer space (for example, because of serious network issues resulting in a large number of connections becoming severed).

TCP Transport

TCP Transport

This is very similar to the HTTP transport. May also be implemented over Port 80 (to penetrate all firewalls) and even use the same server IP addresses. In this situation, the server understands whether HTTP or TCP protocol must be used for the connection, based on the first four incoming bytes (for HTTP, it is POST).

When a TCP connection is created, it is assigned to the session (and the authorization key) transmitted in the first user message, and subsequently used exclusively for this session (multiplexing arrangements are not allowed).

If a payload (packet) needs to be transmitted from server to client or from client to server, it is encapsulated as follows: 4 length bytes are added at the front (to include the length, the sequence number, and CRC32; always divisible by 4) and 4 bytes with the packet sequence number within this TCP connection (the first packet sent is numbered 0, the next one 1, etc.), and 4 CRC32 bytes at the end (length, sequence number, and payload together).

There is an abridged version of the same protocol: if the client sends 0xef as the first byte (important: only prior to the very first data packet), then packet length is encoded by a single byte (0x01..0x7e = data length divided by 4; or 0x7f followed by 3 length bytes (little endian) divided by 4) followed by the data themselves (sequence number and CRC32 not added). In this case, server responses look the same (the server does not send 0xefas the first byte).

In case 4-byte data alignment is needed, an intermediate version of the original protocol may be used: if the client sends 0xeeeeeeee as the first int (four bytes), then packet length is encoded always by four bytes as in the original version, but the sequence number and CRC32 are omitted, thus decreasing total packet size by 8 bytes.

The full, the intermediate and the abridged versions of the protocol have support for quick acknowledgment. In this case, the client sets the highest-order length bit in the query packet, and the server responds with a special 4 bytes as a separate packet. They are the 32 higher-order SHA1 bits of the encrypted portion of the packet with the most significant bit set to make clear that this is not the length of a regular server response packet; if the abridged version is used, bswap is applied to these four bytes.

There are no implicit acknowledgments for the TCP transport: all messages must be acknowledged explicitly. Most frequently, acknowledgments are placed in a container with the next query or response if it is transmitted in short order. For example, this is almost always the case for client messages containing RPC queries: the acknowledgment normally arrives with the RPC response.

In the event of an error, the server may send a packet whose payload consists of 4 bytes as the error code. For example, Error Code 403 corresponds to situations where the corresponding HTTP error would have been returned by the HTTP protocol.

Mobile Protocol: Detailed Description

Prior to a message (or a multipart message) being transmitted over a network using a transport protocol, it is encrypted in a certain way, and an external header is added at the top of the message which is: a 64-bit key identifier (that uniquely identifies an authorization key for the server as well as the user) and a 128-bit message key.

A user key together with the message key define an actual 256-bit key and a 256-bit initialization vector, which is what encrypts the message using AES-256 encryption with infinite garble extension (IGE). Note that the initial part of the message to be encrypted contains variable data (session, message ID, sequence number, server salt) that obviously influences the message key (and thus the AES key and iv). The message key is defined as the 128 lower-order bits of the SHA1 of the message body (including session, message ID, etc.) Multipart messages are encrypted as a single message.

Terminology

Authorization Key

Authorization Key

a 2048-bit key shared by the client device and the server, created upon user registration directly on the client device be exchanging Diffie-Hellman keys, and never transmitted over a network. Each authorization key is user-specific. There is nothing that prevents a user from having several keys (that correspond to “permanent sessions” on different devices), and some of these may be locked forever in the event the device is lost. See also Creating an Authorization Key.

Server Key

Server Key

a 2048-bit RSA key used by the server digitally to sign its own messages while registration is underway and the authorization key is being generated. The application has a built-in public server key which can be used to verify a signature but cannot be used to sign messages. A private server key is stored on the server and changed very infrequently.

Key Identifier

Key Identifier

The 64 lower-order bits of the SHA1 hash of the authorization key are used to indicate which particular key was used to encrypt a message. Keys must be uniquely defined by the 64 lower-order bits of their SHA1, and in the event of a collision, an authorization key is regenerated. A zero key identifier means that encryption is not used which is permissible for a limited set of message types used during registration to generate an authorization key based on a Diffie-Hellman exchange.

Session

Session

a (random) 64-bit number generated by the client to distinguish between individual sessions (for example, between different instances of the application, created with the same authorization key). The session in conjunction with the key identifier corresponds to an application instance. The server can maintain session state. Under no circumstances can a message meant for one session be sent into a different session. The server may unilaterally forget any client sessions; clients should be able to handle this.

Server Salt

Server Salt

a (random) 64-bit number periodically (say, every 24 hours) changed (separately for each session) at the request of the server. All subsequent messages must contain the new salt (although, messages with the old salt are still accepted for a further 300 seconds). Required to protect against replay attacks and certain tricks associated with adjusting the client clock to a moment in the distant future.

Message Identifier (msg_id)

Message Identifier (msg_id)

a (time-dependent) 64-bit number used uniquely to identify a message within a session. Client message identifiers are divisible by 4, server message identifiers modulo 4 yield 1 if the message is a response to a client message, and 3 otherwise. Client message identifiers must increase monotonically (within a single session), the same as server message identifiers, and must approximately equal unixtime*2^32. This way, a message identifier points to the approximate moment in time the message was created. A message is rejected over 300 seconds after it is created or 30 seconds before it is created (this is needed to protect from replay attacks). In this situation, it must be re-sent with a different identifier (or placed in a container with a higher identifier). The identifier of a message container must be strictly greater than those of its nested messages.

Important: to counter replay-attacks the lower 32 bits of msg_id passed by the client must not be empty and must present a fractional part of the time point when the message was created. At some point in the nearest future the server will start ignoring messages, in which the lower 32 bits of msg_id contain too many zeroes.

A message requiring an explicit acknowledgment. These include all the user and many service messages, virtually all with the exception of containers and acknowledgments.

Message Sequence Number (msg_seqno)

Message Sequence Number (msg_seqno)

a 32-bit number equal to twice the number of “content-related” messages (those requiring acknowledgment, and in particular those that are not containers) created by the sender prior to this message and subsequently incremented by one if the current message is a content-related message. A container is always generated after its entire contents; therefore, its sequence number is greater than or equal to the sequence numbers of the messages contained in it.

Message Key

Message Key

The lower-order 128 bits of the SHA1 hash of the part of the message to be encrypted (including the internal header and excluding the alignment bytes).

Internal (cryptographic) Header

Internal (cryptographic) Header

A header (16 bytes) added before a message or a container before it is all encrypted together. Consists of the server salt (64 bits) and the session (64 bits).

External (cryptographic) Header

External (cryptographic) Header

A header (24 bytes) added before an encrypted message or a container. Consists of a key identifier (64 bits) and a message key (128 bits).

Payload

Payload

External header + encrypted message or container.

Defining AES Key and Initialization Vector

Defining AES Key and Initialization Vector

The 2048-bit authorization key (auth_key) and the 128-bit message key (msg_key) are used to compute a 256-bit AES key (aes_key) and a 256-bit initialization vector (aes_iv) which are subsequently used to encrypt the part of the message to be encrypted (i. e. everything with the exception of the external header which is added later) with AES-256 in infinite garble extension (IGE) mode.

The algorithm for computing aes_key and aes_iv from auth_key and msg_key is as follows:

  • sha1_a = SHA1 (msg_key + substr (auth_key, x, 32));
  • sha1_b = SHA1 (substr (auth_key, 32+x, 16) + msg_key + substr (auth_key, 48+x, 16));
  • sha1_с = SHA1 (substr (auth_key, 64+x, 32) + msg_key);
  • sha1_d = SHA1 (msg_key + substr (auth_key, 96+x, 32));
  • aes_key = substr (sha1_a, 0, 8) + substr (sha1_b, 8, 12) + substr (sha1_c, 4, 12);
  • aes_iv = substr (sha1_a, 8, 12) + substr (sha1_b, 0, 8) + substr (sha1_c, 16, 4) + substr (sha1_d, 0, 8);

where x = 0 for messages from client to server and x = 8 for those from server to client.

The lower-order 1024 bits of auth_key are not involved in the computation. They may (together with the remaining bits or separately) be used on the client device to encrypt the local copy of the data received from the server. The 512 lower-order bits of auth_key are not stored on the server; therefore, if the client device uses them to encrypt local data and the user loses the key or the password, data decryption of local data is impossible (even if data from the server could be obtained).

When AES is used to encrypt a block of data of a length not divisible by 16 bytes, the data is padded with random bytes to the smallest length divisible by 16 bytes immediately prior to being encrypted.

Important Tests

Important Tests

When an encrypted message is received, it must be checked that msg_key is in fact equal to the 128 lower-order bits of the SHA1 hash of the previously encrypted portion, and that msg_id has even parity for messages from client to server, and odd parity for messages from server to client.

In addition, the identifiers (msg_id) of the last N messages received from the other side must be stored, and if a message comes in with msg_id lower than all or equal to any of the stored values, the message is to be ignored. Otherwise, the new message msg_id is added to the set, and, if the number of stored msg_id values is greater than N, the oldest (i. e. the lowest) is forgotten.

In addition, msg_id values that belong over 30 seconds in the future or over 300 seconds in the past are to be ignored. This is especially important for the server. The client would also find this useful (to protect from a replay attack), but only if it is certain of its time (for example, if its time has been synchronized with that of the server).

Certain client-to-server service messages containing data sent by the client to the server (for example, msg_id of a recent client query) may, nonetheless, be processed on the client even if the time appears to be “incorrect”. This is especially true of messages to change server_salt and notifications of invalid client time. See Mobile Protocol: Service Messages.

Storing an Authorization Key on a Client Device

Storing an Authorization Key on a Client Device

It may be suggested to users concerned with security that they password protect the authorization key in approximately the same way as in ssh. This is accomplished by adding the SHA1 of the key to the front of the key, following which the entire string is encrypted using AES in CBC mode and a key equal to the user’s (text) password. When the user inputs the password, the stored protected password is decrypted and verified by being compared with SHA1. From the user’s standpoint, this is practically the same as using an application or a website password.

Unencrypted Messages

Unencrypted Messages

Special plain-text messages may be used to create an authorization key as well as to perform a time synchronization. They begin with auth_key_id = 0 (64 bits) which means that there is no auth_key. This is followed directly by the message body in serialized format without internal or external headers. A message identifier (64 bits) and body length in bytes (32 bytes) are added before the message body.

Only a very limited number of messages of special types can be transmitted as plain text.

Schematic Presentation of Messages

Schematic Presentation of Messages

Encrypted Message

auth_key_id msg_key encrypted_data
int64 int128 bytes

Encrypted Message: encrypted_data

Contains the cypher text for the following data:

data type
salt int64
session_id int64
message_id int64
seq_no int32
message_data_length int32
message_data bytes
padding 0..15 bytes

Unencrypted Message

data type
auth_key_id = 0 int64
message_id int64
message_data_length int32
message_data bytes

Creating an Authorization Key

Creating an Authorization Key

An authorization key is normally created once for every user during the application installation process immediately prior to registration. Registration itself, in actuality, occurs after the authorization key is created. However, a user may be prompted to complete the registration form while the authorization key is being generated in the background. Intervals between user key strokes may be used as a source of entropy in the generation of high-quality random numbers required for the creation of an authorization key.

See Creating an Authorization Key.

During the creation of the authorization key, the client obtains its server salt (to be used with the new key for all communication in the near future). The client then creates an encrypted session using the newly generated key, and subsequent communication occurs within that session (including the transmission of the user’s registration information and phone number validation) unless the client creates a new session. The client is free to create new or additional sessions at any time by choosing a new random session_id.

Service Messages

Service Messages

Response to an RPC query

Response to an RPC query

A response to an RPC query is normally wrapped as follows:

rpc_result#f35c6d01 req_msg_id:long result:Object = RpcResult;

Here req_msg_id is the identifier of the message sent by the other party and containing an RPC query. This way, the recipient knows that the result is a response to the specific RPC query in question.
At the same time, this response serves as acknowledgment of the other party’s receipt of the req_msg_id message.

Note that the response to an RPC query must also be acknowledged. Most frequently, this coincides with the transmission of the next message (which may have a container attached to it carrying a service message with the acknowledgment).

RPC Error

RPC Error

The result field returned in response to any RPC query may also contain an error message in the following format:

rpc_error#2144ca19 error_code:int error_message:string = RpcError;

Cancellation of an RPC Query

Cancellation of an RPC Query

In certain situations, the client does not want to receive a response to an already transmitted RPC query, for example because the response turns out to be long and the client has decided to do without it because of insufficient link capacity. Simply interrupting the TCP connection will not have any effect because the server would re-send the missing response at the first opportunity. Therefore, the client needs a way to cancel receipt of the RPC response message, actually acknowledging its receipt prior to it being in fact received, which will settle the server down and prevent it from re-sending the response. However, the client does not know the RPC response’s msg_id prior to receiving the response; the only thing it knows is the req_msg_id. i. e. the msg_id of the relevant RPC query. Therefore, a special query is used:

rpc_drop_answer#58e4a740 req_msg_id:long = RpcDropAnswer;

The response to this query returns as one of the following messages wrapped in rpc_result and requiring an acknowledgment:

rpc_answer_unknown#5e2ad36e = RpcDropAnswer;
rpc_answer_dropped_running#cd78e586 = RpcDropAnswer;
rpc_answer_dropped#a43ad8b7 msg_id:long seq_no:int bytes:int = RpcDropAnswer;

The first version of the response is used if the server remembers nothing of the incoming req_msg_id (if it has already been responded to, for example). The second version is used if the response was canceled while the RPC query was being processed (where the RPC query itself was still fully processed); in this case, the same rpc_answer_dropped_running is also returned in response to the original query, and both of these responses require an acknowledgment from the client. The final version means that the RPC response was removed from the server’s outgoing queue, and its msg_id, seq_no, and length in bytes are transmitted to the client.

Note that rpc_answer_dropped_running and rpc_answer_dropped serve as acknowledgments of the server’s receipt of the original query (the same one, the response to which we wish to forget). In addition, same as for any RPC queries, any response to rpc_drop_answer is an acknowledgment for rpc_drop_answer itself.

As an alternative to using rpc_drop_answer, a new session may be created after the connection is reset and the old session is removed through destroy_session.

Request for several future salts

Request for several future salts

The client may at any time request from the server several (between 1 and 64) future server salts together with their validity periods. Having stored them in persistent memory, the client may use them to send messages in the future even if he changes sessions (a server salt is attached to the authorization key rather than being session-specific).

get_future_salts#b921bd04 num:int = FutureSalts;
future_salt#0949d9dc valid_since:int valid_until:int salt:long = FutureSalt;
future_salts#ae500895 req_msg_id:long now:int salts:vector future_salt = FutureSalts;

The client must check to see that the response’s req_msg_id in fact coincides with msg_id of the query for get_future_salts. The server returns a maximum of num future server salts (may return fewer). The response serves as the acknowledgment of the query and does not require an acknowledgment itself.

Ping Messages (PING/PONG)

Ping Messages (PING/PONG)

ping#7abe77ec ping_id:long = Pong;

A response is usually returned to the same connection:

pong#347773c5 msg_id:long ping_id:long = Pong;

These messages do not require acknowledgments. A pong is transmitted only in response to a ping while a ping can be initiated by either side.

Deferred Connection Closure + PING

Deferred Connection Closure + PING

ping_delay_disconnect#f3427b8c ping_id:long disconnect_delay:int = Pong;

Works like ping. In addition, after this is received, the server starts a timer which will close the current connection disconnect_delay seconds later unless it receives a new message of the same type which automatically resets all previous timers. If the client sends these pings once every 60 seconds, for example, it may set disconnect_delay equal to 75 seconds.

Request to Destroy Session

Request to Destroy Session

Used by the client to notify the server that it may forget the data from a different session belonging to the same user (i. e. with the same auth_key_id). The result of this being applied to the current session is undefined.

destroy_session#e7512126 session_id:long = DestroySessionRes;
destroy_session_ok#e22045fc session_id:long = DestroySessionRes;
destroy_session_none#62d350c9 session_id:long = DestroySessionRes;

New Session Creation Notification

New Session Creation Notification

The server notifies the client that a new session (from the server’s standpoint) had to be created to handle a client message. If, after this, the server receives a message with an even smaller msg_id within the same session, a similar notification will be generated for this msg_id as well. No such notifications are generated for high msg_id values.

new_session_created#9ec20908 first_msg_id:long unique_id:long server_salt:long = NewSession

The unique_id parameter is generated by the server every time a session is (re-)created.

This notification must be acknowledged by the client. It is necessary, for instance, for the client to understand that there is, in fact, a “gap” in the stream of long poll notifications received from the server (the user may have failed to receive notifications during some period of time).

Notice that the server may unilaterally destroy (close) any existing client sessions with all pending messages and notifications, without sending any notifications. This happens, for example, if the session is inactive for a long time, and the server runs out of memory. If the client at some point decides to send new messages to the server using the old session, already forgotten by the server, such a “new session created” notification will be generated. The client is expected to handle such situations gracefully.

Containers

Containers

Containers are messages containing several other messages. Used for the ability to transmit several RPC queries and/or service messages at the same time, using HTTP or even TCP or UDP protocol. A container may only be accepted or rejected by the other party as a whole.

Simple Container

Simple Container

A simple container carries several messages as follows:

msg_container#73f1f8dc messages:vector message = MessageContainer;

Here message refers to any message together with its length and msg_id:

message msg_id:long seqno:int bytes:int body:Object = Message;

bytes is the number of bytes in the body serialization.

All messages in a container must have msg_id lower than that of the container itself. A container does not require an acknowledgment and may not carry other simple containers. When messages are re-sent, they may be combined into a container in a different manner or sent individually.

Empty containers are also allowed. They are used by the server, for example, to respond to an HTTP request when the timeout specified in http_wait expires, and there are no messages to transmit.

Message Copies

Message Copies

In some situations, an old message with a msg_id that is no longer valid needs to be re-sent. Then, it is wrapped in a copy container:

msg_copy#e06046b2 orig_message:Message = MessageCopy;

Once received, the message is processed as if the wrapper were not there. However, if it is known for certain that the message orig_message.msg_id was received, then the new message is not processed (while at the same time, it and orig_message.msg_id are acknowledged). The value of orig_message.msg_id must be lower than the container’s msg_id.

This is not used at this time, because an old message can be wrapped in a simple container with the same result.

The following special service query not requiring an acknowledgement (which must be transmitted only through an HTTP connection) is used to enable the server to send messages in the future to the client using HTTP protocol:

http_wait#9299359f max_delay:int wait_after:int max_wait:int = HttpWait;

Packed Object

Packed Object

Used to replace any other object (or rather, a serialization thereof) with its archived (gzipped) representation:

gzip_packed#3072cfa1 packed_data:string = Object;

At the present time, it is supported in the body of an RPC response (i.e., as result in rpc_result) and generated by the server for a limited number of high-level queries. In addition, in the future it may be used to transmit non-service messages (i. e. RPC queries) from client to server.

HTTP Wait/Long Poll

HTTP Wait/Long Poll

When such a message (or a container carrying such a message) is received, the server either waits max_delay milliseconds, whereupon it forwards all the messages that it is holding on to the client if there is at least one message queued in session (if needed, by placing them into a container to which acknowledgments may also be added); or else waits no more than max_wait milliseconds until such a message is available. If a message never appears, an empty container is transmitted.

The max_delay parameter denotes the maximum number of milliseconds that has elapsed between the first message for this session and the transmission of an HTTP response. The wait_after parameter works as follows: after the receipt of the latest message for a particular session, the server waits another wait_after milliseconds in case there are more messages. If there are no additional messages, the result is transmitted (a container with all the messages). If more messages appear, the wait_after timer is reset.

At the same time, the max_delay parameter has higher priority than wait_after, and max_wait has higher priority than max_delay.

This message does not require a response or an acknowledgement. If the container transmitted over HTTP carries several such messages, the behavior is undefined (in fact, the latest parameter will be used).

If no http_wait is present in container, default values max_delay=0 (milliseconds), wait_after=0 (milliseconds), and max_wait=25000 (milliseconds) are used.

If the client’s ping of the server takes a long time, it may make sense to set max_delay to a value that is comparable in magnitude to ping time.

Service Messages about Messages

Service Messages about Messages

Acknowledgment of Receipt

Acknowledgment of Receipt

Receipt of virtually all messages (with the exception of some purely service ones as well as the plain-text messages used in the protocol for creating an authorization key) must be acknowledged.
This requires the use of the following service message (not requiring an acknowledgment):

msgs_ack#62d6b459 msg_ids:Vector long = MsgsAck;

A server usually acknowledges the receipt of a message from a client (normally, an RPC query) using an RPC response. If a response is a long time coming, a server may first send a receipt acknowledgment, and somewhat later, the RPC response itself.

A client normally acknowledges the receipt of a message from a server (usually, an RPC response) by adding an acknowledgment to the next RPC query if it is not transmitted too late (if it is generated, say, 60-120 seconds following the receipt of a message from the server). However, if for a long period of time there is no reason to send messages to the server or if there is a large number of unacknowledged messages from the server (say, over 16), the client transmits a stand-alone acknowledgment.

Notice of Ignored Error Message

Notice of Ignored Error Message

In certain cases, a server may notify a client that its incoming message was ignored for whatever reason. Note that such a notification cannot be generated unless a message is correctly decoded by the server.

bad_msg_notification#a7eff811 bad_msg_id:long bad_msg_seqno:int error_code:int = BadMsgNotification;
bad_server_salt#edab447b bad_msg_id:long bad_msg_seqno:int error_code:int new_server_salt:long = BadMsgNotification;

Here, error_code can also take on the following values:

  • 16: msg_id too low (most likely, client time is wrong; it would be worthwhile to synchronize it using msg_id notifications and re-send the original message with the “correct” msg_id or wrap it in a container with a new msg_id if the original message had waited too long on the client to be transmitted)
  • 17: msg_id too high (similar to the previous case, the client time has to be synchronized, and the message re-sent with the correct msg_id)
  • 18: incorrect two lower order msg_id bits (the server expects client message msg_id to be divisible by 4)
  • 19: container msg_id is the same as msg_id of a previously received message (this must never happen)
  • 20: message too old, and it cannot be verified whether the server has received a message with this msg_id or not
  • 32: msg_seqno too low (the server has already received a message with a lower msg_id but with either a higher or an equal and odd seqno)
  • 33: msg_seqno too high (similarly, there is a message with a higher msg_id but with either a lower or an equal and odd seqno)
  • 34: an even msg_seqno expected (irrelevant message), but odd received
  • 35: odd msg_seqno expected (relevant message), but even received
  • 48: incorrect server salt (in this case, the bad_server_salt response is received with the correct salt, and the message is to be re-sent with it)
  • 64: invalid container.

The intention is that error_code values are grouped (error_code >> 4): for example, the codes 0x40 - 0x4f correspond to errors in container decomposition.

Notifications of an ignored message do not require acknowledgment (i.e., are irrelevant).

Important: if server_salt has changed on the server or if client time is incorrect, any query will result in a notification in the above format. The client must check that it has, in fact, recently sent a message with the specified msg_id, and if that is the case, update its time correction value (the difference between the client’s and the server’s clocks) and the server salt based on msg_id and the server_salt notification, so as to use these to (re)send future messages. In the meantime, the original message (the one that caused the error message to be returned) must also be re-sent with a better msg_id and/or server_salt.

In addition, the client can update the server_salt value used to send messages to the server, based on the values of RPC responses or containers carrying an RPC response, provided that this RPC response is actually a match for the query sent recently. (If there is doubt, it is best not to update since there is risk of a replay attack).

Request for Message Status Information

Request for Message Status Information

If either party has not received information on the status of its outgoing messages for a while, it may explicitly request it from the other party:

msgs_state_req#da69fb52 msg_ids:Vector long = MsgsStateReq;

The response to the query contains the following information:
see later

Informational Message regarding Status of Messages

Informational Message regarding Status of Messages

msgs_state_info#04deb57d req_msg_id:long info:string = MsgsStateInfo;

Here, info is a string that contains exactly one byte of message status for each message from the incoming msg_ids list:

  • 1 = nothing is known about the message (msg_id too low, the other party may have forgotten it)
  • 2 = message not received (msg_id falls within the range of stored identifiers; however, the other party has certainly not received a message like that)
  • 3 = message not received (msg_id too high; however, the other party has certainly not received it yet)
  • 4 = message received (note that this response is also at the same time a receipt acknowledgment)
  • +8 = message already acknowledged
  • +16 = message not requiring acknowledgment
  • +32 = RPC query contained in message being processed or processing already complete
  • +64 = content-related response to message already generated
  • +128 = other party knows for a fact that message is already received

This response does not require an acknowledgment. It is an acknowledgment of the relevant msgs_state_req, in and of itself.

Note that if it turns out suddenly that the other party does not have a message that looks like it has been sent to it, the message can simply be re-sent. Even if the other party should receive two copies of the message at the same time, the duplicate will be ignored. (If too much time has passed, and the original msg_id is not longer valid, the message is to be wrapped in msg_copy).

Voluntary Communication of Status of Messages

Voluntary Communication of Status of Messages

Either party may voluntarily inform the other party of the status of the messages transmitted by the other party.

msgs_all_info#8cc0d131 msg_ids:Vector long info:string = MsgsAllInfo

All message codes known to this party are enumerated, with the exception of those for which the +128 and the +16 flags are set. However, if the +32 flag is set but not +64, then the message status will still be communicated.

This message does not require an acknowledgment.

Extended Voluntary Communication of Status of One Message

Extended Voluntary Communication of Status of One Message

Normally used by the server to respond to the receipt of a duplicate msg_id, especially if a response to the message has already been generated and the response is large. If the response is small, the server may re-send the answer itself instead. This message can also be used as a notification instead of resending a large message.

msg_detailed_info#276d3ec6 msg_id:long answer_msg_id:long bytes:int status:int = MsgDetailedInfo;
msg_new_detailed_info#809db6df answer_msg_id:long bytes:int status:int = MsgDetailedInfo;

The second version is used to notify of messages that were created on the server not in response to an RPC query (such as notifications of new messages) and were transmitted to the client some time ago, but not acknowledged.

Currently, status is always zero. This may change in future.

This message does not require an acknowledgment.

Explicit Request to Re-Send Messages

Explicit Request to Re-Send Messages

msg_resend_req#7d861a08 msg_ids:Vector long = MsgResendReq;

The remote party immediately responds by re-sending the requested messages, normally using the same connection that was used to transmit the query. If at least one message with requested msg_id does not exist or has already been forgotten, or has been sent by the requesting party (known from parity), MsgsStateInfo is returned for all messages requested as if the MsgResendReq query had been a MsgsStateReq query as well.

Explicit Request to Re-Send Answers

Explicit Request to Re-Send Answers

msg_resend_ans_req#8610baeb msg_ids:Vector long = MsgResendReq;

The remote party immediately responds by re-sending answers to the requested messages, normally using the same connection that was used to transmit the query. MsgsStateInfo is returned for all messages requested as if the MsgResendReq query had been a MsgsStateReq query as well.

Diffie—Hellman key exchange

Diffie—Hellman key exchange

We use DH key exchange in two cases:

  • Creating an authorization key
  • Establishing Secret Chats with end-to-end encryption

In both cases, there are some verifications to be done whenever DH is used

Validation of DH parameters

Validation of DH parameters

Client is expected to check whether p = dh_prime is a safe 2048-bit prime (meaning that both p and (p-1)/2 are prime, and that 2^2047 < p < 2^2048), and that g generates a cyclic subgroup of prime order (p-1)/2, i.e. is a quadratic residue mod p. Since g is always equal to 2, 3, 4, 5, 6 or 7, this is easily done using quadratic reciprocity law, yielding a simple condition on p mod 4g — namely, p mod 8 = 7 for g = 2; p mod 3 = 2 for g = 3; no extra condition for g = 4; p mod 5 = 1 or 4 for g = 5; p mod 24 = 19 or 23 for g = 6; and p mod 7 = 3, 5 or 6 for g = 7. After g and p have been checked by the client, it makes sense to cache the result, so as not to repeat lengthy computations in future.

If the verification takes too long (which is the case for older mobile devices), one might initially run only 15 Miller—Rabin iterations (use parameter 30 in Java) for verifying primeness of p and (p - 1)/2 with error probability not exceeding one billionth, and do more iterations in the background later.

Another way to optimize this is to embed into the client application code a small table with some known “good” couples (g,p) (or just known safe primes p, since the condition on g is easily verified during execution), checked during code generation phase, so as to avoid doing such verification during runtime altogether. The server rarely changes these values, thus one usually needs to put the current value of server’s dh_prime into such a table. For example, the current value of dh_prime equals (in big-endian byte order)

C7 1C AE B9 C6 B1 C9 04 8E 6C 52 2F 70 F1 3F 73 98 0D 40 23 8E 3E 21 C1 49 34 D0 37 56 3D 93 0F 48 19 8A 0A A7 C1 40 58 22 94 93 D2 25 30 F4 DB FA 33 6F 6E 0A C9 25 13 95 43 AE D4 4C CE 7C 37 20 FD 51 F6 94 58 70 5A C6 8C D4 FE 6B 6B 13 AB DC 97 46 51 29 69 32 84 54 F1 8F AF 8C 59 5F 64 24 77 FE 96 BB 2A 94 1D 5B CD 1D 4A C8 CC 49 88 07 08 FA 9B 37 8E 3C 4F 3A 90 60 BE E6 7C F9 A4 A4 A6 95 81 10 51 90 7E 16 27 53 B5 6B 0F 6B 41 0D BA 74 D8 A8 4B 2A 14 B3 14 4E 0E F1 28 47 54 FD 17 ED 95 0D 59 65 B4 B9 DD 46 58 2D B1 17 8D 16 9C 6B C4 65 B0 D6 FF 9C A3 92 8F EF 5B 9A E4 E4 18 FC 15 E8 3E BE A0 F8 7F A9 FF 5E ED 70 05 0D ED 28 49 F4 7B F9 59 D9 56 85 0C E9 29 85 1F 0D 81 15 F6 35 B1 05 EE 2E 4E 15 D0 4B 24 54 BF 6F 4F AD F0 34 B1 04 03 11 9C D8 E3 B9 2F CC 5B

g_a and g_b validation

g_a and g_b validation

Apart from the conditions on the Diffie-Hellman prime dh_prime and generator g, both sides are to check that g, g_a and g_b are greater than 1 and less than dh_prime - 1. We recommend checking that g_a and g_b are between 2^{2048-64} and dh_prime - 2^{2048-64} as well.

Checking SHA1 hash values

Checking SHA1 hash values

Once the client receives a server_DH_params_ok answer in step 5) of the Authorization Key generation protocol and decrypts it obtaining answer_with_hash, it MUST check that

answer_with_hash := SHA1(answer) + answer + (0-15 random bytes)

In other words, the first 20 bytes of answer_with_hash must be equal to SHA1 of the remainder of the decrypted message without the padding random bytes.

Checking nonce, server_nonce and new_nonce fields

Checking nonce, server_nonce and new_nonce fields

When the client receives and/or decrypts server messages during creation of Authorization Key, and these messages contain some nonce fields already known to the client from messages previously obtained during the same run of the protocol, the client is to check that these fields indeed contain the values previosly known.

Using secure pseudorandom number generator to create DH secret parameters a and b

Using secure pseudorandom number generator to create DH secret parameters a and b

Client must use a cryptographically secure PRNG to generate secret exponents a or b for DH key exchange. For secret chats, the client might request some entropy (random bytes) from the server while invoking messages.getDhConfig and feed these random bytes into its PRNG (for example, by PRNG_seed if OpenSSL library is used), but never using these “random” bytes by themselves or replacing by them the local PRNG seed. One should mix bytes received from server into local PRNG seed.

MTProto Encrypted Messages

Some important checks are to be done while sending and especially receiving encrypted MTProto messages

Checking SHA1 hash value of msg_key

Checking SHA1 hash value of msg_key

msg_key is used not only to compute the AES key and IV to decrypt the received message. After decryption, the client MUST check that msg_key is indeed equal to SHA1 of the plaintext obtained as the result of decryption (without the final padding bytes).

If an error is encountered before this check could be performed, the client must perform the msg-key check anyway before returning any result. Note that the response to any error encountered before the msg_key check must be the same as the response to a failed msg_key check.

Checking message length

Checking message length

The client is to check that the length of the message or container obtained from the decrypted message (computed from its length field) does not exceed the total size of the plaintext, and that the difference is not more than 15 bytes. Apart from this, knowing the total length is important for the previous verification.

The length should be always divisible by 4 and non-negative. On no account the client is to access data past the end of the decryption buffer containing the plaintext message.

Checking session_id

Checking session_id

The client is to check that the session_id field in the decrypted message indeed equals to that of an active session created by the client.

Checking msg_id

Checking msg_id

The client must check that msg_id has even parity for messages from client to server, and odd parity for messages from server to client.

In addition, the identifiers (msg_id) of the last N messages received from the other side must be stored, and if a message comes in with an msg_id lower than all or equal to any of the stored values, that message is to be ignored. Otherwise, the new message msg_id is added to the set, and, if the number of stored msg_id values is greater than N, the oldest (i. e. the lowest) is discarded.

In addition, msg_id values that belong over 30 seconds in the future or over 300 seconds in the past are to be ignored (recall that msg_id approximately equals unixtime * 2^32). This is especially important for the server. The client would also find this useful (to protect from a replay attack), but only if it is certain of its time (for example, if its time has been synchronized with that of the server).

Certain client-to-server service messages containing data sent by the client to the server (for example, msg_id of a recent client query) may, nonetheless, be processed on the client even if the time appears to be “incorrect”. This is especially true of messages to change server_salt and notifications about invalid time on the client.

Behavior in case of mismatch

Behavior in case of mismatch

If one of the checks listed above fails, the client is to completely discard the message obtained from server. We also recommend closing and reestablishing the TCP connection to the server, then retrying the operation or the whole key generation protocol.

No information from incorrect messages can be used. Even if the application throws an exception and dies, this is much better than continuing with invalid data.

Notice that invalid messages will infrequently appear during normal work even if no malicious tampering is being done. This is due to network transmission errors. We recommend ignoring the invalid message and closing the TCP connection, then creating a new TCP connection to the server and retrying the original query.

Meta

Стили

C6 width

//