How are calls end-to-end encrypted?
Call flows in Teams are based on the Session Description Protocol (SDP) [RFC 4566] offer/answer model over HTTPS. Once the callee accepts an incoming call, the caller and callee agree on the session parameters, and encrypted media starts flowing between them using the Secure Real-time Transport Protocol (SRTP).
In regular call flows, negotiation of the encryption key occurs over the call signaling channel. In an end-to-end encrypted call, the signaling flow is the same as in a regular one-to-one Teams call. However, Teams uses Datagram Transport Layer Security (DTLS) to derive an encryption key based on per-call certificates generated on both client endpoints. Because DTLS derives the key from the client certificates, the key is opaque to Microsoft. Once both clients agree on the key, the media begins to flow over SRTP using this DTLS-negotiated encryption key.
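Each per-call certificate can be identified by its SHA-256 thumbprint, which is the SHA-256 digest of the certificate's DER-encoded bytes. The following is a minimal sketch of that computation, not Teams code: the function name `sha256_thumbprint` and the placeholder bytes are hypothetical, and the colon-separated uppercase hex layout is an assumption borrowed from the way certificate fingerprints are commonly written in DTLS-SRTP signaling.

```python
import hashlib


def sha256_thumbprint(cert_der: bytes) -> str:
    """Return the SHA-256 digest of DER-encoded certificate bytes
    as colon-separated uppercase hex pairs."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


# Placeholder bytes stand in for a real DER-encoded per-call certificate
# (assumption for illustration; a real client would generate one per call).
placeholder_cert = b"hypothetical DER-encoded per-call certificate"
print(sha256_thumbprint(placeholder_cert))
```

Because the thumbprint depends only on the certificate bytes, two endpoints that see the same certificate always compute the same value.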
To protect against a man-in-the-middle attack between the caller and callee, Teams derives a 20-digit security code from the SHA-256 thumbprints of the caller’s and callee’s endpoint call certificates. The caller and callee can validate the code by reading it aloud to each other and checking that both sides see the same 20 digits. If the codes don’t match, the connection between the caller and callee has been intercepted by a man-in-the-middle attack. In that case, users can terminate the call manually.
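The text above does not specify how Teams turns the two thumbprints into 20 digits, so the following is only an illustrative sketch of one way such a code could be derived. The function name `security_code`, the sorting step, the second SHA-256 pass, and the reduction modulo 10^20 are all assumptions made for this example, not the Teams algorithm.

```python
import hashlib


def security_code(thumbprint_a: bytes, thumbprint_b: bytes, digits: int = 20) -> str:
    """Derive a fixed-length decimal code from two certificate thumbprints.

    Hypothetical scheme for illustration only; not the Teams derivation.
    """
    # Sort the thumbprints so caller and callee hash them in the same order.
    first, second = sorted((thumbprint_a, thumbprint_b))
    digest = hashlib.sha256(first + second).digest()
    # Reduce the 256-bit digest to the requested number of decimal digits.
    number = int.from_bytes(digest, "big") % (10 ** digits)
    return str(number).zfill(digits)


# Placeholder thumbprints (assumption: real ones come from the per-call certificates).
caller_thumbprint = hashlib.sha256(b"hypothetical caller certificate").digest()
callee_thumbprint = hashlib.sha256(b"hypothetical callee certificate").digest()
attacker_thumbprint = hashlib.sha256(b"hypothetical attacker certificate").digest()

# Both endpoints compute the same code when they see the same two certificates.
caller_view = security_code(caller_thumbprint, callee_thumbprint)
callee_view = security_code(callee_thumbprint, caller_thumbprint)

# If an attacker substitutes a certificate, the caller's code changes.
intercepted_view = security_code(caller_thumbprint, attacker_thumbprint)

print(caller_view == callee_view)
print(caller_view == intercepted_view)
```

The sketch shows why the comparison works: the code is a function of both certificates, so an attacker who substitutes a certificate on either leg of the call changes the code that at least one participant sees.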