How does email work? An introduction for developers

Matera
6 min read · Apr 18, 2023

We all know and love email, but do we know how it works?

  • Which protocol does it use?
  • What is in those emails under the hood?
  • Are there any security checks?

Here is a short deep dive into the emailing world.

Email sending (how it works)

At Matera we use SMTP to send emails. SMTP stands for Simple Mail Transfer Protocol. It is "simple" because it's a text-based protocol over TCP, and the content of the message is not encrypted by default. Here is a basic SMTP conversation (C stands for Client, S stands for Server); the comments between the blocks explain what is going on:

# First step, servers connect to each other using TCP

S: 220 smtp.example.com ESMTP Postfix
C: HELO relay.example.org


# then comes the SMTP introduction

S: 250 Hello relay.example.org, I am glad to meet you
C: MAIL FROM:<bob@example.org>
S: 250 Ok
C: RCPT TO:<alice@example.com>
S: 250 Ok
C: RCPT TO:<theboss@example.com>
S: 250 Ok

# Client sends the email envelope information to the server: sender email address (here: bob@example.org) and recipient email addresses (here: alice@example.com and theboss@example.com)

C: DATA
S: 354 End data with <CR><LF>.<CR><LF>
C: From: "Bob Example" <bob@example.org>
C: To: "Alice Example" <alice@example.com>
C: Cc: theboss@example.com
C: Date: Tue, 15 Jan 2008 16:02:43 -0500
C: Subject: Test message
C:
C: Hello Alice.
C: This is a test message with 5 header fields and 4 lines in the message body.
C: Your friend,
C: Bob
C: .

# Client gives the content of the email message to the server (here: From, To, Cc, Date, Subject and the body)

S: 250 Ok: queued as 12345
C: QUIT
S: 221 Bye
{The server closes the connection}

# Ending conversation between Client and Server

For more details about SMTP, the full protocol is described in RFC 5321. This protocol enables us to send a message (an email), whose format is itself described in another RFC, RFC 5322.
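To make the client side of the conversation above concrete, here is a minimal Python sketch that only builds the lines a client would send, without opening any connection. The function name `smtp_client_lines` is hypothetical and exists only for this illustration; the "dot-stuffing" step comes from RFC 5321 and is not shown in the transcript above.

```python
def smtp_client_lines(helo_name, sender, recipients, message):
    """Return, in order, the lines a client would send for one message.

    Hypothetical helper for illustration: it does not talk to a server
    and ignores the server's replies (220, 250, 354, ...).
    """
    lines = [f"HELO {helo_name}", f"MAIL FROM:<{sender}>"]
    # One RCPT TO per envelope recipient, whatever the To/Cc headers say.
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    lines.append("DATA")
    for line in message.splitlines():
        # Dot-stuffing: a line starting with "." gets an extra "." so the
        # server does not mistake it for the end-of-data marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")  # a lone "." ends the DATA section
    lines.append("QUIT")
    return lines


for line in smtp_client_lines(
    "relay.example.org",
    "bob@example.org",
    ["alice@example.com", "theboss@example.com"],
    "Subject: Test message\n\nHello Alice.",
):
    print("C:", line)
```

In real code you would rarely write this by hand: Python's standard `smtplib` module drives the same dialogue and handles the server replies for you.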

Structure / fields

As we saw in the example above, the content of an email sent to an SMTP server is made up of multiple fields. Some are required, some aren't. Let's look in detail at some of the most important ones, as defined in RFC 5322 (Internet Message Format):

Every message must include the Date and From headers.

  • Date contains the date and time the mail was created (in most cases, the moment it is sent)
  • From contains the sender's email address

Surprisingly, some basic fields are not required by RFC 5322, even though we find them in most emails:

  • To -> recipient email address
  • Cc -> Carbon Copy, another recipient email address
  • Bcc -> Blind Carbon Copy; the header does exist, but according to RFC 5322 it is typically removed on the sending side before each recipient of the message receives a copy of it
  • Subject -> the subject of the email
  • Message-ID -> unique identifier for an email, generated by the email client right before sending the email, typically built from the time and a domain name (e.g. 950124.162336@example.com)
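As an illustration of these fields, here is a small sketch that builds a message with Python's standard `email` package. The helper name `build_message` is an assumption made for this example, and the addresses are the ones from the SMTP transcript above.

```python
from email.message import EmailMessage
from email.utils import formatdate, make_msgid, parseaddr


def build_message(sender, to, subject, body, cc=None):
    """Hypothetical helper: build an RFC 5322 message with the usual headers."""
    msg = EmailMessage()
    # The two headers every message must carry.
    msg["From"] = sender
    msg["Date"] = formatdate(localtime=True)
    # Optional, but present in most emails.
    msg["To"] = to
    if cc:
        msg["Cc"] = cc
    msg["Subject"] = subject
    # make_msgid builds "<time.pid.random@domain>"; we reuse the sender's domain.
    domain = parseaddr(sender)[1].rsplit("@", 1)[-1]
    msg["Message-ID"] = make_msgid(domain=domain)
    msg.set_content(body)
    return msg


message = build_message(
    '"Bob Example" <bob@example.org>',
    "alice@example.com",
    "Test message",
    "Hello Alice.",
    cc="theboss@example.com",
)
print(message.as_string())
```

Printing the message shows the same "headers, blank line, body" layout that the client sent after the DATA command.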

Reply-To directs replies to an address other than that specified in the From header.

The Sender header usually indicates that the message is being delivered by one person/party on behalf of another, as seen with a mailing list or delegated authority.

The presence of the In-Reply-To and References headers indicates that the message is a reply to a previous message:

  • The References header makes threaded mail reading and per-discussion archival possible. It's basically an ordered list of Message-IDs (the first Message-ID in the list belongs to the first message of the thread). When replying to an email, the email client copies the parent's References header and appends the parent's Message-ID to it, so the header grows in size with each reply in a series of messages.
  • In-Reply-To contains the Message-ID of the message to which this one is a reply (the parent message). If the parent message has no Message-ID, then the reply has no In-Reply-To field.
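The threading rule can be sketched in a few lines of Python. This is a simplified reading of RFC 5322 (it ignores the corner case where the parent has an In-Reply-To but no References header), and `add_reply_headers` is a hypothetical name used only here.

```python
from email.message import EmailMessage


def add_reply_headers(reply, parent):
    """Hypothetical helper: link `reply` to `parent` for threading."""
    parent_id = parent["Message-ID"]
    if parent_id is None:
        # Nothing to point at: the reply gets neither header.
        return reply
    reply["In-Reply-To"] = str(parent_id)
    # References = the parent's References (if any) + the parent's Message-ID.
    parent_refs = str(parent["References"] or "").split()
    reply["References"] = " ".join(parent_refs + [str(parent_id)])
    return reply


first = EmailMessage()
first["Message-ID"] = "<a1@example.org>"

second = EmailMessage()
second["Message-ID"] = "<b2@example.com>"
add_reply_headers(second, first)

third = EmailMessage()
add_reply_headers(third, second)

print(third["In-Reply-To"])  # the direct parent
print(third["References"])   # the whole thread, oldest first
```

The third message points at its direct parent through In-Reply-To, while References keeps the full chain, which is what lets a mail client rebuild the thread even if one message in the middle is missing.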

As we have seen, both the email format and the SMTP protocol are very simple and in clear text. It's incredibly easy to change the values of different fields, which opens a lot of doors for wannabe attackers to mess up communications. Over the years, a number of security mechanisms have been designed on top of these to improve email delivery security.

Security checks

When sending and receiving email there are multiple security mechanisms that can be used to prevent spam, spoofing, or phishing. SPF, DKIM, and DMARC are security standards that offer a layer of security for email recipients.

These checks have become so prominent that today most servers and relays receiving emails run them and store the results in the Authentication-Results header.
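To give an idea of what that header looks like, here is a rough sketch that extracts the result of each check from an Authentication-Results value. The header value below is a made-up example, `parse_authentication_results` is a hypothetical helper, and the parsing is deliberately naive (for instance, it does not handle comments in parentheses).

```python
import re


def parse_authentication_results(value):
    """Hypothetical helper: map each method (spf, dkim, dmarc...) to its result."""
    results = {}
    # The first ';'-separated item identifies the server that ran the checks.
    for item in value.split(";")[1:]:
        match = re.match(r"\s*([a-z0-9-]+)\s*=\s*([a-z]+)", item, re.IGNORECASE)
        if match:
            results[match.group(1).lower()] = match.group(2).lower()
    return results


# Made-up header value, for illustration only.
header_value = (
    "mx.example.com; "
    "spf=pass smtp.mailfrom=example.org; "
    "dkim=pass header.d=example.org; "
    "dmarc=pass header.from=example.org"
)
print(parse_authentication_results(header_value))
```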

SPF

Sender Policy Framework (SPF) is an email authentication method that helps protect senders' reputation and recipients from spam, spoofing, and phishing. It enables a server to check that the IP address trying to deliver an email over SMTP on behalf of a domain is actually authorized to do so.

SPF authentication flow

Here is an example of how it works. Server A, with IP address 1.2.3.4, sends an email whose envelope sender (the MAIL FROM address, which ends up in the Return-Path header) is bounces@matera.eu. The receiving server takes the domain of that address (matera.eu) and looks up its DNS records for the SPF TXT record. Once it finds the SPF TXT record, it checks whether the IP of the sender (1.2.3.4) is among the IPs listed as valid senders. If the address of the sender is in the list, the SPF result will be “pass”; otherwise it will be “fail” (or a softer variant such as “softfail”, depending on how the record is written).

A failing SPF check means that the email might come from an unauthorized / unknown source. The receiving server could reject the mail.
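The IP-matching part of that check can be sketched as follows. This is a simplified illustration, not a full SPF implementation: it assumes the TXT record has already been fetched from DNS, it only understands the `ip4`, `ip6` and `all` mechanisms (real records often use `include`, `a` or `mx`, which need further DNS lookups), and both `spf_check` and the record below are hypothetical.

```python
import ipaddress

# SPF qualifiers and the result they produce when a mechanism matches.
RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_check(record, sender_ip):
    """Hypothetical helper: evaluate a simplified SPF record for one IP."""
    ip = ipaddress.ip_address(sender_ip)
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record
    for term in terms[1:]:
        qualifier = "+"  # default qualifier when none is written
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name, _, arg = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(arg, strict=False)
            if ip.version == network.version and ip in network:
                return RESULTS[qualifier]
        elif name == "all":
            return RESULTS[qualifier]  # catch-all, usually the last term
        # Other mechanisms (include, a, mx...) are skipped in this sketch.
    return "neutral"


record = "v=spf1 ip4:1.2.3.4 ip4:10.0.0.0/24 -all"  # made-up record
print(spf_check(record, "1.2.3.4"))
print(spf_check(record, "5.6.7.8"))
```

With this made-up record, the first address is listed explicitly and the second one only matches the final `-all`, so it is the one that gets rejected.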

DKIM

DomainKeys Identified Mail (DKIM) is an email security standard that helps detect whether messages were altered in transit between the sending and receiving mail servers.

DKIM security flow

The receiving mail server extracts the DKIM-Signature header from the email, fetches the domain's DKIM record from DNS, recomputes the hashes of the body and of the signed headers, and verifies the signature using the public key from that DNS record. If the message hasn’t changed, the DKIM result will be “pass”; otherwise it will be “fail”.

When the DKIM check fails, it usually means that the message was altered between sending and receiving: someone could have added a malicious link or script to it, or altered its headers.
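Part of that flow can be illustrated with the standard library alone. The sketch below parses the tags of a DKIM-Signature value, derives the DNS name where the public key lives, and checks the body hash (the `bh=` tag). It is only a partial check under stated assumptions: it supposes SHA-256 with "simple" body canonicalization, and it does not verify the cryptographic signature itself (the `b=` tag), which needs an RSA or Ed25519 library. All function names and the `mail2023` selector are hypothetical.

```python
import base64
import hashlib


def parse_tag_list(value):
    """Split 'k1=v1; k2=v2' into a dict, dropping whitespace inside values."""
    tags = {}
    for item in value.split(";"):
        name, sep, val = item.partition("=")
        if sep:
            tags[name.strip()] = "".join(val.split())
    return tags


def dns_name_for_key(tags):
    """The public key is published as a TXT record at <selector>._domainkey.<domain>."""
    return f"{tags['s']}._domainkey.{tags['d']}"


def canonicalize_body_simple(body):
    """'simple' canonicalization: ignore trailing empty lines, end with one CRLF."""
    while body.endswith(b"\r\n\r\n"):
        body = body[:-2]
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    return body


def body_hash(body):
    digest = hashlib.sha256(canonicalize_body_simple(body)).digest()
    return base64.b64encode(digest).decode("ascii")


def body_hash_matches(signature_value, body):
    """Partial check only: compares bh=, does not verify the b= signature."""
    return parse_tag_list(signature_value).get("bh") == body_hash(body)


body = b"Hello Alice.\r\n"
# Made-up signature value; the b= tag (the actual signature) is left out.
signature = (
    "v=1; a=rsa-sha256; c=simple/simple; d=example.org; s=mail2023; "
    f"h=from:to:subject; bh={body_hash(body)}"
)
print(dns_name_for_key(parse_tag_list(signature)))
print(body_hash_matches(signature, body))
print(body_hash_matches(signature, b"Hello Alice. Click here!\r\n"))
```

Changing a single character of the body changes the hash, which is how a receiving server notices that a message was modified after it was signed.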

DMARC

Domain-based Message Authentication, Reporting and Conformance (DMARC) is a standard that prevents spammers from using your domain to send email without your permission (spoofing). They typically do this by changing the From address so that the email looks like it’s coming from your domain when it’s not.

DMARC is built on top of DKIM and SPF to perform the actual security check. The standard also defines a way for domains to suggest what to do with an email if the check fails. This gives ISPs a way to deal with emails that look fraudulent. Let’s see how it does that:

DMARC validation flow

On the receiving email server, an email is DMARC valid when at least one of SPF or DKIM validates it with a domain that matches the domain of the From address: for SPF that is the MAIL FROM domain, for DKIM the signing domain (the d= tag of the signature). When neither check passes with a matching domain, the email is not DMARC valid.

Depending on the DMARC policy published by the domain that claims to send the email, one of the following actions is taken when the check fails:

  • REJECT: the email is not delivered
  • QUARANTINE: the email is sent to the spam folder
  • NONE: the email is delivered to the recipient.

The standard also defines a way for servers to report failing checks to the claiming domain so that misconfigurations of servers that send emails can be adjusted.
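The DMARC decision can be sketched as follows. Under the standard (RFC 7489), a message passes when at least one of SPF or DKIM passes with a domain aligned with the From address. This sketch assumes strict alignment (exact domain match); the default "relaxed" mode compares organizational domains and needs the Public Suffix List. The function names and the DNS record are made up for the example.

```python
def parse_dmarc_policy(record):
    """Extract the p= tag from a DMARC TXT record (defaults to 'none')."""
    tags = {}
    for item in record.split(";"):
        name, sep, val = item.partition("=")
        if sep:
            tags[name.strip().lower()] = val.strip()
    return tags.get("p", "none").lower()


def dmarc_passes(from_domain, spf_result, mail_from_domain, dkim_result, dkim_domain):
    """Strict-alignment sketch: one aligned pass (SPF or DKIM) is enough."""
    from_domain = from_domain.lower()
    spf_ok = spf_result == "pass" and mail_from_domain.lower() == from_domain
    dkim_ok = dkim_result == "pass" and dkim_domain.lower() == from_domain
    return spf_ok or dkim_ok


def disposition(passes, policy):
    """What the receiver does with the message, following the domain's policy."""
    if passes:
        return "deliver"
    actions = {"reject": "reject", "quarantine": "spam folder", "none": "deliver"}
    return actions.get(policy, "deliver")


# Made-up record, as it could be published at _dmarc.example.org.
policy = parse_dmarc_policy("v=DMARC1; p=quarantine; rua=mailto:dmarc@example.org")

# Spoofed mail: SPF passes, but for the attacker's own domain, and there is no DKIM.
spoofed = dmarc_passes("example.org", "pass", "attacker.example", "none", "")
print(disposition(spoofed, policy))

# Legitimate mail: the DKIM signature is valid and signed by example.org.
legit = dmarc_passes("example.org", "fail", "bounces.example.net", "pass", "example.org")
print(disposition(legit, policy))
```

The spoofed message shows why alignment matters: SPF alone passes, because the attacker controls the envelope domain, but that domain is not the one shown in the From address.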

This was a short deep dive into the emailing world. We saw that email uses SMTP to communicate, has specific fields to store the data to send, and goes through security checks with SPF, DKIM and DMARC. I hope you enjoyed this article!
