SMTP protocol basics from scratch in Go: receiving email from Gmail


[ comments ]

I've never run my own mail server before. Before today I had no clue how email worked under the hood other than the very few times I've set up mail clients.

I've heard no few times how hard it is to send mail from a self-hosted server (because of spam filters). But how hard can it be to hook up DNS to my personal server and receive email to my domain sent from Gmail or another real-world client?

I knew it would be simpler to just send local mail to a local mail server with a local mail client but that didn't seem as real. If I could send email from my Gmail account and receive it in my server I'd be happy.

I spent the afternoon digging into this. All code is available on Github. The "live stream" is in the Multiprocess Discord's #hacking-networks channel.

DNS

First I bought a domain. (I needed to be able to mess around with records without blowing up anything important.)

I knew that MX records controlled where mail for a domain is sent. But I had to look up the specifics. You need to create an MX record that points to an A or AAAA record. So you need both an MX record and an A or AAAA record.

MX and A record settings

Done.

Firewall

The firewall on Fedora is aggressive. Gotta open up port 25.

$ sudo firewall-cmd --zone=dmz --add-port=25/tcp --permanent
$ sudo firewall-cmd --zone=public --add-port=25/tcp --permanent
$ sudo firewall-cmd --reload

I don't understand what zones are here.

What protocols?

I knew that you send email with SMTP and you read it with POP3 or IMAP. But it hadn't clicked before that the mail server has to speak SMTP and if you only ever read on the server (which is of course impractical in the real world) you don't need POP3 or IMAP.

SMTP vs POP3

So to meaningfully receive email from Gmail all I needed to do was implement SMTP.

SMTP

First I found the RFC for SMTP (or one of them anyway) and the wikipedia page for it.

First off I'd need to run a TCP server on port 25.

package main
import (
        "errors"
        "log"
        "net"
        "strconv"
        "strings"
)
func logError(err error) {
        log.Printf("[ERROR] %s\n", err)
}
func logInfo(msg string) {
        log.Printf("[INFO] %s\n", msg)
}
type message struct {
        clientDomain string
        smtpCommands  map[string]string
        atmHeaders   map[string]string
        body         string
        from         string
        date         string
        subject      string
        to           string
}
type connection struct {
        conn net.Conn
        id   int
        buf  []byte
}
// TODO
func (c *connection) handle() {
        // TODO
}
func main() {
        l, err := net.Listen("tcp", "0.0.0.0:25")
        if err != nil {
                panic(err)
        }
        defer l.Close()
        logInfo("Listening")
        id := 0
        for {
                conn, err := l.Accept()
                if err != nil {
                        logError(err)
                        continue
                }
                id += 1
                c := connection{conn, id, nil}
                go c.handle()
        }
}

Just a basic TCP server that passes off connections inside a goroutine.

Greeting

After starting a connection, the server must send a greeting. The successful greeting response code is 220. It can optionally be followed by additional text. Like most commands in SMTP it must be ended with CRLF (\r\n).

So we'll add a helper function for writing lines that end in CRLF:

func (c *connection) writeLine(msg string) error {
        msg += "\r\n"
        for len(msg) > 0 {
                n, err := c.conn.Write([]byte(msg))
                if err != nil {
                        return err
                }
                msg = msg[n:]
        }
        return nil
}

And then we'll send that 220 in the handle function.

func (c *connection) handle() {
        defer c.conn.Close()
        c.logInfo("Connection accepted")
        err := c.writeLine("220")
        if err != nil {
                c.logError(err)
                return
        }
        c.logInfo("Awaiting EHLO")
        // TODO

EHLO

Next we need to be able to read requests from the client. We'll write a helper that reads until the next CRLF. We'll keep a buffer of unread bytes in case we accidentally get bytes past the next CRLF. We'll store that buffer in the connection object.

func (c *connection) readLine() (string, error) {
        for {
                b := make([]byte, 1024)
                n, err := c.conn.Read(b)
                if err != nil {
                        return "", err
                }
                c.buf = append(c.buf, b[:n]...)
                for i, b := range c.buf {
                        // If end of line
                        if b == '\n' && i > 0 && c.buf[i-1] == '\r' {
                                // i-1 because drop the CRLF, no one cares after this
                                line := string(c.buf[:i-1])
                                c.buf = c.buf[i+1:]
                                return line, nil
                        }
                }
        }
}

Now back in the handle-er we can read a line from the client. From the RFC we can see it should be HELO or EHLO. Both sendmail locally and Gmail only send EHLO though so we'll just check for that.

EHLO response format

So we'll validate the message sent is an EHLO and then we'll send back a 250 with a space after it. We can ignore the rest of that response grammar since we don't have additional keywords we want to send to the client.

        ...
        c.logInfo("Awaiting EHLO")
        line, err := c.readLine()
        if err != nil {
                c.logError(err)
                return
        }
        if !strings.HasPrefix(line, "EHLO") {
                c.logError(errors.New("Expected EHLO got: " + line))
                return
        }
        msg := message{
                smtpCommands: map[string]string{},
                atmHeaders:  map[string]string{},
        }
        msg.clientDomain = line[len("EHLO "):]
        c.logInfo("Received EHLO")
        err = c.writeLine("250 ")
        if err != nil {
                c.logError(err)
                return
        }
        c.logInfo("Done EHLO")
        // TODO

Additional commands

Next up there are a few commands we need to read before we get to the message body. These include the recipient and the sender address. These are formatted vaguely similar to HTTP headers. They have a key on the left side of a colon and a value on the right. They may have a required order too, I'm not sure.

In response to the commands we'll send a 250 OK, although I'm not sure where in the RFC that is suggested.

In our code we'll just keep looking for these commands until we find the DATA command. This indicates the body is to follow. And to this command we respond with a 354 instead of a 250 OK.

DATA response

In code:

        ...
        c.logInfo("Done EHLO")
        for line != "" {
                line, err = c.readLine()
                if err != nil {
                        c.logError(err)
                        return
                }
                pieces := strings.SplitN(line, ":", 2)
                smtpCommand := strings.ToUpper(pieces[0])
                // Special command without a value
                if smtpCommand == "DATA" {
                        err = c.writeLine("354")
                        if err != nil {
                                c.logError(err)
                                return
                        }
                        break
                }
                smtpValue := pieces[1]
                msg.smtpCommands[smtpCommand] = smtpValue
                c.logInfo("Got command: " + line)
                err = c.writeLine("250 OK")
                if err != nil {
                        c.logError(err)
                        return
                }
        }
        c.logInfo("Done SMTP commands, reading ARPA text message headers")
        // TODO

Message body, headers

Now that we've seen the DATA command we are within a message body. Within this body we still have to read some additional headers.

Through trial-and-error I know to look for some headers like Subject. By searching the RFC I noticed a reference to RFC 822 where these headers are defined.

ARPA text message headers

These are ARPA internet text message headers apparently. They also look like HTTP headers but unlike HTTP headers they can span multiple lines. This stumped me for a bit.

Multi-line headers

I decided to write a new readLine function that would specifically look for these possibly multi-line headers where a CRLF followed by whitespace isn't a line delimiter.

func (c *connection) readMultiLine() (string, error) {
        for {
                noMoreReads := false
                for i, b := range c.buf {
                        if i > 1 &&
                                b != ' ' &&
                                b != '\t' &&
                                c.buf[i-2] == '\r' &&
                                c.buf[i-1] == '\n' {
                                // i-2 because drop the CRLF, no one cares after this
                                line := string(c.buf[:i-2])
                                c.buf = c.buf[i:]
                                return line, nil
                        }
                        noMoreReads = c.isBodyClose(i)
                }
                if !noMoreReads {
                        b := make([]byte, 1024)
                        n, err := c.conn.Read(b)
                        if err != nil {
                                return "", err
                        }
                        c.buf = append(c.buf, b[:n]...)
                        // If this gets here more than once it's going to be an infinite loop
                }
        }
}
func (c *connection) isBodyClose(i int) bool {
        return i > 4 &&
                c.buf[i-4] == '\r' &&
                c.buf[i-3] == '\n' &&
                c.buf[i-2] == '.' &&
                c.buf[i-1] == '\r' &&
                c.buf[i-0] == '\n'
}

Now back in the handle function we can read through all of these headers. According to RFC 822, we're done when we see a double CRLF, which in our code will show up as an empty line.

        ...
        c.logInfo("Done SMTP headers, reading ARPA text message headers")
        for {
                line, err = c.readMultiLine()
                if err != nil {
                        c.logError(err)
                        return
                }
                if strings.TrimSpace(line) == "" {
                        break
                }
                pieces := strings.SplitN(line, ": ", 2)
                atmHeader := strings.ToUpper(pieces[0])
                atmValue := pieces[1]
                msg.atmHeaders[atmHeader] = atmValue
                if atmHeader == "SUBJECT" {
                        msg.subject = atmValue
                }
                if atmHeader == "TO" {
                        msg.to = atmValue
                }
                if atmHeader == "FROM" {
                        msg.from = atmValue
                }
                if atmHeader == "DATE" {
                        msg.date = atmValue
                }
        }
        c.logInfo("Done ARPA text message headers, reading body")
        // TODO

Body, for real

We're finally at the email body as the user typed it. According to the SMTP RFC the body ends with a CRLF followed by a dot (period) followed by a CRLF.

So we'll write another helper to read until this marker.

func (c *connection) readToEndOfBody() (string, error) {
        for {
                for i := range c.buf {
                        if c.isBodyClose(i) {
                                return string(c.buf[:i-4]), nil
                        }
                }
                b := make([]byte, 1024)
                n, err := c.conn.Read(b)
                if err != nil {
                        return "", err
                }
                c.buf = append(c.buf, b[:n]...)
        }
}

And we can finish up the handle function.

        c.logInfo("Done ARPA text message headers, reading body")
        msg.body, err = c.readToEndOfBody()
        if err != nil {
                c.logError(err)
                return
        }
        c.logInfo("Got body (%d bytes)", len(msg.body))
        err = c.writeLine("250 OK")
        if err != nil {
                c.logError(err)
                return
        }
        c.logInfo("Message:\n%s\n", msg.body)
        c.logInfo("Connection closed")
}

Compile, setcap, run, and send

$ go build
$ sudo setcap 'cap_net_bind_service=+ep' ./gomail
$ ./gomail

And send an email in Gmail! It can be to any user since we haven't implemented anything regarding users. I'll send What hath god wrought as the subject and message to morse@binutils.org.

And I see:

2022/02/21 02:17:19 [INFO] Listening
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Connection accepted
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Awaiting EHLO
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Received EHLO
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Done EHLO
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Got header: MAIL FROM:
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Got header: RCPT TO:
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Done SMTP headers, reading ARPA text message headers
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Done ARPA text message headers, reading body
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Got body (256 bytes)
2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Message:
--000000000000c4758905d87ddb81
Content-Type: text/plain; charset="UTF-8"
What hath god wrought
--000000000000c4758905d87ddb81
Content-Type: text/html; charset="UTF-8"
What hath god wrought
--000000000000c4758905d87ddb81-- 2022/02/21 02:19:13 [INFO] [1: 209.85.222.47:40695] Connection closed

Which is pretty sweet!

Multipart wut

Ok this body still clearly has some format. And if we dump the ARPA text message headers we notice that Gmail 1) sets a Content-Type header and 2) it's value is multipart/alternative. I don't know where Content-Type as a valid header is defined because it's not in RFC 822. Maybe it's some "new-fangled" adhoc standard or maybe it's just in an extension RFC.

In any case this looks like multipart bodies in HTTP. I don't want to deal with that so I'm just going to stop here.

But I am curious about text-only email systems. So I sudo dnf install php sendmail and write a quick PHP script (thanks to @Josh on Discord for the suggestion):

morse@binutils.org", "What hath god wrought", "What hath god wrought", "");
?>

And run it:

$ php test.php

And in my gomail window I see:

2022/02/21 02:24:17 [INFO] Listening
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Connection accepted
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Awaiting EHLO
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Received EHLO
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Done EHLO
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Got header: MAIL From:
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Got header: RCPT To:
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Done SMTP headers, reading ARPA text message headers
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Done ARPA text message headers, reading body
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Got body (21 bytes)
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Message:
What hath god wrought
2022/02/21 02:24:18 [INFO] [1: 127.0.0.1:45102] Connection closed

And I'm happy to call it a night.

I wrote a new blog post on building an SMTP server from scratch in Go that is correctly enough hooked up you can receive emails sent from Gmail to it!

Good fun and some learning too.https://t.co/8pYkkAbFnI

— Phil Eaton (@phil_eaton) February 21, 2022

p.s. if you want to see more networking software/hardware internals check out /r/NetworkDevelopment.

Feedback

As always, please email or tweet me with questions, corrections, or ideas!

[ comments ]