Skip to content

Commit

Permalink
smtpsrv: Strict CRLF enforcement in DATA contents
Browse files Browse the repository at this point in the history
The RFCs are very clear that in DATA contents:

> CR and LF MUST only occur together as CRLF; they MUST NOT appear
> independently in the body.

https://www.rfc-editor.org/rfc/rfc5322#section-2.3
https://www.rfc-editor.org/rfc/rfc5321#section-2.3.8

Allowing "independent" CR and LF can cause a number of problems.

In particular, there is a new "SMTP smuggling attack" published recently
that involves the server incorrectly parsing the end of DATA marker
`\r\n.\r\n`, which an attacker can exploit to impersonate a server when
email is transmitted server-to-server.

https://www.postfix.org/smtp-smuggling.html
https://sec-consult.com/blog/detail/smtp-smuggling-spoofing-e-mails-worldwide/

Currently, chasquid is vulnerable to this attack, because Go's standard
libraries net/textproto and net/mail do not enforce CRLF strictly.

This patch fixes the problem by introducing a new "dot reader" function
that strictly enforces CRLF when reading dot-terminated data, used in
the DATA input processing.

When an invalid newline terminator is found, the connection is aborted
immediately because we cannot safely recover from that state.

We still keep the internal representation as LF-terminated for
convenience and simplicity.

However, the MDA courier is changed to pass CRLF-terminated lines, since
that is an external program which could be strict when receiving email
messages.

See #47 for more details and
discussion.
  • Loading branch information
albertito committed Dec 23, 2023
1 parent e03594a commit 8f19a0f
Show file tree
Hide file tree
Showing 15 changed files with 422 additions and 86 deletions.
8 changes: 7 additions & 1 deletion internal/courier/mda.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ import (
"unicode"

"blitiri.com.ar/go/chasquid/internal/envelope"
"blitiri.com.ar/go/chasquid/internal/normalize"
"blitiri.com.ar/go/chasquid/internal/trace"
)

Expand Down Expand Up @@ -60,7 +61,12 @@ func (p *MDA) Deliver(from string, to string, data []byte) (error, bool) {
ctx, cancel := context.WithTimeout(context.Background(), p.Timeout)
defer cancel()
cmd := exec.CommandContext(ctx, p.Binary, args...)
cmd.Stdin = bytes.NewReader(data)

// Pass the email data via stdin. Normalize it to CRLF which is what the
// RFC-compliant representation require. By doing this at this end, we can
// keep a simpler internal representation and ensure there won't be any
// inconsistencies in newlines within the message (e.g. added headers).
cmd.Stdin = bytes.NewReader(normalize.ToCRLF(data))

output, err := cmd.CombinedOutput()
if ctx.Err() == context.DeadlineExceeded {
Expand Down
21 changes: 21 additions & 0 deletions internal/normalize/normalize.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
package normalize

import (
"bytes"
"strings"

"blitiri.com.ar/go/chasquid/internal/envelope"
Expand Down Expand Up @@ -72,3 +73,23 @@ func DomainToUnicode(addr string) (string, error) {
domain, err := Domain(domain)
return user + "@" + domain, err
}

// ToCRLF converts the given buffer to CRLF line endings. If a line has a
// preexisting CRLF, it leaves it be. It assumes that CR is never used on its
// own.
func ToCRLF(in []byte) []byte {
b := bytes.NewBuffer(nil)
b.Grow(len(in))
for _, c := range in {
switch c {
case '\r':

This comment has been minimized.

Copy link
@ThinkChaos

ThinkChaos Dec 23, 2023

Contributor

Maybe this should return an error since that's never supposed to happen and means there's a bug in chasquid?

This comment has been minimized.

Copy link
@albertito

albertito Dec 23, 2023

Author Owner

I am not convinced about this.

"Lonely CR" characters absolutely should not happen, but that is not what this case is checking.

CR characters as part of a CRLF sequence (which will enter this case) should not normally happen because of the endpoint normalization to LF-terminated lines; however there are places where they may sneak in, like hook output.

While it is probably a good idea to close those gaps, I don't think taking such a hard stance of erroring out at this point in time is practical.

We could emit a warning instead, but that also requires care on how, etc..

Or we could separately enforce the LF-terminated internal representation and erroring here would make a lot of sense as a defense-in-depth element of that enforcement; but that is a more comprehensive amount of work that will take more time and care.

So I'm more inclined to leave this as-is for now, and in the future (and with a lot more time), see if introducing a more structured and enforced LF-terminated internal representation throughout the code is worth it.

Does that make sense to you?

Thank you!

This comment has been minimized.

Copy link
@ThinkChaos

ThinkChaos Dec 23, 2023

Contributor

Yes that all sounds good. Thanks for the thorough response!

// Ignore CR, we'll add it back later. It should never appear
// alone in the contexts where this is function is used.

This comment has been minimized.

Copy link
@znerol

znerol Dec 23, 2023

Contributor

Small typo in comment. Should read where this function is used..

This comment has been minimized.

Copy link
@albertito

albertito Dec 23, 2023

Author Owner

Thank you, fixed it!

case '\n':
b.Write([]byte("\r\n"))
default:
b.WriteByte(c)
}
}
return b.Bytes()
}
16 changes: 16 additions & 0 deletions internal/normalize/normalize_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -129,6 +129,22 @@ func TestDomainToUnicode(t *testing.T) {
}
}

func TestToCRLF(t *testing.T) {
cases := []struct {
in, out string
}{
{"", ""},
{"a\nb", "a\r\nb"},
{"a\r\nb", "a\r\nb"},
}
for _, c := range cases {
got := string(ToCRLF([]byte(c.in)))
if got != c.out {
t.Errorf("ToCRLF(%q) = %q, expected %q", c.in, got, c.out)
}
}
}

func FuzzUser(f *testing.F) {
f.Fuzz(func(t *testing.T, user string) {
User(user)
Expand Down
42 changes: 13 additions & 29 deletions internal/smtpsrv/conn.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@ import (
"math/rand"
"net"
"net/mail"
"net/textproto"
"os"
"os/exec"
"strconv"
Expand Down Expand Up @@ -312,6 +311,9 @@ loop:
if err != nil {
break
}
} else if code < 0 {
// Negative code means that we have to break the connection.
break
}
}

Expand Down Expand Up @@ -638,19 +640,19 @@ func (c *Conn) DATA(params string) (code int, msg string) {
// one, we don't want the command timeout to interfere.
c.conn.SetDeadline(c.deadline)

// Create a dot reader, limited to the maximum size.
dotr := textproto.NewReader(bufio.NewReader(
io.LimitReader(c.reader, c.maxDataSize))).DotReader()
c.data, err = io.ReadAll(dotr)
// Read the data. Enforce CRLF correctness, and maximum size.
c.data, err = readUntilDot(c.reader, int(c.maxDataSize))

This comment has been minimized.

Copy link
@ThinkChaos

ThinkChaos Dec 23, 2023

Contributor

int(c.maxDataSize) could overflow and I don't see a downside to pushing the cast down into readUntilDot.

This comment has been minimized.

Copy link
@albertito

albertito Dec 23, 2023

Author Owner

Thanks for catching this!

readUntilDot now takes an int64, and it does an int -> int64 cast inside of it instead.

if err != nil {
if err == io.ErrUnexpectedEOF {
// Message is too big already. But we need to keep reading until we see
// the "\r\n.\r\n", otherwise we will treat the remanent data that
// the user keeps sending as commands, and that's a security
// issue.
readUntilDot(c.reader)
if err == errMessageTooLarge {
// Message is too big; excess data has already been discarded.
return 552, "5.3.4 Message too big"
}
if err == errInvalidLineEnding {
// We can't properly recover from this, so we have to drop the
// connection.
c.writeResponse(521, "5.5.2 Error reading DATA: invalid line ending")
return -1, "Invalid line ending, closing connection"
}
return 554, fmt.Sprintf("5.4.0 Error reading DATA: %v", err)
}

Expand Down Expand Up @@ -952,24 +954,6 @@ func boolToStr(b bool) string {
return "0"
}

func readUntilDot(r *bufio.Reader) {
prevMore := false
for {
// The reader will not read more than the size of the buffer,
// so this doesn't cause increased memory consumption.
// The reader's data deadline will prevent this from continuing
// forever.
l, more, err := r.ReadLine()
if err != nil {
break
}
if !more && !prevMore && string(l) == "." {
break
}
prevMore = more
}
}

// STARTTLS SMTP command handler.
func (c *Conn) STARTTLS(params string) (code int, msg string) {
if c.onTLS {
Expand Down
53 changes: 0 additions & 53 deletions internal/smtpsrv/conn_test.go
Original file line number Diff line number Diff line change
@@ -1,10 +1,8 @@
package smtpsrv

import (
"bufio"
"net"
"os"
"strings"
"testing"

"blitiri.com.ar/go/chasquid/internal/domaininfo"
Expand Down Expand Up @@ -87,57 +85,6 @@ func TestIsHeader(t *testing.T) {
}
}

func TestReadUntilDot(t *testing.T) {
// This must be > than the minimum buffer size for bufio.Reader, which
// unfortunately is not available to us. The current value is 16, these
// tests will break if it gets increased, and the nonfinal cases will need
// to be adjusted.
size := 20
xs := "12345678901234567890"

final := []string{
"", ".", "..",
".\r\n", "\r\n.", "\r\n.\r\n",
".\n", "\n.", "\n.\n",
".\r", "\r.", "\r.\r",
xs + "\r\n.\r\n",
xs + "1234\r\n.\r\n",
xs + xs + "\r\n.\r\n",
xs + xs + xs + "\r\n.\r\n",
xs + "." + xs + "\n.",
xs + ".\n" + xs + "\n.",
}
for _, s := range final {
t.Logf("testing %q", s)
buf := bufio.NewReaderSize(strings.NewReader(s), size)
readUntilDot(buf)
if r := buf.Buffered(); r != 0 {
t.Errorf("%q: there are %d remaining bytes", s, r)
}
}

nonfinal := []struct {
s string
r int
}{
{".\na", 1},
{"\n.\na", 1},
{"\n.\nabc", 3},
{"\n.\n12345678", 8},
{"\n.\n" + xs, size - 3},
{"\n.\n" + xs + xs, size - 3},
{"\n.\n.\n", 2},
}
for _, c := range nonfinal {
t.Logf("testing %q", c.s)
buf := bufio.NewReaderSize(strings.NewReader(c.s), size)
readUntilDot(buf)
if r := buf.Buffered(); r != c.r {
t.Errorf("%q: expected %d remaining bytes, got %d", c.s, c.r, r)
}
}
}

func TestAddrLiteral(t *testing.T) {
// TCP addresses.
casesTCP := []struct {
Expand Down
103 changes: 103 additions & 0 deletions internal/smtpsrv/dotreader.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
package smtpsrv

import (
"bufio"
"bytes"
"errors"
"io"
)

var (
errMessageTooLarge = errors.New("message too large")
errInvalidLineEnding = errors.New("invalid line ending")
)

// readUntilDot reads from r until it encounters a dot-terminated line, or we
// read max bytes. It enforces that input lines are terminated by "\r\n", and
// that there are not "lonely" "\r" or "\n"s in the input.
// It returns \n-terminated lines, which is what we use for our internal
// representation for convenience (same as textproto DotReader does).
func readUntilDot(r *bufio.Reader, max int) ([]byte, error) {
buf := make([]byte, 0, 1024)
n := 0

// Little state machine.
const (
prevOther = iota
prevCR
prevCRLF
)
// Start as if we just came from a '\r\n'; that way we avoid the need
// for special-casing the dot-stuffing at the very beginning.
prev := prevCRLF
last4 := make([]byte, 4)
skip := false

loop:
for {
b, err := r.ReadByte()
if err == io.EOF {
return buf, io.ErrUnexpectedEOF
} else if err != nil {
return buf, err
}
n++

switch b {
case '\r':
if prev == prevCR {
return buf, errInvalidLineEnding
}
prev = prevCR
// We return a LF-terminated line, so skip the CR. This simplifies
// internal representation and makes it easier/less error prone to
// work with. It is converted back to CRLF on endpoints (e.g. in
// the couriers).
skip = true
case '\n':
if prev != prevCR {
return buf, errInvalidLineEnding
}
// If we come from a '\r\n.\r', we're done.
if bytes.Equal(last4, []byte("\r\n.\r")) {
break loop
}

// If we are only starting and see ".\r\n", we're also done; in
// that case the message is empty.
if n == 3 && bytes.Equal(last4, []byte("\x00\x00.\r")) {
return []byte{}, nil
}
prev = prevCRLF
default:
if prev == prevCR {
return buf, errInvalidLineEnding
}
if b == '.' && prev == prevCRLF {
// We come from "\r\n" and got a "."; as per dot-stuffing
// rules, we should skip that '.' in the output.
// https://www.rfc-editor.org/rfc/rfc5321#section-4.5.2
skip = true
}
prev = prevOther
}

// Keep the last 4 bytes separately, because they may not be in buf on
// messages that are too large.
copy(last4, last4[1:])
last4[3] = b

if len(buf) < max && !skip {
buf = append(buf, b)
}
skip = false
}

if n > max {
return buf, errMessageTooLarge
}

This comment has been minimized.

Copy link
@znerol

znerol Dec 23, 2023

Contributor

This seems to be outside the loop. Shouldn't that be inside?

This comment has been minimized.

Copy link
@albertito

albertito Dec 23, 2023

Author Owner

No, it is very important that this is outside.

Even if we reach the maximum message size, we need to keep reading until we get to the final "." before we return an error, so the SMTP dialog can continue.

If we break early, the remainder of the email is interpreted as part of the SMTP dialog (and exposing ourselves to smuggling attacks).

This was already done before, in a separate function. I decided to put it here because in the end the logic ends up simpler.

But since you noticed, this is not obvious at all, so I added a comment.

Thank you!

This comment has been minimized.

Copy link
@ThinkChaos

ThinkChaos Dec 23, 2023

Contributor

I'm also curious of why the len(buf) < max isn't just checking n directly and exiting early. A comment would be welcome IMO.

My best guess is to support pipelining: returning errMessageTooLarge at the end allows the caller to continue parsing the following bytes as a different message.
errInvalidLineEnding exits early since it's fatal.

This comment has been minimized.

Copy link
@albertito

albertito Dec 23, 2023

Author Owner

I'm also curious of why the len(buf) < max isn't just checking n directly and exiting early. A comment would be welcome IMO.

I added one in a new version of the patch as mentioned above:
eaa3139#diff-9e2090c5edaae9127658538272db08fc16517b2390d6b32f6c72787d40291fe5R96

My best guess is to support pipelining: returning errMessageTooLarge at the end allows the caller to continue parsing the following bytes as a different message. errInvalidLineEnding exits early since it's fatal.

This is exactly why.

If we don't keep reading until the "\r\n.\r\n", the rest of the message will be interpreted as SMTP commands.

As mentioned above, this is not new behaviour: the code always did it this way, that's what the old ReadUntilDot helper function did (now removed in favour of this, completely new and rewritten implementation).

I hope this helps clarify but please let me know if it's still not clear or you have any further suggestion!

This comment has been minimized.

Copy link
@ThinkChaos

ThinkChaos Dec 23, 2023

Contributor

Ah I missed your comment since it was posted after I loaded the page. Thanks for the extra details!


// If we made it this far, buf naturally ends in "\n" because we skipped
// the '.' due to dot-stuffing, and skip "\r"s.
return buf, nil
}
Loading

0 comments on commit 8f19a0f

Please sign in to comment.