The occasional ECONNRESET

102 points | 23 comments | 11 hours ago
smarks

Part 2 shows this comment from the Linux TCP code:

    /* As outlined in RFC 2525, section 2.17, we send a RST here because
     * data was lost. To witness the awful effects of the old behavior of
     * always doing a FIN, run an older 2.1.x kernel or 2.0.x, start a bulk
     * GET in an FTP client, suspend the process, wait for the client to
     * advertise a zero window, then kill -9 the FTP client, wheee...
     * Note: timeout is always zero in such a case.
     */
Ok, so the RST is explained and well justified by the literature. But what are the “awful effects” of sending FIN instead? Can someone explain?
toast0

Might want to read the section on Lingering Close from here:

https://httpd.apache.org/docs/2.4/misc/perf-tuning.html

kune

The RST (Reset) is sent to inform the client that the data it sent was not read by the server. Here the RST avoids the 4-way handshake for closing the TCP connection, and the long wait times if the client doesn't behave normally.

For the case here, the server should call shutdown() with SHUT_WR after sending the data, and then drain the incoming data before closing the socket.
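
A minimal sketch of that sequence in C, assuming a connected, blocking stream socket. The helper name graceful_close is hypothetical and not taken from the article or the kernel. It also assumes the peer eventually closes its side; a real server would likely bound the drain loop with a timeout so a misbehaving client cannot hold the connection open indefinitely.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: half-close, drain, then close.
 * Assumes fd is a connected, blocking stream socket and that all
 * response data has already been written to it.
 * Returns 0 on success, -1 on error (errno is set).
 */
int graceful_close(int fd)
{
    char buf[4096];
    ssize_t n;

    /* Send our FIN: no more writes from us, but the read side stays open. */
    if (shutdown(fd, SHUT_WR) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    /* Discard whatever the peer still sends until it closes (read returns 0)
     * or a real error occurs. Retry if interrupted by a signal. */
    do {
        n = read(fd, buf, sizeof(buf));
    } while (n > 0 || (n < 0 && errno == EINTR));

    if (n < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    /* The receive queue is empty at this point, so close() has no unread
     * data to complain about. */
    return close(fd);
}
```

The point of the ordering is that close() only runs once nothing is left unread on the socket, which is the condition under which the kernel comment quoted above says a RST is sent.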

bayesnet

Really love this article. Opens with the problem statement and jumps straight into the investigation. Thanks for a very enjoyable read (and an rss feed!)

gunsch

A few months ago I was debugging a similar issue in a Go-based service layer, where frequent HTTP requests to the same domain kept making fresh TCP connections when I was expecting TCP conn reuse.

In this situation we were discarding the HTTP response without reading it before closing, which kept Go from reusing the connection. I didn't dig quite as deep as this post's author, but I imagine the same RST behavior was happening under the hood.

Joker_vD

> Send off the data and close the socket. If there's data still pending to be read, this will cause a RST, I think.

Um, yes? That's how TCP has been universally implemented for more than 30 years. See [0], 2.17 for discussion.

[0] https://www.rfc-editor.org/rfc/rfc2525#page-50

jcalvinowens

As others have noted, this usually happens because both sides wrote data and one side didn't read it before calling close().

Here's a little reproducer: https://gist.github.com/jcalvinowens/da57edda9a01ca9f4c4088a...

    $ gcc -O2 test.c -o test
    
    $ strace -e socket,connect,write,accept,read,close ./test --rx
    <...>
    socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
    accept(3, NULL, NULL)                   = 4
    close(3)                                = 0
    read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    <...>
    read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 3035
    read(4, "", 4096)                       = 0
    close(4)                                = 0
    +++ exited with 0 +++

    $ strace -e socket,connect,write,accept,read,close ./test --tx
    <...>
    socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
    connect(3, {sa_family=AF_INET, sin_port=htons(31337), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
    write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 600000) = 600000
    close(3)                                = 0
    +++ exited with 0 +++
...versus:

    $ gcc -O2 -DWRITE_TO_SOCKET_BEFORE_READ test.c -o test
    
    $ strace -e socket,connect,write,accept,read,close ./test --rx
    <...>
    socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
    accept(3, NULL, NULL)                   = 4
    close(3)                                = 0
    write(4, "\250\3\0\0\0\0\0\0\250\3\0\0\0\0\0\0$\0\0\0\0\0\0\0$\0\0\0\0\0\0\0"..., 4096) = 4096
    read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
    <...>
    read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 997
    read(4, 0x7ffd45c2d3c0, 4096)           = -1 ECONNRESET (Connection reset by peer)
    <...>
    +++ exited with 1 +++
    
    $ strace -e socket,connect,write,accept,read,close ./test --tx
    <...>
    socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
    connect(3, {sa_family=AF_INET, sin_port=htons(31337), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
    write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 600000) = 600000
    close(3) 
    +++ exited with 0 +++