https://gitlab.synchro.net/main/sbbs/-/commit/ca7ab040466b030281a9aaca
Modified Files:
src/sbbs3/main.cpp
Log Message:
Add a 60-second timeout to sbbs_t::passthru_socket_activate()
Keyop reported an issue via IRC whereby a user who failed to download a file would leave the node "hung" in "downloading via telnet" node status, even
though the user had long since disconnected and the log reflected that the terminal server was aware of this:
term Node 4 <user> sexyz: !1152 zmodem_recv_raw TIMEOUT (10 seconds)
term Node 4 <user> sexyz: !zmodem_recv_header TIMEOUT
term Node 4 <user> external Timeout waiting for output buffer to empty
<minutes later>
term Node 4 connection reset by peer on send
term Node 4 !ERROR 32 sending on socket 102
term Node 4 !ERROR 32 sending on socket 102
term Node 4 !ERROR 32 sending on socket 102
term Node 4 !ERROR 32 sending on socket 102
term Node 4 !ERROR 32 sending on socket 102
term Node 4 disconnected
term Node 4 !ERROR 32 sending on socket 102
and
term Node 3 <user> sexyz: !1152 zmodem_recv_raw TIMEOUT (10 seconds)
term Node 3 <user> sexyz: !zmodem_recv_header TIMEOUT
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !Receive timeout (1 seconds)
term Node 3 <user> sexyz: !1152 zmodem_recv_raw TIMEOUT (10 seconds)
term Node 3 <user> sexyz: !zmodem_recv_header TIMEOUT
term Node 3 <user> external Timeout waiting for output buffer to empty
<minutes later>
term Node 3 connection reset by peer on receive
term Node 3 !ERROR 32 sending on socket 96
These nodes were then locked up in a call to passthru_socket_activate(false),
as reported by gdb.
Looking at the deactivation path of passthru_socket_activate() (called at the
end of external() in this case), it was clear that this loop could run forever if the user had disconnected:
do { // Allow time for the passthru_thread to move any pending socket data to the outbuf
SLEEP(100); // Before the node_thread starts sending its own data to the outbuf
} while(RingBufFull(&outbuf));
These flush/purge loops aren't strictly needed if the user has disconnected, but as the logs above show, the terminal server may not know that (the socket may not indicate a disconnect) before passthru_socket_activate()
is called by external().
So, worst case, the activation and deactivation buffer flushes
and purges now give up after 60 seconds.
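
For illustration only, the general shape of a time-bounded polling loop can be sketched as below. This is not the code from this commit: wait_while() and its parameters are hypothetical names, and standard C++ facilities (std::chrono, std::this_thread) stand in for Synchronet's SLEEP() and RingBufFull(). The idea is simply to poll at the same interval as before, but stop once a deadline passes.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not part of the Synchronet source): poll `still_busy`
// every `interval` until it returns false or `timeout` has elapsed.
// Returns true if the condition cleared, false if the wait timed out.
bool wait_while(const std::function<bool()>& still_busy,
                std::chrono::milliseconds timeout,
                std::chrono::milliseconds interval = std::chrono::milliseconds(100))
{
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    do {
        // Sleep first, like the SLEEP(100) at the top of the original do/while body
        std::this_thread::sleep_for(interval);
        if (!still_busy())
            return true;    // condition cleared before the deadline
    } while (std::chrono::steady_clock::now() < deadline);
    return false;           // gave up: the caller proceeds anyway
}
```

Under these assumptions, the deactivation path would call something like wait_while(buffer_is_full, std::chrono::seconds(60)), where buffer_is_full is a placeholder for the real buffer check, and carry on regardless of the result, so a disconnected user can stall the node for at most the timeout.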