udp_send: add a time window for errors
This implements a grace period during which ICMP Destination Unreachable /
Port Unreachable errors from the peer are ignored, in order to tolerate
* synchronisation problems (e.g. receiver started after sender);
* unforeseen events (e.g. delays, reboot, reconfiguration).
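
For context, a minimal sketch of how such an ICMP error can be observed by
the sender. It assumes a connected UDP socket on Linux, where a received
Port Unreachable is queued as a pending ECONNREFUSED on the socket. The
helper name pending_socket_error() is hypothetical and not taken from this
patch.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Fetch and clear the pending error of a socket.
 *
 * Returns 0 if no error is pending, otherwise a positive errno value,
 * e.g. ECONNREFUSED after an ICMP Port Unreachable was received for a
 * connected UDP socket (assumed Linux behaviour).
 */
static int pending_socket_error(int fd)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return errno; /* getsockopt() itself failed, e.g. EBADF */
	return err;
}
```

A caller would compare the return value against ECONNREFUSED and feed each
such event into the windowing logic described below.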
To avoid receiving persistent errors from unicast UDP clients, two time
windows are used:
(a) The 'error-allowed' period t_A
    This time window starts when the first ECONNREFUSED error is seen
    and ends after t_A seconds. The value of t_A is a guess which should
    cover the expected time needed to sort any receiver problems out.
(b) The 'error-free' period t_B
    During the t_B seconds following the interval t_A, no further
    connection errors are accepted; if an ECONNREFUSED is seen it
    will cause the target to be evicted from the list.
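This windowing process restarts itself, i.e. the first error seen after
both windows have expired (more than t_A+t_B seconds after the error that
opened them) resets the counters and starts a new 'error-allowed' period.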
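
The two windows can be sketched as a small state machine. This is an
illustration only, not the code of this patch: struct err_window,
err_window_update() and the constants T_A/T_B are hypothetical names, and
the handling of errors that arrive exactly on a window boundary is an
assumption.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical constants: length of both windows in seconds. */
enum { T_A = 30, T_B = 30 };

struct err_window {
	bool active; /* false until the first error has been seen */
	time_t t_0;  /* time of the error that opened the current window */
};

/*
 * To be called for each ECONNREFUSED seen at time 'now'.
 * Returns true if the target should be evicted.
 */
static bool err_window_update(struct err_window *w, time_t now)
{
	time_t elapsed;

	if (!w->active) {
		/* First error ever: open the error-allowed period. */
		w->active = true;
		w->t_0 = now;
		return false;
	}
	elapsed = now - w->t_0;
	if (elapsed > T_A + T_B) {
		/* Both windows have expired: restart at this error. */
		w->t_0 = now;
		return false;
	}
	if (elapsed <= T_A)
		return false; /* still within the error-allowed period */
	return true; /* error during the error-free period: evict */
}
```

With T_A = T_B = 30, errors at 0, 10 and 30 seconds after the first one are
tolerated, an error at 31 seconds evicts the target, and an error at 61
seconds or later opens a new window instead.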
The following examples illustrate the algorithm, where 'x' indicates
receipt of an ICMP error message.
1) Some errors received during initial receiver setup
    |-x-x-x-x-x-x-x--|------------|------....
    |                |            |
    t_0          t_0 + t_A   t_0+t_A+t_B
Since no errors are received after t_0 + t_A, no action will be taken.
2) Persistent errors
    |-x-x-x-x-x-x-x-x|-x-x-x-x-x-x|
    |                |            |
    t_0          t_0 + t_A   t_0+t_A+t_B
The first error received after t_0+t_A evicts the target.
3) Recurring short errors
    |-x-x-x-x-x-x-x--|-//--|--------//---|-x-x-x-x-x-x-x--|------------|---...
    |                      |             |                |            |
    t_0               t_0+t_A+t_B       t_1            t_1+t_A     t_1+t_A+t_B
Here the counter is reset by the error at t_1, the first one seen after
t_0+t_A+t_B. Since no more errors are seen after t_1+t_A, streaming
continues.
For simplicity, the implementation uses t_A = t_B = 30 seconds.
The behaviour with an unavailable receiver is now:
May 24 18:08:16 (0) (2702) vss_send: sending 123:0 (548 bytes)
May 24 18:08:18 (0) (2702) vss_send: sending 132:2 (548 bytes)
May 24 18:08:18 (2) (2702) udp_check_socket_state: Evicting 10.0.0.2#8000 after 31 seconds of connection errors.
May 24 19:34:55 (0) (2702) vss_send: sending 5:14 (1232 bytes)
May 24 19:35:10 (0) (2702) vss_send: sending 11:3 (1232 bytes)
May 24 19:35:10 (2) (2702) udp_check_socket_state: Evicting 3ffe::2#8000 after 31 seconds of connection errors.