Guus Sliepen [Sun, 11 Oct 2020 14:16:31 +0000 (16:16 +0200)]
When a new connection is activated, terminate any pending connections to the same peer.
This prevents issues mainly in the test suite where peers try to connect to
each other simultaneously, and have to terminate one of the connections.
Before, both connections would succeed and then both would be terminated,
leading to a loop of reconnections until enough randomness got in to break
the tie.
Guus Sliepen [Sun, 11 Oct 2020 13:40:34 +0000 (15:40 +0200)]
Don't reset the UDP SPTPS session when a node becomes reachable.
Only do this when it becomes unreachable. This fixes an issue where right
after a meta-connection is established, the initiator sends a proactive
REQ_KEY, before the peer really becomes reachable according to the graph.
When the latter happened, it would reset the session established so far,
causing a new REQ_KEY to be sent, which could cross the ANS_KEY from the
peer. This would resolve itself after a few seconds, but caused an
unnecessary delay that was easy to trigger.
Guus Sliepen [Sun, 11 Oct 2020 13:19:01 +0000 (15:19 +0200)]
Fix waiting for peer node to become reachable in test suite.
Guus Sliepen [Wed, 30 Sep 2020 20:16:22 +0000 (22:16 +0200)]
Fix corner cases when closing channels.
Closing a channel while there was data in the receive buffer would cause a
RST to be sent instead of a FIN. We now always send a FIN, and leave any data
in the receive buffer to be handled later (which will then send a RST if
necessary).
The RST could be dropped if the ACK seqno was not in the correct range.
We now always accept RSTs for established connections.
Finally, when receiving more data after closing the channel, we would just
accept the data but discard it, instead of sending a RST back. Now we do
send a RST back.
Guus Sliepen [Tue, 29 Sep 2020 20:38:55 +0000 (22:38 +0200)]
Send RST packets when receiving data after we closed a UDP channel.
If the application closed a channel, we keep the UTCP connection alive for
a bit longer to handle resends of FIN packets. However, if this is missed
for some reason, either because the FIN got lost or the peer ignored the
receive callback, and the peer is sending new data, we need to inform it
that we are no longer listening. To do this, send a RST back.
Guus Sliepen [Fri, 25 Sep 2020 20:00:55 +0000 (22:00 +0200)]
Extend the timeout period of the authentication phase on progress.
If there is progress during the authentication phase of connections, we reset
the ping timer to give extra time to complete the authentication.
We do the same for invitation connections.
Guus Sliepen [Fri, 25 Sep 2020 19:53:12 +0000 (21:53 +0200)]
Don't use fast timeouts for fully established connections.
During the fast retry period, we want to have a fast ping timeout until we have
a fully working connection. However, the code still used fast timeouts during
the fast retry window even if the connection was fully established.
Guus Sliepen [Fri, 25 Sep 2020 19:48:59 +0000 (21:48 +0200)]
Fix timeouts of 1 second expiring in less than one second.
Guus Sliepen [Thu, 10 Sep 2020 21:18:26 +0000 (23:18 +0200)]
Allow sptps_force_kex() in all situations.
Also allow sptps_force_kex() during the initial key exchange.
Guus Sliepen [Thu, 10 Sep 2020 21:13:39 +0000 (23:13 +0200)]
Allow sptps_force_kex() while a key exchange is in progress
We should not do anything if we are already exchanging a new key, and
just return true. This change prevents higher layers in MeshLink from
terminating a connection between two nodes if both peers call
sptps_force_kex() at nearly the same time.
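A minimal sketch of the behaviour described above, not the actual SPTPS code: the `session_t` type, the two-state enum and the `kex_started` counter are hypothetical stand-ins chosen for illustration. The point is that a second request while a renewal is already running is a successful no-op rather than an error.

```c
#include <stdbool.h>

/* Hypothetical, simplified session state; the real SPTPS state machine
 * has more states than this. */
typedef enum { KEX_IDLE, KEX_IN_PROGRESS } kex_state_t;

typedef struct {
    kex_state_t state;
    int kex_started; /* how many key exchanges were actually started */
} session_t;

/* Request a key renewal. Returns true on success. */
bool force_kex(session_t *s) {
    if (s->state == KEX_IN_PROGRESS) {
        /* A key exchange is already under way: do nothing, report success. */
        return true;
    }

    s->state = KEX_IN_PROGRESS;
    s->kex_started++;
    return true;
}
```

With this shape, two peers that both request a renewal at nearly the same time see `true` from both calls, so higher layers have no failure to react to.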
Guus Sliepen [Mon, 7 Sep 2020 19:12:28 +0000 (21:12 +0200)]
Use the canonical address exclusively for making outgoing meta-connections.
If we have a node's canonical address, we now always use that as a source
for addresses for outgoing meta-connection attempts. This commit also adds
the function meshlink_clear_canonical_address() to ensure the canonical
address can be removed if it is no longer valid.
Guus Sliepen [Mon, 7 Sep 2020 19:10:22 +0000 (21:10 +0200)]
Update the invite-join test to update the canonical address after a port change.
Guus Sliepen [Mon, 7 Sep 2020 15:58:23 +0000 (17:58 +0200)]
Always ensure we store a port number when setting the canonical address.
Guus Sliepen [Sat, 5 Sep 2020 09:52:12 +0000 (11:52 +0200)]
Use the fast retry period of the destination node's device class.
Guus Sliepen [Tue, 4 Aug 2020 13:24:07 +0000 (15:24 +0200)]
Remove temporary files at startup.
When something happens while a host config file is being written, a
temporary file might be left over. Clean these up when we find them when
starting MeshLink.
Guus Sliepen [Wed, 29 Jul 2020 12:44:42 +0000 (14:44 +0200)]
Add meshlink_set_channel_listen_cb().
The accept callback is called when the peer has already fully established a
connection. The listen callback is called earlier, when there is no
fully established channel yet. However, the listen callback itself does not
get a channel handle; it can only decide, based on the peer node and port
number, whether to accept the channel. If it accepts, the accept callback
will be called later.
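A sketch of what such a listen callback could look like. It assumes the callback receives the mesh handle, the peer node and a 16-bit port and returns a bool; that signature is inferred from the description above and should be checked against meshlink.h. The opaque struct declarations and `ECHO_PORT` are placeholders so the sketch compiles on its own.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque stand-ins so this compiles without meshlink.h. */
struct meshlink_handle;
struct meshlink_node;

#define ECHO_PORT 7 /* hypothetical application port */

/* No channel handle is available at this point, only the peer and the port.
 * Returning true lets the channel proceed to the accept callback later. */
bool listen_cb(struct meshlink_handle *mesh, struct meshlink_node *node, uint16_t port) {
    (void)mesh;
    (void)node; /* a real callback might also filter on the peer node */
    return port == ECHO_PORT;
}
```

Such a function would be registered with meshlink_set_channel_listen_cb(), assuming the types match the library's callback typedef.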
Guus Sliepen [Wed, 29 Jul 2020 12:40:27 +0000 (14:40 +0200)]
Ensure the poll callback is called when a UDP channel has finished connecting.
Guus Sliepen [Thu, 23 Jul 2020 22:06:53 +0000 (00:06 +0200)]
Speed up the import-export test by using the PMTU callback.
Guus Sliepen [Thu, 23 Jul 2020 22:06:26 +0000 (00:06 +0200)]
Fix the import-export test failing sporadically.
Two nodes connecting simultaneously can cause a node to be seen as
temporarily being down, thus setting the last_unreachable status.
Guus Sliepen [Thu, 23 Jul 2020 22:04:38 +0000 (00:04 +0200)]
Always let the initiator send a REQ_KEY once a connection is activated.
Before, the logic was to do this when the graph reported a bidirectional
edge. However, if two nodes connected to each other simultaneously, a
second connection could be activated while the first was still active,
which caused the REQ_KEY not to be sent.
Guus Sliepen [Thu, 23 Jul 2020 21:13:57 +0000 (23:13 +0200)]
Fix the channels-udp test case.
Close channels gracefully after the server has finished sending the streams.
Guus Sliepen [Thu, 23 Jul 2020 14:19:29 +0000 (16:19 +0200)]
Combine blackbox join test cases into the invite-join test.
Guus Sliepen [Wed, 22 Jul 2020 21:34:06 +0000 (23:34 +0200)]
Don't attempt to sync confbase for ephemeral nodes during a join.
Guus Sliepen [Wed, 22 Jul 2020 19:07:02 +0000 (21:07 +0200)]
Port the blackbox status_cb test.
Use ephemeral MeshLink instances to speed up the test.
Guus Sliepen [Wed, 22 Jul 2020 18:36:18 +0000 (20:36 +0200)]
Port the blackbox meta-connections test using network namespaces.
Guus Sliepen [Wed, 22 Jul 2020 18:35:18 +0000 (20:35 +0200)]
When activating a meta-connection, remember the address of the peer.
There was a possible situation where we did not remember the address of a
peer when there was only a TCP connection.
Guus Sliepen [Wed, 22 Jul 2020 18:31:59 +0000 (20:31 +0200)]
Reset last_connect_try for all nodes when starting the mesh.
This helps the autoconnect algorithm reconnect faster if the mesh was
stopped and restarted in quick succession.
Guus Sliepen [Wed, 22 Jul 2020 18:30:14 +0000 (20:30 +0200)]
Add reset_sync_flag().
This is for the test suite to reset a flag without broadcasting that it
changed state.
Guus Sliepen [Wed, 22 Jul 2020 18:29:22 +0000 (20:29 +0200)]
Add devtool_set_meta_status_cb().
This function is similar to meshlink_set_node_status_cb(), except that
this callback will only be called when a meta-connection to a node is
activated or terminated. This is mainly useful for the test suite.
Guus Sliepen [Mon, 13 Jul 2020 20:40:06 +0000 (22:40 +0200)]
Don't store empty canonical addresses.
Use a NULL pointer instead of an empty string to signal the lack of a
canonical address.
Guus Sliepen [Wed, 22 Jul 2020 13:00:59 +0000 (15:00 +0200)]
Fix invitation URL generation when running in a network namespace.
MeshLink could call getifaddrs() in the namespace of the caller instead of
the MeshLink thread, causing the wrong addresses to be put in the invitation
URL.
Guus Sliepen [Wed, 22 Jul 2020 12:59:44 +0000 (14:59 +0200)]
Export missing meshlink_open_params_* symbols.
Guus Sliepen [Fri, 10 Jul 2020 20:15:39 +0000 (22:15 +0200)]
Fix a crash with some network configurations.
It is apparently possible for getifaddrs() to return a struct ifaddrs that
contains a NULL ifa_addr pointer.
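A minimal sketch of the defensive check this implies, using only the standard getifaddrs()/freeifaddrs() calls. The function name `count_addresses` is made up for illustration and is not part of MeshLink.

```c
#include <ifaddrs.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Count the interface addresses of the given family, or return -1 on error. */
int count_addresses(int family) {
    struct ifaddrs *list = NULL;

    if (getifaddrs(&list) != 0) {
        return -1;
    }

    int count = 0;

    for (struct ifaddrs *ifa = list; ifa; ifa = ifa->ifa_next) {
        if (!ifa->ifa_addr) {
            continue; /* entry without an address: must not be dereferenced */
        }

        if (ifa->ifa_addr->sa_family == family) {
            count++;
        }
    }

    freeifaddrs(list);
    return count;
}
```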
Guus Sliepen [Fri, 10 Jul 2020 20:13:23 +0000 (22:13 +0200)]
Add meshlink_reset_timers().
This resets the outgoing connection timers and ping timers, exactly what
is done when the network interface status changes.
Guus Sliepen [Tue, 7 Jul 2020 08:26:21 +0000 (10:26 +0200)]
Check for the presence of stdatomic.h before using it.
Guus Sliepen [Sat, 4 Jul 2020 11:59:52 +0000 (13:59 +0200)]
Don't use assert() to check the results of pthread_*() calls.
This was done to debug the code, but it fails when MeshLink is compiled with
-DNDEBUG. Remove all assert()s from calls to pthread functions, and instead
add explicit checks to only those functions that can fail.
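To illustrate the difference: with -DNDEBUG, the whole expression inside assert() is compiled out, so `assert(pthread_create(...) == 0)` never creates the thread at all. A sketch of an explicit check follows; `spawn_thread` and `return_arg` are illustrative names, not MeshLink functions.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread body used for illustration. */
void *return_arg(void *arg) {
    return arg;
}

/* The call is made regardless of NDEBUG, and failure is reported to the
 * caller instead of aborting the process. */
bool spawn_thread(pthread_t *thread, void *(*fn)(void *), void *arg) {
    int err = pthread_create(thread, NULL, fn, arg);

    if (err != 0) {
        /* pthread functions return the error number instead of setting errno. */
        fprintf(stderr, "pthread_create() failed: %s\n", strerror(err));
        return false;
    }

    return true;
}
```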
Guus Sliepen [Sat, 4 Jul 2020 10:26:15 +0000 (12:26 +0200)]
Export meshlink_set_dev_class_maxtimeout().
Guus Sliepen [Sat, 4 Jul 2020 10:26:00 +0000 (12:26 +0200)]
Ensure the maxtimeout value is taken from the destination node's device class.
Guus Sliepen [Wed, 24 Jun 2020 20:22:01 +0000 (22:22 +0200)]
Make the maximum outgoing connection timeout runtime configurable.
This moves maxtimeout to the dev_class_traits, and adds a function to
change it, similar to how we handle other configurable timeouts.
Guus Sliepen [Tue, 16 Jun 2020 19:34:52 +0000 (21:34 +0200)]
Fix compiling the C++ examples.
Guus Sliepen [Tue, 16 Jun 2020 18:07:12 +0000 (20:07 +0200)]
Revert "Fix warnings caused by C-only flags passed to the C++ compiler."
This reverts commit 9382f55b3f1c14c74e3bda229e277400743b11cc.
Guus Sliepen [Tue, 16 Jun 2020 18:06:59 +0000 (20:06 +0200)]
Revert "Don't overwrite CFLAGS when adding -std=c11."
This reverts commit 6d0820fda557de28c014b8ad4c40c385d23029a7.
Guus Sliepen [Mon, 15 Jun 2020 08:03:36 +0000 (10:03 +0200)]
Don't overwrite CFLAGS when adding -std=c11.
Guus Sliepen [Sun, 14 Jun 2020 12:45:18 +0000 (14:45 +0200)]
React faster to network changes, including point-to-point links.
Tell Catta to also include point-to-point links, and when we get an
update from the Catta thread, wake up the main MeshLink thread so we
react to it immediately.
Guus Sliepen [Sat, 13 Jun 2020 19:45:08 +0000 (21:45 +0200)]
Only call setitimer if ITIMER_REAL is defined.
Musl doesn't implement setitimer().
Guus Sliepen [Sat, 13 Jun 2020 19:39:39 +0000 (21:39 +0200)]
Set NOSIGPIPE on all sockets.
macOS can raise a SIGPIPE when a local socket gets disconnected.
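A sketch of how this can be done portably, assuming the option meant here is the SO_NOSIGPIPE socket option available on macOS and the BSDs. On platforms that do not define it (such as Linux, where MSG_NOSIGNAL on send() serves a similar purpose), the helper does nothing. The name `set_nosigpipe` is illustrative.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Ask the kernel not to raise SIGPIPE for this socket.
 * Returns 0 on success or when the option does not exist on this platform. */
int set_nosigpipe(int fd) {
#ifdef SO_NOSIGPIPE
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_NOSIGPIPE, &on, sizeof(on));
#else
    (void)fd;
    return 0;
#endif
}
```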
Guus Sliepen [Fri, 12 Jun 2020 22:41:47 +0000 (00:41 +0200)]
Fix warnings and missing mutex/cond initialization in the test suite.
Guus Sliepen [Fri, 12 Jun 2020 22:40:43 +0000 (00:40 +0200)]
Initialize the adns_done_queue.
Guus Sliepen [Fri, 12 Jun 2020 22:40:09 +0000 (00:40 +0200)]
Fix warnings caused by C-only flags passed to the C++ compiler.
Guus Sliepen [Fri, 12 Jun 2020 07:09:52 +0000 (09:09 +0200)]
Add missing initialization of a condition variable.
Guus Sliepen [Thu, 11 Jun 2020 20:31:25 +0000 (22:31 +0200)]
Fix some test cases using the same configuration directory.
Guus Sliepen [Thu, 11 Jun 2020 19:52:00 +0000 (21:52 +0200)]
Use atomic operations to check whether to write to the signal pipe.
We need to do an atomic test-and-set operation to check whether we can
avoid writing to the signal pipe. Use C11 atomics to do this in a portable
way (hopefully).
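A sketch of the idea using C11 `atomic_flag`, assuming a self-pipe used to wake up an event loop. The `signal_pipe_t` type and function names are hypothetical; the real MeshLink structures differ.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

typedef struct {
    int pipefd[2];       /* [0] is the read end, [1] the write end */
    atomic_flag pending; /* set while a wakeup byte is sitting in the pipe */
} signal_pipe_t;

bool signal_pipe_init(signal_pipe_t *sp) {
    atomic_flag_clear(&sp->pending);
    return pipe(sp->pipefd) == 0;
}

/* Wake up the reader. Returns true only if a byte was actually written. */
bool signal_pipe_notify(signal_pipe_t *sp) {
    /* Atomically set the flag and learn its previous value in one step. */
    if (atomic_flag_test_and_set(&sp->pending)) {
        return false; /* a wakeup is already pending, skip the write */
    }

    char byte = 0;
    return write(sp->pipefd[1], &byte, 1) == 1;
}

/* Reader side: take the byte out and allow new wakeups. The caller should
 * process its queued work after this returns. */
bool signal_pipe_consume(signal_pipe_t *sp) {
    char byte;

    if (read(sp->pipefd[0], &byte, 1) != 1) {
        return false;
    }

    atomic_flag_clear(&sp->pending);
    return true;
}
```

Doing the test and the set as separate steps would let two threads both see "not pending" and both write, which is what the atomic operation avoids.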
Guus Sliepen [Thu, 11 Jun 2020 20:22:37 +0000 (22:22 +0200)]
Remove gettimeofday() usage from test cases.
Guus Sliepen [Thu, 11 Jun 2020 20:17:23 +0000 (22:17 +0200)]
Add assert()s to all pthread-related function calls.
We normally expect all pthread-related functions to succeed, so in all
places where we didn't already explicitly check the return value, assert()
that the functions return 0.
Guus Sliepen [Wed, 10 Jun 2020 20:25:12 +0000 (22:25 +0200)]
Properly initialize mutexes and condition variables.
On Linux, zeroing a pthread_mutex_t or pthread_cond_t variable ensures
the mutex/cond is properly initialized, however this is not the case on
some other platforms. Ensure we always call pthread_mutex/cond_init().
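A minimal sketch of explicit initialization for a struct that embeds a mutex and a condition variable, rather than relying on zeroed memory. The `waiter_t` type is a made-up example.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    bool ready;
} waiter_t;

/* Initialize both primitives explicitly; zero-filling is not portable. */
bool waiter_init(waiter_t *w) {
    if (pthread_mutex_init(&w->mutex, NULL) != 0) {
        return false;
    }

    if (pthread_cond_init(&w->cond, NULL) != 0) {
        pthread_mutex_destroy(&w->mutex);
        return false;
    }

    w->ready = false;
    return true;
}

void waiter_destroy(waiter_t *w) {
    pthread_cond_destroy(&w->cond);
    pthread_mutex_destroy(&w->mutex);
}
```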
Guus Sliepen [Tue, 9 Jun 2020 21:18:04 +0000 (23:18 +0200)]
Fix compiler warnings on macOS.
Guus Sliepen [Tue, 9 Jun 2020 21:12:44 +0000 (23:12 +0200)]
Check whether IP_MTU is defined.
Guus Sliepen [Fri, 5 Jun 2020 16:09:38 +0000 (18:09 +0200)]
Fix meshlink_join() failing on Android.
The adns_blocking_request() function did not pass a hint to
getaddrinfo(). With glibc, the resulting struct addrinfo sets socktype
and protocol to SOCK_STREAM and IPPROTO_TCP, and the call to connect()
copied these values. However, bionic doesn't set the socktype and
protocol to those values if no hint was specified.
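A sketch of passing an explicit hint so the result does not depend on libc defaults. This is not the adns_blocking_request() code itself; `resolve_tcp` is an illustrative helper.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve host and port for a TCP connection.
 * Returns NULL on failure; the caller frees the result with freeaddrinfo(). */
struct addrinfo *resolve_tcp(const char *host, const char *port) {
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* state explicitly what we want */
    hints.ai_protocol = IPPROTO_TCP;

    struct addrinfo *result = NULL;

    if (getaddrinfo(host, port, &hints, &result) != 0) {
        return NULL;
    }

    return result;
}
```

With the hint in place, `ai_socktype` and `ai_protocol` in the result can be passed straight to socket() on any libc.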
Guus Sliepen [Fri, 5 Jun 2020 07:40:26 +0000 (09:40 +0200)]
Ensure utcp-test is compiled with the same flags as libmeshlink.
Guus Sliepen [Wed, 3 Jun 2020 19:23:33 +0000 (21:23 +0200)]
Include "system.h" in the UTCP sources.
The UTCP sources were not using the system.h header file like the rest
of the code, resulting in issues with some compilers.
Guus Sliepen [Wed, 3 Jun 2020 19:19:09 +0000 (21:19 +0200)]
Don't crash when trying to connect a channel to port 0.
Guus Sliepen [Thu, 21 May 2020 12:48:02 +0000 (14:48 +0200)]
Explicitly set the stack size for the MeshLink thread.
Different libcs have different default sizes for newly created threads. In
particular, Musl defaults to 80 kB, which is too small for MeshLink. We now
request 1 MB, which should be more than enough to handle the deepest call
stacks.
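A sketch of requesting a fixed stack size through thread attributes. The 1 MB figure comes from the message above; the helper names are illustrative only.

```c
#include <pthread.h>
#include <stddef.h>

#define THREAD_STACK_SIZE (1024 * 1024) /* 1 MB, as in the commit message */

/* Trivial thread body used for illustration. */
void *echo_arg(void *arg) {
    return arg;
}

/* Start fn(arg) on a thread with an explicit stack size.
 * Returns 0 on success or an error number from the pthread calls. */
int start_thread_with_stack(pthread_t *thread, void *(*fn)(void *), void *arg) {
    pthread_attr_t attr;
    int err = pthread_attr_init(&attr);

    if (err != 0) {
        return err;
    }

    err = pthread_attr_setstacksize(&attr, THREAD_STACK_SIZE);

    if (err == 0) {
        err = pthread_create(thread, &attr, fn, arg);
    }

    /* The attribute object is no longer needed once the thread exists. */
    pthread_attr_destroy(&attr);
    return err;
}
```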
Guus Sliepen [Wed, 20 May 2020 22:13:50 +0000 (00:13 +0200)]
Limit the size of the fd read buffer in channel_poll().
Guus Sliepen [Mon, 18 May 2020 21:25:50 +0000 (23:25 +0200)]
Add a test for AIO callback corner cases.
Guus Sliepen [Mon, 18 May 2020 21:24:56 +0000 (23:24 +0200)]
Fix assert that could incorrectly be triggered when a peer closed the channel.
If we are sending AIO data on a channel and the peer closed the connection,
we could trigger an assert incorrectly.
Guus Sliepen [Mon, 18 May 2020 21:23:32 +0000 (23:23 +0200)]
Report the amount of actual data sent/received in AIO callbacks.
We did this for fds but not for memory buffers.
Guus Sliepen [Mon, 18 May 2020 21:22:06 +0000 (23:22 +0200)]
Reset closed connections if there is incoming data.
Guus Sliepen [Fri, 15 May 2020 21:12:34 +0000 (23:12 +0200)]
Include our own key in REQ_PUBKEY requests.
If we don't know a peer's public key, it most likely means the peer
doesn't know our public key, so proactively send it along with the
REQ_PUBKEY request.
Guus Sliepen [Fri, 15 May 2020 07:39:59 +0000 (09:39 +0200)]
Handle UTCP receive buffer wraparound corner cases.
Before, we allowed buf->offset to be equal to buf->size. This caused an
issue where buffer_call() would call the callback twice, once for 0
bytes at the end of the buffer, and once for len bytes at the start of
the buffer. This would cause the callback function to think the channel
had encountered an error.
If the data in the ringbuffer wraps around, and we call the receive
callback for the first part of the data, the callback function might
close the channel, so we must not call the callback for the second part
of the data.
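A simplified sketch of both rules, not the actual UTCP buffer_call(): the invariant `offset < size` guarantees the first part is never empty, and the second part is skipped when the callback reports that it closed the channel. The `ringbuf_t` layout, the bool-returning callback convention and the `sink_t` helper are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    char *data;
    size_t offset; /* start of the used data; invariant: offset < size */
    size_t used;   /* number of bytes stored */
    size_t size;   /* capacity of data */
} ringbuf_t;

/* Hypothetical convention: the callback returns false if it closed the channel. */
typedef bool (*recv_cb_t)(void *ctx, const char *data, size_t len);

/* Deliver up to len stored bytes. Returns the number of bytes delivered. */
size_t ringbuf_call(const ringbuf_t *buf, recv_cb_t cb, void *ctx, size_t len) {
    if (len > buf->used) {
        len = buf->used;
    }

    if (len == 0) {
        return 0;
    }

    /* Contiguous bytes up to the end of the storage; > 0 since offset < size. */
    size_t first = buf->size - buf->offset;

    if (first >= len) {
        cb(ctx, buf->data + buf->offset, len); /* no wraparound */
        return len;
    }

    if (!cb(ctx, buf->data + buf->offset, first)) {
        return first; /* channel was closed: do not deliver the second part */
    }

    cb(ctx, buf->data, len - first);
    return len;
}

/* Example receiver that records what it was given. */
typedef struct {
    char out[16];
    size_t n;
    int calls;
    bool close_after_first;
} sink_t;

bool sink_cb(void *ctx, const char *data, size_t len) {
    sink_t *s = ctx;

    if (len > sizeof(s->out) - s->n) {
        return false; /* does not fit: treat as closed */
    }

    memcpy(s->out + s->n, data, len);
    s->n += len;
    s->calls++;
    return !s->close_after_first;
}
```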
Guus Sliepen [Wed, 13 May 2020 20:50:26 +0000 (22:50 +0200)]
Fix mismatch in supported compile flags tested for and added to CPPFLAGS.
Guus Sliepen [Wed, 13 May 2020 20:48:02 +0000 (22:48 +0200)]
Fix more compiler warnings.
This enables many more compiler warnings by default, and fixes all the
warnings we get from GCC 10 and Clang 10 in the library.
The blackbox test cases still produce a lot of warnings that have not
been fixed yet.
Guus Sliepen [Mon, 11 May 2020 20:04:22 +0000 (22:04 +0200)]
Fix make distcheck.
Guus Sliepen [Mon, 11 May 2020 19:34:50 +0000 (21:34 +0200)]
Add Doxygen support to the build system.
Guus Sliepen [Mon, 11 May 2020 17:52:00 +0000 (19:52 +0200)]
Move UTCP into the MeshLink repository.
UTCP is not used outside of MeshLink at the moment, and there is a tight
coupling between the two, so it makes more sense to have it as part of
MeshLink itself.
Guus Sliepen [Sat, 9 May 2020 20:41:23 +0000 (22:41 +0200)]
Handle EINTR when reading/writing to AIO fds.
Guus Sliepen [Fri, 8 May 2020 10:48:44 +0000 (12:48 +0200)]
Handle meshlink_channel_close() being called in callbacks.
When it's called in a callback, we can't free the channel until the
function that called the callback has a chance to safely complete. This
is not a problem for regular receive and poll callbacks, but it is for AIO,
where there can be multiple outstanding AIO buffers that each need their
callback called to signal completion, and each of them could potentially
call meshlink_channel_close().
This also ensures that when the channel is explicitly closed by the
application, it will not receive any further callbacks.
Guus Sliepen [Mon, 4 May 2020 06:42:22 +0000 (08:42 +0200)]
Update UTCP to fix retransmission of SYNACK packets.
Guus Sliepen [Tue, 28 Apr 2020 21:19:29 +0000 (23:19 +0200)]
Avoid a segfault when setting a timeout to 0.
The event loop was assuming that a timespec value of {0, 0} meant that the
timer was not added to the timer tree. However, it was possible for other
parts of the code to set the value to {0, 0}, which could result in a
segmentation fault. Use the splay_node_t data pointer to check whether a
timeout is linked into the tree instead.
Guus Sliepen [Tue, 28 Apr 2020 20:21:34 +0000 (22:21 +0200)]
Several fixes for channel AIO send and receive functions.
- Process multiple buffers if possible
- Better handling of error conditions
- fd errors now cancel the AIO buffer
- channel errors cancel all outstanding AIO buffers
- Don't call the poll callback with a length larger than the remaining
UTCP send buffer.
Guus Sliepen [Tue, 28 Apr 2020 18:20:45 +0000 (20:20 +0200)]
Fix a potential read from a freed buffer when sending data to a blacklisted node.
Guus Sliepen [Tue, 21 Apr 2020 21:33:47 +0000 (23:33 +0200)]
Avoid a crash when graph() is called when the event loop is not running.
Guus Sliepen [Sun, 19 Apr 2020 14:28:43 +0000 (16:28 +0200)]
Make UTCP retransmissions trigger PMTU probes immediately.
If there are network problems while data is being transferred over a
channel, we want to react to this as soon as possible. Set the retransmission
callback to trigger the next PMTU probe immediately if there is none in
progress.
Guus Sliepen [Thu, 16 Apr 2020 21:56:36 +0000 (23:56 +0200)]
Use nanosleep() instead of clock_nanosleep().
Mac OS X does not support the latter.
Guus Sliepen [Thu, 16 Apr 2020 00:12:33 +0000 (02:12 +0200)]
Update UTCP to fix a potential segmentation fault.
Guus Sliepen [Thu, 16 Apr 2020 00:08:38 +0000 (02:08 +0200)]
Ensure exported host files have a port in the canonical address.
Guus Sliepen [Sat, 11 Apr 2020 22:54:58 +0000 (00:54 +0200)]
Update UTCP.
Guus Sliepen [Sat, 11 Apr 2020 22:14:23 +0000 (00:14 +0200)]
Check that we can send up to 65536 bytes at a time on UDP-style channels.
Guus Sliepen [Sat, 11 Apr 2020 22:06:13 +0000 (00:06 +0200)]
Have nodes remember in which submesh they live.
This exposes the submesh a node lives in to the node itself.
Guus Sliepen [Sat, 11 Apr 2020 16:15:50 +0000 (18:15 +0200)]
Allow meshlink_open() to be called with a NULL name.
This will use the name used last time the MeshLink instance was initialized.
If there is no initialized instance at the given confbase, it will return
an error.
Opening an instance with a different name than the one in the configuration
files will now also result in an error.
Guus Sliepen [Sat, 11 Apr 2020 15:43:18 +0000 (17:43 +0200)]
When resetting timers that use CLOCK_MONOTONIC, use a negative value.
CLOCK_MONOTONIC might be implemented as the time since the CPU booted, so
if MeshLink starts soon after booting, setting timers to "0" might not
actually be far enough in the past to trigger a timeout.
This has almost no effect in practice, since most timeouts are a minute or
less, but it might affect running tests in virtual machines.
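A small sketch of why "0" can fail as a reset value. It compares whole seconds only, and the reset value of -3600 seconds is a hypothetical choice for illustration; the value MeshLink actually uses is not stated here.

```c
#include <stdbool.h>
#include <time.h>

#define TIMER_RESET_SEC (-3600) /* hypothetical: one hour before clock start */

/* A "long ago" timestamp that precedes any CLOCK_MONOTONIC reading. */
struct timespec timer_reset_value(void) {
    struct timespec ts = { .tv_sec = TIMER_RESET_SEC, .tv_nsec = 0 };
    return ts;
}

/* Has at least `timeout` seconds passed between last and now? */
bool timed_out(const struct timespec *last, const struct timespec *now, time_t timeout) {
    return now->tv_sec - last->tv_sec >= timeout;
}
```

If the clock reads 30 seconds since boot and the timeout is 60 seconds, a timestamp reset to 0 is only 30 seconds old and does not expire, while the negative reset value does.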
Guus Sliepen [Sat, 11 Apr 2020 15:27:39 +0000 (17:27 +0200)]
Add a probe point for SPTPS renewal and devtool_force_sptps_renewal().
The latter function just sets the timers so they should time out immediately.
Guus Sliepen [Sat, 11 Apr 2020 15:25:53 +0000 (17:25 +0200)]
Also renew SPTPS keys for meta-connections.
Guus Sliepen [Sat, 11 Apr 2020 14:33:22 +0000 (16:33 +0200)]
Add a probe point for async DNS resolving.
sairoop-elear [Fri, 10 Apr 2020 21:58:51 +0000 (03:28 +0530)]
Add UTCP UDP channel corner cases and test cases on channel get MSS length API
sairoop-elear [Fri, 10 Apr 2020 11:45:22 +0000 (17:15 +0530)]
Add channel poll callback corner cases
sairoop-elear [Tue, 7 Apr 2020 20:22:26 +0000 (01:52 +0530)]
Add missing atomic test cases to the APIs that affect disk writes
Guus Sliepen [Thu, 9 Apr 2020 22:06:15 +0000 (00:06 +0200)]
Use blocking ADNS requests for most other hostname resolving.
We keep calling getaddrinfo() directly if we know we only need to resolve
a numerical address.
Guus Sliepen [Thu, 9 Apr 2020 21:39:19 +0000 (23:39 +0200)]
Add "blocking" asynchronous DNS requests.
These block for a limited amount of time, preventing lookups from taking
too long. Because these requests can be done without the main MeshLink
thread running, we don't use the request queue, but instead spawn a
thread for each blocking request.
Guus Sliepen [Tue, 7 Apr 2020 22:50:51 +0000 (00:50 +0200)]
Add asynchronous DNS lookups for outgoing connections.
Guus Sliepen [Tue, 7 Apr 2020 21:17:38 +0000 (23:17 +0200)]
Remove unused support for proxies.