Guus Sliepen [Fri, 8 May 2020 10:48:44 +0000 (12:48 +0200)]
Handle meshlink_channel_close() being called in callbacks.
When it's called in a callback, we can't free the channel until the
function that called the callback has a chance to safely complete. This
is not a problem for regular receive and poll callbacks, but it is for AIO,
where there can be multiple outstanding AIO buffers that each need their
callback called to signal completion, and each of them could potentially
call meshlink_channel_close().
This also ensures that when the channel is explicitly closed by the
application, it will not receive any further callbacks.
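The deferred close described above can be sketched as follows. This is a minimal illustration, not MeshLink's actual code: the type sketch_channel_t, its fields, and the functions channel_close(), channel_release() and complete_aio_buffers() are all hypothetical names, and stopping the loop once a close is requested is an assumption about how "no further callbacks" might be enforced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a channel with outstanding AIO buffers. */
typedef struct sketch_channel sketch_channel_t;
typedef void (*aio_cb_t)(sketch_channel_t *c, size_t index);

struct sketch_channel {
	bool in_callback;      /* true while a completion loop is running */
	bool close_requested;  /* the application asked for the channel to be closed */
	bool released;         /* stands in for actually freeing the channel */
	size_t callbacks_run;
	aio_cb_t aio_cb;
};

static void channel_release(sketch_channel_t *c) {
	c->released = true; /* a real implementation would free resources here */
}

/* Safe to call from inside a callback: the release is deferred in that case. */
static void channel_close(sketch_channel_t *c) {
	c->close_requested = true;

	if(!c->in_callback) {
		channel_release(c);
	}
}

/* Signal completion of up to `count` buffers, stopping once a close is requested. */
static void complete_aio_buffers(sketch_channel_t *c, size_t count) {
	assert(c && !c->released);
	c->in_callback = true;

	for(size_t i = 0; i < count && !c->close_requested; i++) {
		c->callbacks_run++;

		if(c->aio_cb) {
			c->aio_cb(c, i);
		}
	}

	c->in_callback = false;

	/* Only now is it safe to release the channel. */
	if(c->close_requested) {
		channel_release(c);
	}
}

/* Example callback that closes the channel as soon as one buffer completes. */
static void close_on_first(sketch_channel_t *c, size_t index) {
	(void)index;
	channel_close(c);
}
```

With close_on_first installed and three buffers outstanding, only the first callback runs in this sketch, and the channel is released after the loop returns rather than inside the callback.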
Guus Sliepen [Mon, 4 May 2020 06:42:22 +0000 (08:42 +0200)]
Update UTCP to fix retransmission of SYNACK packets.
Guus Sliepen [Tue, 28 Apr 2020 21:19:29 +0000 (23:19 +0200)]
Avoid a segfault when setting a timeout to 0.
The event loop was assuming that a timespec value of {0, 0} meant that the
timer was not added to the timer tree. However, it was possible for other
parts of the code to set the value to {0, 0}, which could result in a
segmentation fault. Use the splay_node_t data pointer to check whether a
timeout is linked into the tree instead.
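The idea of checking the tree node rather than the time value can be sketched like this. The types below are simplified, hypothetical stand-ins (the real splay_node_t has more fields), and timer_link()/timer_unlink() only model the bookkeeping, not the tree operations themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a tree node: data is non-NULL only while linked. */
typedef struct sketch_node {
	void *data;
} sketch_node_t;

typedef struct sketch_timer {
	struct timespec tv;  /* may legitimately be {0, 0} */
	sketch_node_t node;
} sketch_timer_t;

/* Linkage is decided by the node's data pointer, never by the value of tv. */
static bool timer_is_linked(const sketch_timer_t *t) {
	assert(t);
	return t->node.data != NULL;
}

static void timer_link(sketch_timer_t *t, struct timespec when) {
	t->tv = when;
	t->node.data = t; /* a real implementation would also insert into the tree */
}

static void timer_unlink(sketch_timer_t *t) {
	t->node.data = NULL; /* a real implementation would also remove from the tree */
}
```

A timer linked with a time of {0, 0} is still reported as linked here, which is the case the old check got wrong.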
Guus Sliepen [Tue, 28 Apr 2020 20:21:34 +0000 (22:21 +0200)]
Several fixes for channel AIO send and receive functions.
- Process multiple buffers if possible
- Better handling of error conditions
- fd errors now cancel the AIO buffer
- channel errors cancel all outstanding AIO buffers
- Don't call the poll callback with a length larger than the remaining
UTCP send buffer.
Guus Sliepen [Tue, 28 Apr 2020 18:20:45 +0000 (20:20 +0200)]
Fix a potential read from a freed buffer when sending data to a blacklisted node.
Guus Sliepen [Tue, 21 Apr 2020 21:33:47 +0000 (23:33 +0200)]
Avoid a crash when graph() is called when the event loop is not running.
Guus Sliepen [Sun, 19 Apr 2020 14:28:43 +0000 (16:28 +0200)]
Make UTCP retransmissions trigger PMTU probes immediately.
If there are network problems while data is being transferred over a
channel, we want to react to this as soon as possible. Set the retransmission
callback to trigger the next PMTU probe immediately if there is none in
progress.
Guus Sliepen [Thu, 16 Apr 2020 21:56:36 +0000 (23:56 +0200)]
Use nanosleep() instead of clock_nanosleep().
Mac OS X does not support the latter.
Guus Sliepen [Thu, 16 Apr 2020 00:12:33 +0000 (02:12 +0200)]
Update UTCP to fix a potential segmentation fault.
Guus Sliepen [Thu, 16 Apr 2020 00:08:38 +0000 (02:08 +0200)]
Ensure exported host files have a port in the canonical address.
Guus Sliepen [Sat, 11 Apr 2020 22:54:58 +0000 (00:54 +0200)]
Update UTCP.
Guus Sliepen [Sat, 11 Apr 2020 22:14:23 +0000 (00:14 +0200)]
Check that we can send up to 65536 bytes at a time on UDP-style channels.
Guus Sliepen [Sat, 11 Apr 2020 22:06:13 +0000 (00:06 +0200)]
Have nodes remember in which submesh they live.
This exposes the submesh a node lives in to the node itself.
Guus Sliepen [Sat, 11 Apr 2020 16:15:50 +0000 (18:15 +0200)]
Allow meshlink_open() to be called with a NULL name.
This will use the name used last time the MeshLink instance was initialized.
If there is no initialized instance at the given confbase, it will return
an error.
Opening an instance with a different name than the one in the configuration
files will now also result in an error.
Guus Sliepen [Sat, 11 Apr 2020 15:43:18 +0000 (17:43 +0200)]
When resetting timers that use CLOCK_MONOTONIC, use a negative value.
CLOCK_MONOTONIC might be implemented as the time since the CPU booted, so
if MeshLink starts soon after booting, setting timers to "0" might not
actually be far enough in the past to trigger a timeout.
This has almost no effect in practice, since most timeouts are a minute or
less, but it might affect running tests in virtual machines.
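A small worked example of why "0" may not be far enough in the past. The helper timed_out() and the reset value of -3600 seconds are assumptions made for illustration; the commit only says that a negative value is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical reset value: one hour before the clock's zero point. */
#define SKETCH_RESET_SEC (-3600)

/* True if at least timeout_sec seconds have passed between last and now. */
static bool timed_out(struct timespec last, struct timespec now, time_t timeout_sec) {
	assert(timeout_sec >= 0);
	time_t due = last.tv_sec + timeout_sec;
	return now.tv_sec > due || (now.tv_sec == due && now.tv_nsec >= last.tv_nsec);
}
```

If CLOCK_MONOTONIC reads 30 seconds (the machine booted half a minute ago) and the timeout is 60 seconds, a timer reset to 0 is not yet due, while one reset to SKETCH_RESET_SEC is.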
Guus Sliepen [Sat, 11 Apr 2020 15:27:39 +0000 (17:27 +0200)]
Add a probe point for SPTPS renewal and devtool_force_sptps_renewal().
The latter function just sets the timers so they should time out immediately.
Guus Sliepen [Sat, 11 Apr 2020 15:25:53 +0000 (17:25 +0200)]
Also renew SPTPS keys for meta-connections.
Guus Sliepen [Sat, 11 Apr 2020 14:33:22 +0000 (16:33 +0200)]
Add a probe point for async DNS resolving.
sairoop-elear [Fri, 10 Apr 2020 21:58:51 +0000 (03:28 +0530)]
Add UTCP UDP channel corner cases and test cases on channel get MSS length API
sairoop-elear [Fri, 10 Apr 2020 11:45:22 +0000 (17:15 +0530)]
Add channel poll callback corner cases
sairoop-elear [Tue, 7 Apr 2020 20:22:26 +0000 (01:52 +0530)]
Add missing atomic test cases to the APIs that affect disk writes
Guus Sliepen [Thu, 9 Apr 2020 22:06:15 +0000 (00:06 +0200)]
Use blocking ADNS requests for most other hostname resolving.
We keep calling getaddrinfo() directly if we know we only need to resolve
a numerical address.
Guus Sliepen [Thu, 9 Apr 2020 21:39:19 +0000 (23:39 +0200)]
Add "blocking" asynchronous DNS requests.
These block for a limited amount of time, preventing lookups from taking
too long. Because these requests can be done without the main MeshLink
thread running, we don't use the request queue, but instead spawn a
thread for each blocking request.
Guus Sliepen [Tue, 7 Apr 2020 22:50:51 +0000 (00:50 +0200)]
Add asynchronous DNS lookups for outgoing connections.
Guus Sliepen [Tue, 7 Apr 2020 21:17:38 +0000 (23:17 +0200)]
Remove unused support for proxies.
Guus Sliepen [Thu, 23 May 2019 21:02:43 +0000 (23:02 +0200)]
Add an asynchronous DNS thread.
Add a thread dedicated to making DNS lookups. There are two queues, one
for pending DNS requests and one for done DNS requests. The async DNS
thread reads from the pending request queue, checks for each request if
the deadline has not been met yet, and if so calls getaddrinfo(). Once
the result is obtained, it adds that to the done request queue and
signals the main meshlink thread, which will then call the callback
function associated with the DNS request.
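The per-request step of such a worker might look roughly like the sketch below. It is single-threaded on purpose: the locking around the queues and the signal to the main thread are left out, and dns_request_t, dns_queue_t and process_one() are hypothetical names rather than MeshLink's own.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <netdb.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical request record; the callback pointer is omitted for brevity. */
typedef struct dns_request {
	const char *host;
	const char *serv;
	time_t deadline;          /* requests older than this are not looked up */
	bool expired;
	int err;                  /* return value of getaddrinfo() */
	struct addrinfo *result;  /* owned by whoever consumes the done queue */
	struct dns_request *next;
} dns_request_t;

typedef struct {
	dns_request_t *head, *tail;
} dns_queue_t;

static void queue_push(dns_queue_t *q, dns_request_t *r) {
	r->next = NULL;

	if(q->tail) {
		q->tail->next = r;
	} else {
		q->head = r;
	}

	q->tail = r;
}

static dns_request_t *queue_pop(dns_queue_t *q) {
	dns_request_t *r = q->head;

	if(r) {
		q->head = r->next;

		if(!q->head) {
			q->tail = NULL;
		}
	}

	return r;
}

/* Take one pending request, resolve it if its deadline has not passed, and
 * move it to the done queue. Returns false if nothing was pending. */
static bool process_one(dns_queue_t *pending, dns_queue_t *done, time_t now) {
	dns_request_t *r = queue_pop(pending);

	if(!r) {
		return false;
	}

	r->result = NULL;
	r->expired = now > r->deadline;

	if(!r->expired) {
		struct addrinfo hints = {0};
		hints.ai_family = AF_UNSPEC;
		hints.ai_socktype = SOCK_STREAM;
		r->err = getaddrinfo(r->host, r->serv, &hints, &r->result);
	}

	queue_push(done, r);
	return true;
}
```

In a threaded version, both queues would be protected by a mutex, and the main thread would be woken after each push to the done queue so it can run the request's callback and call freeaddrinfo() on the result.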
Guus Sliepen [Thu, 23 May 2019 20:12:04 +0000 (22:12 +0200)]
Assume getaddrinfo() and IPv6 are supported.
All major operating systems of the last 10 years have supported IPv6 and
provide getaddrinfo().
Guus Sliepen [Tue, 7 Apr 2020 22:46:26 +0000 (00:46 +0200)]
Fix a debug message being logged incorrectly.
Guus Sliepen [Mon, 6 Apr 2020 06:39:33 +0000 (08:39 +0200)]
Update UTCP to fix retransmit timeout calculation.
Guus Sliepen [Sun, 5 Apr 2020 23:59:28 +0000 (01:59 +0200)]
Update UTCP to fix RTT measurements.
Guus Sliepen [Thu, 2 Apr 2020 22:23:11 +0000 (00:23 +0200)]
Update UTCP to support fragmenting packets on UDP style channels.
This allows the application to send packets of arbitrary size (up to 64 kiB)
without worrying about the path MTU to the destination node, which might
vary, especially at the start of a connection.
If the application doesn't want packets to fragment, it should use
meshlink_channel_get_mss() to query the maximum size for unfragmented
packets.
Roop [Thu, 13 Feb 2020 11:18:43 +0000 (16:48 +0530)]
Updated test vectors for get node reachability
Roop [Tue, 4 Feb 2020 10:46:29 +0000 (16:16 +0530)]
Add meshlink_get_all_nodes_by_last_reachable API, meshlink_get_node_reachability API and its test vectors
MeshLink now keeps track of when a node was last reachable. This can be
used by an application to detect nodes that were never reachable or which
have not been reachable for a certain amount of time.
Guus Sliepen [Thu, 2 Apr 2020 18:31:48 +0000 (20:31 +0200)]
Update UTCP to fix a compile error.
Guus Sliepen [Tue, 31 Mar 2020 09:22:35 +0000 (11:22 +0200)]
Allow setting the UTCP clock granularity.
This sets the granularity to 10 ms by default, and adds the function
meshlink_set_scheduling_granularity() to be able to change it.
Guus Sliepen [Mon, 30 Mar 2020 06:21:24 +0000 (08:21 +0200)]
Fix key renewal being called too often after the first renewal.
Guus Sliepen [Sun, 29 Mar 2020 22:42:32 +0000 (00:42 +0200)]
Try addresses found by Catta for UDP probes.
Guus Sliepen [Sun, 29 Mar 2020 22:24:30 +0000 (00:24 +0200)]
Renew SPTPS keys every hour.
We did do this in the past, but in some commit we stopped automatically
renewing keys every hour.
Guus Sliepen [Sun, 29 Mar 2020 22:04:29 +0000 (00:04 +0200)]
Avoid allocating packet buffers unnecessarily.
Unless we have to queue a packet, we can avoid allocating and freeing
memory by keeping a permanently allocated packet buffer around.
Guus Sliepen [Sun, 29 Mar 2020 17:33:18 +0000 (19:33 +0200)]
Propagate the discovered PMTU between nodes to UTCP.
Guus Sliepen [Sun, 29 Mar 2020 19:52:24 +0000 (21:52 +0200)]
Update UTCP and replace gettimeofday() with clock_gettime().
Guus Sliepen [Fri, 27 Mar 2020 22:01:26 +0000 (23:01 +0100)]
Send out channel data immediately, bypassing the packet queue.
Guus Sliepen [Fri, 27 Mar 2020 21:52:46 +0000 (22:52 +0100)]
Reduce how often we have to poll the packet queue.
Packets are moved to the MeshLink thread via the packet queue. However,
each packet required a trigger byte to be sent to the event loop, requiring
more calls to select() than necessary. Now we make event loop signals level
triggered, and dequeue all enqueued packets at once.
This also adds debug log statements for the packet queue.
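The effect of signalling once and draining everything can be sketched with a small in-memory queue. The names packet_queue_t, enqueue() and drain() are hypothetical, packets are plain ints, and the wakeups counter stands in for trigger bytes written to the event loop; thread synchronization is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 16

typedef struct {
	int items[QUEUE_CAP];
	size_t head, count;
	bool signalled;  /* stays raised until the event loop handles it */
	size_t wakeups;  /* how many times the event loop had to be woken */
} packet_queue_t;

/* Add a packet; wake the event loop only if it has not been signalled yet. */
static bool enqueue(packet_queue_t *q, int packet) {
	if(q->count == QUEUE_CAP) {
		return false;
	}

	size_t tail = (q->head + q->count) % QUEUE_CAP;
	q->items[tail] = packet;
	q->count++;

	if(!q->signalled) {
		q->signalled = true;
		q->wakeups++;
	}

	return true;
}

/* Event loop side: clear the signal, then handle every queued packet. */
static size_t drain(packet_queue_t *q, void (*handle)(int packet)) {
	size_t handled = 0;
	q->signalled = false;

	while(q->count) {
		int packet = q->items[q->head];
		q->head = (q->head + 1) % QUEUE_CAP;
		q->count--;

		if(handle) {
			handle(packet);
		}

		handled++;
	}

	return handled;
}
```

Three packets enqueued back to back cost one wakeup in this sketch, and a single drain() call handles all three.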
Guus Sliepen [Sun, 15 Mar 2020 15:33:46 +0000 (16:33 +0100)]
Remove redundant call to add_local_addresses().
Guus Sliepen [Sun, 15 Mar 2020 15:24:34 +0000 (16:24 +0100)]
Have try_bind() reuse the setup_*_listen_socket() functions.
This ensures try_bind() configures sockets exactly the same as the actual
listening sockets.
Guus Sliepen [Sun, 15 Mar 2020 15:08:44 +0000 (16:08 +0100)]
Fix the order of socket operations when setting up listening sockets.
Commit db8e6e4 caused the setsockopt() calls that set up socket parameters
to be made after the call to bind().
Guus Sliepen [Thu, 12 Mar 2020 20:32:41 +0000 (21:32 +0100)]
Use slashes internally to separate hostnames and ports in invitation addresses.
This avoids possible confusion when an IPv6 hostname is used without an
explicit port number.
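To illustrate why a slash is less ambiguous than a colon here, the following sketch splits a "host/port" string in place. The function name split_host_port() is hypothetical and the exact internal format MeshLink uses may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split "host/port" in place. Returns a pointer to the port part, or NULL if
 * the address carries no explicit port. The host part stays in `address`. */
static const char *split_host_port(char *address) {
	assert(address);
	char *slash = strrchr(address, '/');

	if(!slash) {
		return NULL;
	}

	*slash = '\0';
	return slash + 1;
}
```

With a colon as separator, "2001:db8::1" could be read as host "2001:db8:" with port "1". With a slash, "2001:db8::1/655" and "2001:db8::1" are both unambiguous.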
sairoop-elear [Thu, 12 Mar 2020 08:24:45 +0000 (13:54 +0530)]
Fix IPv6 address validation bug preventing IPv6 addresses that are added for invitation from getting added
Guus Sliepen [Wed, 11 Mar 2020 22:12:08 +0000 (23:12 +0100)]
Correctly remove all duplicates when having many hostnames in an invitation URL.
Commit fbcf089 missed adjusting one call to remove_duplicate_hostnames().
Guus Sliepen [Tue, 10 Mar 2020 21:54:45 +0000 (22:54 +0100)]
Fix some log messages being reported for the wrong log levels.
Guus Sliepen [Tue, 10 Mar 2020 21:42:33 +0000 (22:42 +0100)]
Add all recent addresses resolved from a hostname in meshlink_invite().
When a canonical hostname or an invitation address resolves to multiple
numeric addresses, add all of them as recent addresses for ourself, so
they are all part of the host config file we send to the invitee.
Guus Sliepen [Tue, 10 Mar 2020 21:37:28 +0000 (22:37 +0100)]
Update the invite-join test.
- Check that duplicate addresses get culled correctly.
- Check that we can add lots of extra invitation addresses, and that
they are in the expected order in the invitation URL.
Guus Sliepen [Tue, 10 Mar 2020 21:33:42 +0000 (22:33 +0100)]
Ensure we process all hostnames for invitation URLs.
Commit 3febbb4 allowed more addresses to be added to invitation URLs, but
part of the code still assumed a maximum of 4 addresses in the URL.
Guus Sliepen [Tue, 10 Mar 2020 21:05:57 +0000 (22:05 +0100)]
Fix potential double free when using meshlink_add_invitation_address().
Guus Sliepen [Fri, 6 Mar 2020 23:19:57 +0000 (00:19 +0100)]
Handle not being able to bind to the configured port at startup.
When starting a MeshLink node that has already been configured to run on a
certain port, but that port is in use (for one or more of the supported
address families), it would either ignore some address families, or would
try to bind to port 0 if all address families failed. However, this is
problematic, because it makes discovery and invitation URL generation much
harder.
Fix this by checking if any port binding fails for a supported address
family, and if so, try to find another port that does support binding on
all address families. If it fails to find any available port, it will fall
back to binding to port 0, so that outgoing connections are still possible.
Guus Sliepen [Fri, 6 Mar 2020 22:24:49 +0000 (23:24 +0100)]
Don't abort on empty lines in receive_request().
Remove the assertion that lines are not empty, since this could lead to
a DoS attack. Empty lines are already handled correctly by the rest of
the logic in receive_request().
Guus Sliepen [Fri, 6 Mar 2020 22:20:22 +0000 (23:20 +0100)]
Add meshlink_add_invitation_address(), deprecate meshlink_add_address().
This adds a function to add one or more application-controlled address and
port combinations to invitation URLs. It is meant to replace
meshlink_add_address(), which is too limited because it only allows one
address to be set, and doesn't allow a different port number to be set.
Guus Sliepen [Sat, 29 Feb 2020 15:19:44 +0000 (16:19 +0100)]
Add meshlink_set_external_address_discovery_url().
This function can be used to override the default meshlink.io service to
query a host's own externally visible address.
Guus Sliepen [Tue, 3 Mar 2020 19:32:35 +0000 (20:32 +0100)]
Use the first working outgoing socket during meshlink_join().
Guus Sliepen [Fri, 28 Feb 2020 19:09:11 +0000 (20:09 +0100)]
Avoid ports that are in use by not all address families.
It could happen that a port is bound by another application, but only
for some of the supported address families (i.e., only IPv4 but not IPv6).
We don't want MeshLink to then bind to the other address families, but
rather have it try another port altogether.
Guus Sliepen [Fri, 28 Feb 2020 18:25:52 +0000 (19:25 +0100)]
Further improve try_bind().
Make try_bind() do the same checks as add_listen_address() does: try to
create both a TCP and UDP socket on a given port for all address
families. If one address family succeeds for both TCP and UDP, consider
this a valid port.
Guus Sliepen [Tue, 25 Feb 2020 19:39:48 +0000 (20:39 +0100)]
Fix logic in try_bind().
Fix the check for successful socket creation. Also make sure we only
return success if we can bind to IPv4 and IPv6, but ignore other
network protocols.
Guus Sliepen [Sun, 23 Feb 2020 00:47:11 +0000 (01:47 +0100)]
Check that importing the same data twice is fine, but importing garbage is not.
Guus Sliepen [Sun, 23 Feb 2020 00:41:56 +0000 (01:41 +0100)]
Move assert()s that dereference a pointer to after the pointer NULL check.
Guus Sliepen [Sun, 23 Feb 2020 00:40:26 +0000 (01:40 +0100)]
Add missing NULL-check in meshlink_verify().
Guus Sliepen [Sun, 23 Feb 2020 00:38:24 +0000 (01:38 +0100)]
Update the blackbox join test cases.
Guus Sliepen [Sat, 22 Feb 2020 22:40:04 +0000 (23:40 +0100)]
Fix compilation error caused by ACX_THREAD
Guus Sliepen [Tue, 11 Feb 2020 21:28:24 +0000 (22:28 +0100)]
Make the join commit order configurable.
By default, when an invitee joins a mesh, it will commit its configuration
to disk first, then the inviter. This adds a function to reverse that order.
Guus Sliepen [Tue, 11 Feb 2020 20:39:36 +0000 (21:39 +0100)]
Fix a memory leak when an invitation file contains an invalid submesh name.
Found by Clang's static analyzer.
Guus Sliepen [Tue, 11 Feb 2020 20:37:33 +0000 (21:37 +0100)]
Move join state out of meshlink_handle_t, and ensure proper cleanup on errors.
Move the state we keep when calling meshlink_join() out of meshlink_handle_t
and just put it on the stack of meshlink_join(). Also make sure we properly
release allocated resources in all error conditions during a join.
Guus Sliepen [Sat, 8 Feb 2020 15:04:42 +0000 (16:04 +0100)]
Fix garbage being sent at start of a UDP channel.
If meshlink_channel_send() was called before a UDP channel had finished
the handshake, it caused UTCP to send garbage data.
Guus Sliepen [Sat, 8 Feb 2020 13:55:21 +0000 (14:55 +0100)]
Fall back to getifaddrs() to get an interface address if there is no default route.
When generating invitations, we try to find a suitable local interface
address by faking an outgoing connection to the Internet. However,
that doesn't work if there is no default route. In this case, fall back
to using getifaddrs() if that function is available, and filter out any
link-local and loopback addresses.
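The filtering step can be sketched as a predicate applied to each address returned by getifaddrs(). The function name is_usable_address() is hypothetical; the ranges checked are the standard IPv4 loopback (127.0.0.0/8) and link-local (169.254.0.0/16) blocks and their IPv6 counterparts.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/socket.h>

/* True if the address is neither loopback nor link-local. Address families
 * other than IPv4 and IPv6 are rejected. */
static bool is_usable_address(const struct sockaddr *sa) {
	assert(sa);

	if(sa->sa_family == AF_INET) {
		uint32_t a = ntohl(((const struct sockaddr_in *)sa)->sin_addr.s_addr);

		if((a >> 24) == 127) {
			return false; /* 127.0.0.0/8 loopback */
		}

		if((a >> 16) == 0xA9FE) {
			return false; /* 169.254.0.0/16 link-local */
		}

		return true;
	}

	if(sa->sa_family == AF_INET6) {
		const struct in6_addr *a6 = &((const struct sockaddr_in6 *)sa)->sin6_addr;
		return !IN6_IS_ADDR_LOOPBACK(a6) && !IN6_IS_ADDR_LINKLOCAL(a6);
	}

	return false;
}
```

A caller would walk the list from getifaddrs(), skip entries whose ifa_addr is NULL, and keep the first address for which this predicate returns true.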
Guus Sliepen [Thu, 6 Feb 2020 20:34:43 +0000 (21:34 +0100)]
Use bind() to check if a local address is still valid.
Some platforms don't support getifaddrs(). We use this to check if the
local address of a socket is still available on any network interface.
Instead, try to bind() a new socket to the same address (but port 0) as
existing sockets. If it returns EADDRNOTAVAIL, we know that this address
is no longer valid.
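A minimal sketch of the bind() check, limited to IPv4 for brevity. The function name ipv4_address_still_valid() is hypothetical, and treating "could not create a socket" as "assume still valid" is an assumption of this sketch, not something the commit states.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Try to bind a throwaway UDP socket to the given numeric IPv4 address on
 * port 0. Only EADDRNOTAVAIL is taken as proof that the address is gone. */
static bool ipv4_address_still_valid(const char *numeric) {
	assert(numeric);
	struct sockaddr_in sa;
	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = 0; /* let the kernel pick any free port */

	if(inet_pton(AF_INET, numeric, &sa.sin_addr) != 1) {
		return false; /* not a numeric IPv4 address at all */
	}

	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if(fd == -1) {
		return true; /* cannot tell; assume the address is still valid */
	}

	bool valid = true;

	if(bind(fd, (struct sockaddr *)&sa, sizeof(sa)) == -1 && errno == EADDRNOTAVAIL) {
		valid = false;
	}

	close(fd);
	return valid;
}
```

Binding to port 0 avoids conflicts with the existing socket, so any failure other than EADDRNOTAVAIL says nothing about whether the address is still assigned to an interface.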
Roop [Fri, 7 Feb 2020 05:31:45 +0000 (11:01 +0530)]
Fix Android (Android 6 or before) compilation issue around getifaddrs()
Guus Sliepen [Tue, 4 Feb 2020 22:11:34 +0000 (23:11 +0100)]
Clear reachability times in imported host config files.
This mirrors what we do with host config files received during a join.
Guus Sliepen [Mon, 3 Feb 2020 20:24:50 +0000 (21:24 +0100)]
Force -fPIC when compiling libcatta.
Guus Sliepen [Mon, 3 Feb 2020 16:43:50 +0000 (17:43 +0100)]
Clear reachability times in host config files received during a join.
When a node joins an existing mesh, it gets passed one or more host config
files from the inviter. However, these might contain non-zero reachability
times, but the invitee has never seen those nodes, so clear them before
storing the host config files.
Guus Sliepen [Mon, 3 Feb 2020 16:03:07 +0000 (17:03 +0100)]
Prevent meshlink_errno from being set incorrectly by meshlink_invite()
We called a public API function inside meshlink_invite() to check that we
don't try to invite a node that's already known. That causes it to set
meshlink_errno to MESHLINK_ENOENT. Fix this by calling lookup_node()
instead.
Guus Sliepen [Mon, 3 Feb 2020 15:24:41 +0000 (16:24 +0100)]
Fix spelling errors.
Found by codespell.
Guus Sliepen [Mon, 3 Feb 2020 15:11:36 +0000 (16:11 +0100)]
Fix reachability queries for blacklisted nodes.
Guus Sliepen [Mon, 3 Feb 2020 15:10:26 +0000 (16:10 +0100)]
Fix compiling with GCC 10.
Guus Sliepen [Wed, 29 Jan 2020 08:28:25 +0000 (09:28 +0100)]
Fix potential segmentation fault on iOS.
The PONG handler could call freeaddrinfo() on a struct that was not
allocated with getaddrinfo(). On most platforms this apparently works
fine, but on iOS it will try to free memory that wasn't allocated. Fix
this by moving the code to reset an outgoing_t to a separate function,
and calling that from the PONG handler.
Guus Sliepen [Mon, 27 Jan 2020 14:07:35 +0000 (15:07 +0100)]
Only let mesh->self be reachable when the mesh is started.
This ensures meshlink_node_get_reachability(mesh->self) returns true only
if the mesh has been started. It also handles reachability of self in
graph.c just like any other node. This means there will now also be a
node status callback generated when the mesh is started and stopped.
Guus Sliepen [Fri, 24 Jan 2020 20:08:01 +0000 (21:08 +0100)]
Sync host config file immediately after initial connect.
Guus Sliepen [Sun, 19 Jan 2020 23:45:09 +0000 (00:45 +0100)]
Add meshlink_get_node_reachability().
This function returns the current state of a node's reachability, as
well as the last time the node became reachable and the last time it
became unreachable.
Guus Sliepen [Mon, 13 Jan 2020 13:23:15 +0000 (14:23 +0100)]
Add a configurable fast connection retry period.
If no nodes are reachable, allow connections to retry once every second for a
per device-class configurable amount of time.
Guus Sliepen [Fri, 6 Dec 2019 22:01:46 +0000 (23:01 +0100)]
Remember the address used by an invitee.
When a new node uses an invitation successfully, store the address it
used to connect.
Guus Sliepen [Fri, 6 Dec 2019 21:58:29 +0000 (22:58 +0100)]
Remember the address used when connecting to an inviting node.
The inviter sends us its own host config file, which should be populated
with its known addresses. However, if a symbolic hostname is in the
invitation URL and it can resolve to multiple IP addresses, or if the IP
address associated with it is currently different from when the invitation
was generated, the address used to connect to the inviter might not be
present in its host config file. This could cause the invitation to succeed,
but then the nodes would fail to make a regular MeshLink connection.
Guus Sliepen [Fri, 6 Dec 2019 21:42:59 +0000 (22:42 +0100)]
Ensure all addresses in the invitation URL are also in the invitation file.
Guus Sliepen [Fri, 6 Dec 2019 20:50:02 +0000 (21:50 +0100)]
Prefer sockaddr_t over struct sockaddr_*.
This avoids a lot of pointer casts, and also fixes some problems with the
sockaddr length potentially being smaller than necessary.
Guus Sliepen [Fri, 6 Dec 2019 20:47:11 +0000 (21:47 +0100)]
Don't add duplicates to the list of recently seen addresses.
Duplicate addresses would be appended to the list, and could push out other
addresses. If the address already exists, only move it to the top if it is
not already there.
Also don't force an immediate write of the host config file when trying to
add an address that already exists.
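The move-to-top behaviour can be sketched with a fixed-size array of strings. MAX_RECENT, add_recent() and the use of plain strings for addresses are assumptions for illustration; the return value tells the caller whether the address was new, which is the only case where an immediate write of the host config file would be considered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_RECENT 5 /* hypothetical list size */

/* Put addr at the top of the list without creating duplicates.
 * Returns true only if addr was not in the list before. */
static bool add_recent(const char *recent[MAX_RECENT], const char *addr) {
	assert(recent && addr);
	size_t pos = MAX_RECENT - 1; /* a new address pushes out the last slot */
	bool is_new = true;

	for(size_t i = 0; i < MAX_RECENT; i++) {
		if(recent[i] && !strcmp(recent[i], addr)) {
			if(i == 0) {
				return false; /* already at the top, nothing to do */
			}

			pos = i;
			is_new = false;
			break;
		}
	}

	/* Shift entries 0..pos-1 down by one, overwriting slot pos. */
	memmove(&recent[1], &recent[0], pos * sizeof(recent[0]));
	recent[0] = addr;
	return is_new;
}
```

Adding an address that is already present only reorders the list, so no other address is pushed out and the function reports that nothing new was learned.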
Guus Sliepen [Sun, 1 Dec 2019 23:32:57 +0000 (00:32 +0100)]
Destroy new/ and old/ subdirectories when creating a new instance.
Guus Sliepen [Sun, 1 Dec 2019 22:56:10 +0000 (23:56 +0100)]
Add meshlink_get_all_nodes_by_last_reachable().
MeshLink now keeps track of when a node was last reachable. This can be
used by an application to detect nodes that were never reachable or which
have not been reachable for a certain amount of time.
Guus Sliepen [Sun, 1 Dec 2019 22:29:39 +0000 (23:29 +0100)]
Add a #define for the maximum number of tracked recently seen addresses.
Guus Sliepen [Thu, 28 Nov 2019 21:24:05 +0000 (22:24 +0100)]
Sync the base configuration directory at the end of meshlink_join().
While joining a mesh, we create a new current/ subdirectory. While the
contents were already synced to disk, we need to ensure the subdirectory
itself is also synced before returning.
Guus Sliepen [Thu, 28 Nov 2019 21:21:19 +0000 (22:21 +0100)]
Sync the base configuration directory after each subdirectory rename operation.
This ensures the proper ordering of the renames in the event of a crash.
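Syncing a directory after a rename is commonly done by opening the directory itself and calling fsync() on it. The sketch below shows that pattern on POSIX systems; sync_directory() and rename_and_sync() are hypothetical names, not MeshLink functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Flush a directory's own metadata (its list of entries) to disk. */
static bool sync_directory(const char *dirname) {
	assert(dirname);
	int fd = open(dirname, O_RDONLY);

	if(fd == -1) {
		return false;
	}

	bool ok = fsync(fd) == 0;
	close(fd);
	return ok;
}

/* Rename an entry inside basedir, then sync basedir so that this rename is
 * on disk before any later operation is started. */
static bool rename_and_sync(const char *basedir, const char *from, const char *to) {
	if(rename(from, to) != 0) {
		return false;
	}

	return sync_directory(basedir);
}
```

Calling rename_and_sync() once per step means that after a crash the directory reflects a prefix of the intended sequence of renames, rather than an arbitrary subset.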
Guus Sliepen [Thu, 28 Nov 2019 21:20:05 +0000 (22:20 +0100)]
Sync the base configuration directory after each call to config_destroy().
This guarantees proper ordering when deleting the current/, new/ and old/
subdirectories.
Guus Sliepen [Thu, 14 Nov 2019 20:48:02 +0000 (21:48 +0100)]
Fix logic error preventing fast update of reflexive address.
When we are trying to communicate with peers that don't know our
reflexive address, and we just learned our own one, we want to inform
those peers of it immediately, so they can send PMTU probes to the right
address. A logic error prevented this from happening in the common case.
Guus Sliepen [Mon, 11 Nov 2019 21:54:46 +0000 (22:54 +0100)]
Assert that nodes black/whitelisted by name persist after closing the mesh.
Guus Sliepen [Mon, 11 Nov 2019 21:49:05 +0000 (22:49 +0100)]
Add support for black/whitelisting by name, and forgetting nodes.