Running netbird client on Synology DSM
Introduction
I recently acquired an ARM-powered Synology NAS and looked for a way to securely connect to it from outside my home network. I settled on Netbird, a nice, self-hostable wrapper over WireGuard that also happens to have a mobile application.
I could also have used Tailscale, which has a native app for DSM, Synology's Linux-based operating system that the NAS comes with.
Disclaimer
Running the Netbird client involves running many commands as root, which can damage your DSM installation and/or prevent a smooth upgrade. Also, the Netbird client container will run with nearly all privileges and could be seen as an additional attack vector. And the firewall commands I use might not be enough to properly secure access to the NAS.
Besides, the situation (kernel version, Netbird client version, ...) might have changed by the time you read this, and most of these steps might no longer be needed or might even do more harm than good.
Follow along at your own risk and use your own judgement.
Running Netbird
The Netbird client is primarily written in Go and a Docker image is provided for aarch64, which is great since that is the architecture of my NAS.
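If you are unsure which architecture your NAS runs, a quick check over SSH settles it; mine reports aarch64:
uname -m
aarch64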
Pre-requisites
Install Container Manager.
Fetch the Netbird image as per Netbird's documentation (see the example below).
Generate a setup key on Netbird's website.
I'm assuming you have already connected a peer or two to your network (not required, but I expect you have tested that it works).
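For reference, fetching the image from the command line is a single pull of the same image used in the commands below; the Registry section of the Container Manager UI achieves the same thing:
docker pull netbirdio/netbird:latest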
Running
Netbird's documentation recommends this command:
docker run --rm --name PEER_NAME --hostname PEER_NAME --cap-add=NET_ADMIN --cap-add=SYS_ADMIN --cap-add=SYS_RESOURCE -d -e NB_SETUP_KEY=<SETUP KEY> -v netbird-client:/etc/netbird netbirdio/netbird:latest
And while you can map every option of the above command to the right toggle in the Container Manager's UI, do not bother: it does not work. Running the command as root from an SSH session does not work either.
Instead, you get the following error:
INFO client/internal/connect.go:119: starting NetBird client version 0.28.6 on linux/arm64
INFO iface/module_linux.go:76: couldn't access device /dev/net/tun, go error stat /dev/net/tun: no such file or directory, will attempt to load tun module, if running on container add flag --cap-add=NET_ADMIN
ERRO client/internal/engine.go:302: failed creating wireguard interface instance wt0: [couldn't check or load tun module]
ERRO client/internal/connect.go:263: error while starting Netbird Connection Engine: new wg interface: couldn't check or load tun module
Missing module
Netbird says it will try to load the tun module, but that does not seem to work from inside the container despite the NET_ADMIN capability being set. Thankfully, loading it beforehand on the host just works.
But that's not all. As mentioned in the error message, Netbird also needs access to the device /dev/net/tun. So we'll need to add --device /dev/net/tun to the command.
With these new commands, you should be able to run Netbird and it should be able to connect to other peers:
modprobe tun
docker run --rm --name PEER_NAME --hostname PEER_NAME --cap-add=NET_ADMIN --cap-add=SYS_ADMIN --cap-add=SYS_RESOURCE -e NB_SETUP_KEY=<SETUP KEY> -v netbird-client:/etc/netbird --device /dev/net/tun -d netbirdio/netbird:latest
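To confirm that the client actually registered, you can check the container logs or ask it directly; netbird status is part of the client shipped in the image, though its output format varies a bit between versions:
docker logs PEER_NAME
docker exec PEER_NAME netbird status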
Routing traffic
While other peers can connect to Netbird this way, they cannot access any of the services running on the NAS. The main reason is that they connect to the container, which is running in a virtual network, and therefore has no direct access to the services running on the host.
In theory, Netbird sets up iptables rules to direct the traffic it receives from the tunnel back to the host, but this fails.
Introducing iptables
It turns out the iptables binary that comes with the Netbird docker image is too "recent". Or rather, it expects the new kernel API for manipulating the packet routing and filtering tables, known as nftables. And indeed, if we run iptables -V on the host we get:
sh-4.4# iptables -V
iptables v1.8.3 (legacy)
Whereas in the Netbird container (here named netbird), we get:
sh-4.4# docker exec netbird iptables -V
iptables v1.8.10 (nf_tables)
Actually, the host seems to support nftables, but I have not found any way to make it work (the nft binary is missing). Nor did I try very hard. Still, it's fairly annoying, as we seem to be stuck in a situation where the iptables LOG target does not work:
sh-4.4# iptables -A INPUT -p tcp --dport 443 -j LOG
iptables: No chain/target/match by that name.
And that's despite my attempt to add xt_LOG to /proc/sys/net/netfilter/nf_log:
sh-4.4# cat /proc/sys/net/netfilter/nf_log/2
NONE
sh-4.4# modprobe xt_LOG
sh-4.4# echo xt_LOG >/proc/sys/net/netfilter/nf_log/2
sh: echo: write error: No such file or directory
sh-4.4# cat /proc/sys/net/netfilter/nf_log/2
NONE
Well, that's not mandatory, but it would be useful for diagnosis. If someone finds a workaround, do let me know.
Running the host's iptables inside Netbird's container
Anyway, the iptables present in the Netbird container won't do. So the solution is to run the host version of iptables from inside the Netbird container and vaguely reproduce the configuration Netbird intends to set up. Well, since I have full control over it, I'll actually change the configuration a bit, as I would like to expose only a select few services instead of everything, DSM's admin panel included.
But how to run a host binary from inside a docker container? Well, the obvious solution is to mount /bin and maybe /lib inside the container, and perhaps mess around with LD_LIBRARY_PATH. But that would be cheating. And it sounds annoying.
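For the record, the cheating version would have looked roughly like this; untested, assuming the image even ships a usable env, and with host paths that are pure guesses for DSM:
docker run ... -v /sbin:/host/sbin:ro -v /lib:/host/lib:ro ... netbirdio/netbird:latest
docker exec netbird env LD_LIBRARY_PATH=/host/lib /host/sbin/iptables -V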
Anatomy of a container
On Linux, containers are composed mainly of two things. In short: cgroups, which limit access to physical resources (in the broad sense, with many exceptions), and namespaces, which isolate processes along some given axis.
I cannot really expand much on the topic myself, as I only brushed the surface while searching for a solution to my problem. I'm sure you can find good resources online that go well beyond the vague and inaccurate description I just gave.
However, what I can tell you is that containers are isolated from one another at the network level using namespaces. So we need to run iptables in the same network namespace as the container.
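Every process exposes handles to its namespaces under /proc, and those handles are what we will use to jump into the container's network namespace:
ls -l /proc/self/ns/
Replace self with a container's PID (as reported by docker inspect) and you are looking at that container's namespaces.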
Enter nsenter
Here is nsenter(1) to save the day. On another machine, I tried the following:
(host)# docker run -d --rm --name httpd docker.io/library/httpd
<snip>
(host)# docker inspect httpd -f '{{.State.Pid}}'
87082
(host)# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
(host)# nsenter --net=/proc/87082/ns/net bash
(httpd)# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
(httpd)# iptables -A INPUT -m tcp -p tcp --dport 80 -j ACCEPT
(httpd)# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-A INPUT -p tcp -m tcp --dport 80 -j ACCEPT
(httpd)# exit
(host)# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
(host)# docker stop httpd
As we can see, adding rules to iptables after nsenter does modify the filtering table, but those changes are invisible once we are back on the host system. Therefore, we must have modified the container's filtering table.
Perfect, that'll do the job nicely. Let's try on the NAS:
sh-4.4# nsenter
sh: nsenter: command not found
Welp.
But the associated syscall should exist, otherwise Docker would not be able to work.
Zig to the rescue
I've been dabbling in Zig from time to time and it seems like the perfect fit for this issue. Basically, the code I need is already written, in C, at the end of the setns(2) man page; setns is the syscall used by nsenter, as its man page nicely mentions.
But I also need to cross-compile it for aarch64 from x86_64 and, ideally, without linking to libc, so as to avoid symbol lookup errors when I run the binary on the NAS.
Here is my code. But beware that I barely know Zig and there might be a few mistakes and non-idiomatic code in there.
const std = @import("std");
const builtin = @import("builtin");

pub fn print_usage(pname: [*:0]u8) void {
    std.log.err("Usage: {s} /proc/PID/ns/NS [CMD ARGS...]", .{pname});
}

// Compare a NUL-terminated argv string with a slice.
pub fn my_streql(lhs: [*:0]u8, rhs: []const u8) bool {
    for (rhs, 0..) |c, i| {
        if (c != lhs[i]) {
            return false;
        }
    }
    return lhs[rhs.len] == 0;
}

pub fn main() !void {
    // A small fixed buffer is plenty here and avoids depending on libc or a real heap.
    var buffer = [_]u8{0} ** (64 * 1024);
    var fb_allocator = std.heap.FixedBufferAllocator.init(&buffer);
    var arena_allocator = std.heap.ArenaAllocator.init(fb_allocator.allocator());
    defer arena_allocator.deinit();
    const allocator = arena_allocator.allocator();

    if (std.os.argv.len < 2) {
        print_usage(std.os.argv[0]);
        std.log.err("Missing argument /proc/PID/ns/NS", .{});
        std.process.exit(1);
        return;
    }
    if (my_streql(std.os.argv[1], "-h")) {
        print_usage(std.os.argv[0]);
        std.process.cleanExit();
        return;
    }

    // The command to execute: either the remaining arguments or, by default, $SHELL.
    var cmd: [][]u8 = undefined;
    if (std.os.argv.len == 2) {
        if (std.process.getEnvVarOwned(allocator, "SHELL")) |shell| {
            cmd = try allocator.alloc([]u8, 1);
            cmd[0] = shell;
        } else |_| {
            std.log.err("No command specified and could not access the SHELL env var", .{});
            std.process.exit(1);
            return;
        }
    } else {
        cmd = try allocator.alloc([]u8, std.os.argv.len - 2);
        for (cmd, 0..) |*e, i| {
            e.* = std.mem.sliceTo(std.os.argv[i + 2], 0);
        }
    }

    // Open the namespace file, e.g. /proc/PID/ns/net.
    const fname = std.os.argv[1];
    const fd = std.os.linux.openat(std.os.linux.AT.FDCWD, fname, .{}, 0);
    if (fd > (1 << 32)) {
        // On failure, openat returns -errno encoded in a usize, hence the large value.
        std.log.err("Error opening file at {s}", .{fname});
        std.process.exit(1);
    }

    // setns(fd, 0): join the namespace referred to by fd, whatever its type.
    const ret = std.os.linux.syscall2(.setns, fd, 0);
    if (ret != 0) {
        std.log.err("Failed to move to namespace described by {s}", .{fname});
        std.process.exit(1);
        return;
    }

    std.process.execv(allocator, cmd) catch {
        std.log.err("Failed to execute command:", .{});
        for (cmd) |e| {
            std.log.err(" {s}", .{e});
        }
        std.log.err("\n", .{});
        std.process.exit(1);
    };
}
And a single
$ zig build -Dtarget=aarch64-linux-musl --release=safe
later, and I have my own nsenter.
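As a sanity check, the resulting binary should be statically linked; assuming the build step names it my_nsenter, it can then be copied straight to the NAS (the destination folder is the one used by the script later on, and the hostname is obviously mine):
file zig-out/bin/my_nsenter
scp zig-out/bin/my_nsenter admin@nas:/volume1/admin/bin/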
Routing traffic (for real this time)
After getting slightly sidetracked, let's get back to the matter at hand: we need to route the traffic that comes out of the tunnel to the host. However, as I've mentioned, I do not want to route everything; I only want to expose a select few services.
There are apparently many ways to do this. One could, for example, set up another tunnel with proper filtering on the host this time, most likely in the DOCKER-USER chain.
I chose to use the SNAT and DNAT features of iptables as they seem the easiest. But that won't suffice: I also need to tell the host how to reply to the peers over the VPN. So I'll also add a route on the host towards the Netbird container.
Basically, this is the mess I created for myself:
This is achieved by running the following on the host:
ip route add 100.72.0.0/16 via 172.17.0.3
And the following in the network namespace of the container:
my_nsenter /proc/$(docker inspect netbird -f '{{.State.Pid}}')/ns/net bash
iptables --wait -t nat -N MINE-IN
iptables --wait -t nat -N MINE-OUT
iptables --wait -t nat -A MINE-IN -s 100.72.0.0/16 -d 100.72.184.98/32 -p tcp -m tcp --dport 443 -j DNAT --to-destination 172.17.0.1:443
iptables --wait -t nat -A MINE-OUT -s 172.17.0.1/32 -d 100.72.0.0/16 -p tcp -m tcp --sport 443 -j SNAT --to-source 100.72.184.98:443
iptables --wait -t nat -A PREROUTING -j MINE-IN
iptables --wait -t nat -A INPUT -j MINE-OUT
iptables --wait -t nat -A OUTPUT -j MINE-IN
iptables --wait -t nat -A POSTROUTING -j MINE-OUT
This allows peers connecting over HTTPS to my NAS's Netbird IP to reach the web server running on my NAS. And nothing else (hopefully).
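A quick test from another peer on the Netbird network confirms it: plain HTTPS to the NAS's Netbird IP reaches DSM's web server (the -k is only there because of my self-signed certificate):
curl -k https://100.72.184.98/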
The complete script
In order to have Netbird's client run and connect when the NAS starts up, I ended up adding the following script, triggered on boot:
#!/usr/bin/env bash
set -euo pipefail

die() {
    echo "$@"
    exit 1
}

is_container_running() {
    if docker inspect "${CONTAINER_NAME}" -f '{{.State.Running}}' 2>/dev/null | grep -q true; then
        return 0
    else
        return 1
    fi
}

CONTAINER_NAME=netbird

# setup environment for the container to run properly
modprobe tun

# wait for the docker command to be available
count=0
until docker ps -q -l >/dev/null 2>/dev/null; do
    if [ $count -gt 20 ]; then
        die "docker still not available"
    fi
    count=$((count+1))
    sleep 6
done

docker run --rm --name "${CONTAINER_NAME}" --hostname "${CONTAINER_NAME}" --cap-add=NET_ADMIN --cap-add=SYS_ADMIN --cap-add=SYS_RESOURCE -v /volume1/docker/netbird:/etc/netbird --device /dev/net/tun -d netbirdio/netbird:latest

# IP of the container on the default docker bridge (one way to get it; adjust if yours is attached differently)
container_ip=""
container_ip="$(docker inspect "${CONTAINER_NAME}" -f '{{.NetworkSettings.IPAddress}}')"

# retrieve the netbird_ip, which requires a connection to the coordinator and might take time
netbird_ip=""
while [ -z "$netbird_ip" ] && is_container_running; do
    # "netbird status" prints a line like "NetBird IP: 100.72.184.98/16"; adjust the parsing to your client version
    netbird_ip="$(docker exec "${CONTAINER_NAME}" netbird status 2>/dev/null | awk '/NetBird IP:/ {print $3}' || true)"
    netbird_ip="$(echo "${netbird_ip}" | cut -d/ -f1)"
    [ -z "$netbird_ip" ] && sleep 5
done
[ -z "$netbird_ip" ] && die "Could not find netbird's IP (did the container die?)"

netbird_pid="$(docker inspect "${CONTAINER_NAME}" -f '{{.State.Pid}}')"
[ -z "$netbird_pid" ] && die "Could not find netbird's PID"

# the host is the .1 of the docker bridge, and the netbird network is a /16
host_docker_ip="$(echo "${container_ip}" | cut -d. -f1-3)".1
nb_netmask="$(echo "${netbird_ip}" | cut -d. -f1-2)".0.0/16

nb_iptables() {
    /volume1/admin/bin/my_nsenter "/proc/${netbird_pid}/ns/net" iptables "$@" --wait
}

nb_iptables -t nat -N MINE-IN
nb_iptables -t nat -N MINE-OUT
for port in 443 3434; do
    nb_iptables -t nat -A MINE-IN -s "${nb_netmask}" -d "${netbird_ip}"/32 -p tcp -m tcp --dport "${port}" -j DNAT --to-destination "${host_docker_ip}":"${port}"
    nb_iptables -t nat -A MINE-OUT -s "${host_docker_ip}"/32 -d "${nb_netmask}" -p tcp -m tcp --sport "${port}" -j SNAT --to-source "${netbird_ip}":"${port}"
done
nb_iptables -t nat -A PREROUTING -j MINE-IN
nb_iptables -t nat -A INPUT -j MINE-OUT
nb_iptables -t nat -A OUTPUT -j MINE-IN
nb_iptables -t nat -A POSTROUTING -j MINE-OUT

if ! ip route | grep -q "$container_ip"; then
    # need to create a route to netbird's subnet via netbird's container
    ip route add "${nb_netmask}" via "${container_ip}"
fi
You'll notice a loop whose purpose is to wait for docker to become available. Apparently, the "boot trigger" from the DSM interface fires before applications are "mounted", so to speak. And since docker is part of the Container Manager application, it only becomes available some time after the script has started.
There is also a reference to my_nsenter: this is the binary produced from the Zig code above, installed in an admin/bin folder on the NAS.