T4930: Allow WireGuard peers via DNS hostname #4200

Open
wants to merge 1 commit into base: current

sskaje
Contributor

@sskaje sskaje commented Nov 20, 2024

Change Summary

T4930: Allow WireGuard peers via DNS hostname + new script resetting peer
T4930: Ensure peer is created even if dns not working
T4930: limit wg retry times by using environment variable
T4930: make wg dns retry configurable through interfaces wireguard wgX max-dns-retry

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes)
  • Migration from an old Vyatta component to vyos-1x, please link to related PR inside obsoleted component
  • Other (please describe):

Related Task(s)

https://vyos.dev/T4930

Related PR(s)

Component(s) name

wireguard

Proposed changes

  1. Support a DNS hostname as the WireGuard peer endpoint address, and make sure wg retries DNS resolution no more than 5 times;
  2. Introduce an op-mode command, reset wireguard, for users who want to use wg set to force WireGuard to redo DNS resolution;
  3. Detailed changes, such as splitting WireGuard peer creation into peer creation + endpoint setup, with exception handling (help needed on correct exception/log handling; I could not find it in docs.vyos.io).
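
The split between peer creation and endpoint setup can be sketched as two separate `wg set` invocations. This is only an illustration, not the code in this PR: the helper name `build_wg_commands` and the sample values are hypothetical, and the helper only builds the commands instead of running them. `wg set <interface> peer <key> endpoint <host>:<port>` and the `WG_ENDPOINT_RESOLUTION_RETRIES` environment variable are the wireguard-tools pieces discussed in this PR.

```python
import os


def build_wg_commands(ifname, public_key, endpoint, max_retries=3):
    """Return (peer_cmd, endpoint_cmd, env) for a two-step peer setup.

    Hypothetical helper for illustration; it only builds the commands and
    does not execute them.
    """
    # Step 1: create the peer without an endpoint, so it exists even if DNS is down.
    peer_cmd = ['wg', 'set', ifname, 'peer', public_key]
    # Step 2: set the endpoint separately; this is the call that may need DNS.
    endpoint_cmd = peer_cmd + ['endpoint', endpoint]
    # wg reads this variable to limit how often it retries name resolution.
    env = dict(os.environ, WG_ENDPOINT_RESOLUTION_RETRIES=str(max_retries))
    return peer_cmd, endpoint_cmd, env


# Placeholder values, not taken from a real configuration.
peer_cmd, endpoint_cmd, env = build_wg_commands(
    'wg0', 'XXXX=', 'peer.example.com:51820', max_retries=3)
print(' '.join(peer_cmd))
print(' '.join(endpoint_cmd))
```

Each command could then be passed to something like `subprocess.run(cmd, env=env)`, so that a failure in step 2 can be logged without losing the peer created in step 1.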

How to test

config mode

vyos@vyos# set interfaces wireguard wg0 peer wg0-xxx address t2.vm.xxx.xxx
[edit]
vyos@vyos# commit
[edit]
vyos@vyos#

op mode

# reset all peers on wg0
vyos@vyos:~$ reset wireguard interface wg0
Resetting wg0 peer YYYY= endpoint to t1.vm.xxx.xxx:ppppp ... done
Resetting wg0 peer XXXX= endpoint to t2.vm.xxxx.xxx:ppppp ... done

# reset single peer on wg0
vyos@vyos:~$ reset wireguard interface wg0  peer wg0-xxx
Resetting wg0 peer XXXX= endpoint to t2.vm.xxxx.xxx:ppppp ... done

VyOS without working DNS

I've provided screenshots in the task's comments.

Configure the maximum number of DNS resolution retries

vyos@vyos# set interfaces wireguard wg0
Possible completions:
+  address              IP address
   description          Description
   disable              Administratively disable interface
   domain-name          Endpoint Domain Name
   fwmark               A 32-bit fwmark value set on all outgoing packets (default: 0)
 > ip                   IPv4 routing parameters
 > ipv6                 IPv6 routing parameters
   max-dns-retry        Max retry when DNS resolves failed. (default: 3)
 > mirror               Mirror ingress/egress packets
   mtu                  Maximum Transmission Unit (MTU) (default: 1420)
+> peer                 peer alias
   per-client-thread    Process traffic from each client in a dedicated thread
   port                 Port number used by connection
   private-key          Base64 encoded private key
   redirect             Redirect incoming packet to destination
   vrf                  VRF instance name


[edit]
vyos@vyos# set interfaces wireguard wg0 max-dns-retry
Possible completions:
   <1-15>               Max retry times



[edit]
vyos@vyos# set interfaces wireguard wg0 max-dns-retry 3
[edit]
vyos@vyos#

Smoketest result

root@vyos:/home/vyos# python3 /usr/libexec/vyos/tests/smoke/cli/test_interfaces_wireguard.py
test_01_wireguard_peer (__main__.WireGuardInterfaceTest.test_01_wireguard_peer) ... ok
test_02_wireguard_add_remove_peer (__main__.WireGuardInterfaceTest.test_02_wireguard_add_remove_peer) ... ok
test_03_wireguard_same_public_key (__main__.WireGuardInterfaceTest.test_03_wireguard_same_public_key) ... ok
test_04_wireguard_threaded (__main__.WireGuardInterfaceTest.test_04_wireguard_threaded) ... ok
test_05_wireguard_peer_pubkey_change (__main__.WireGuardInterfaceTest.test_05_wireguard_peer_pubkey_change) ... ok

----------------------------------------------------------------------
Ran 5 tests in 52.648s

OK
root@vyos:/home/vyos#

Checklist:

  • I have read the CONTRIBUTING document
  • I have linked this PR to one or more Phabricator Task(s)
  • I have run the components SMOKETESTS if applicable
  • My commit headlines contain a valid Task id
  • My change requires a change to the documentation
  • I have updated the documentation accordingly

github-actions bot commented Nov 20, 2024

👍
No issues in PR Title / Commit Title

@sever-sever
Member

sever-sever commented Nov 20, 2024

Build package fails (based on CI)

PYTHONPATH=python/ python3 -m "nose" --with-xunit src --with-coverage --cover-erase --cover-xml --cover-package src/conf_mode,src/op_mode,src/completion,src/helpers,src/validators,src/tests --verbose
*** Error compiling './python/vyos/ifconfig/wireguard.py'...
  File "./python/vyos/ifconfig/wireguard.py", line 179
    f'Resetting {self.config['ifname']} peer {public_key} endpoint to {address}:{port} ... ',
                              ^^^^^^
SyntaxError: f-string: unmatched '['

make[2]: *** [Makefile:89: test] Error 1
make[2]: Leaving directory '/__w/vyos-1x/vyos-1x/packages/vyos-1x'
make[1]: *** [debian/rules:31: override_dh_auto_build] Error 2
make[1]: Leaving directory '/__w/vyos-1x/vyos-1x/packages/vyos-1x'
make: *** [debian/rules:21: build] Error 2
dpkg-buildpackage: error: debian/rules build subprocess returned exit status 2
Error: Process completed with exit code

@sskaje
Contributor Author

sskaje commented Nov 20, 2024

Updated.

My IDE was set to Python 3.12; that syntax is accepted in 3.12 but not in 3.11 XD
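
For context, the difference comes from PEP 701: Python 3.12 allows the enclosing quote character to be reused inside an f-string replacement field, while 3.11 and earlier reject it. Below is a small sketch of two spellings that compile on both versions; the values are placeholders, not the PR's actual data.

```python
config = {'ifname': 'wg0'}
public_key = 'XXXX='
address, port = 'peer.example.com', 51820

# Option 1: use a different quote character inside the replacement field.
msg_a = f"Resetting {config['ifname']} peer {public_key} endpoint to {address}:{port} ... "

# Option 2: pull the lookup out into a local variable first.
ifname = config['ifname']
msg_b = f'Resetting {ifname} peer {public_key} endpoint to {address}:{port} ... '

# Both spellings build the same string.
assert msg_a == msg_b
print(msg_a + 'done')
```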

Member

@sarthurdev sarthurdev left a comment

Does the WG_ENDPOINT_RESOLUTION_RETRIES var block execution while waiting to resolve?

@sskaje
Contributor Author

sskaje commented Nov 20, 2024

Does the WG_ENDPOINT_RESOLUTION_RETRIES var block execution while waiting to resolve?

Sorry, I don't understand your question.

WG_ENDPOINT_RESOLUTION_RETRIES is an environment variable used by the wg command; the source is here:

wg reads env:
https://git.zx2c4.com/wireguard-tools/tree/src/config.c#n180

wg assigns as local variable:
https://git.zx2c4.com/wireguard-tools/tree/src/config.c#n199

wg loops and retries:
https://git.zx2c4.com/wireguard-tools/tree/src/config.c#n259

By default, wg retries 15 times, and the delay grows to 1.2x the previous delay after each failure.
This makes wg keep waiting for around 1 minute until DNS resolves or all attempts have failed.
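
As a rough check of the "around 1 minute" figure, the total wait is a geometric sum of the individual delays. The sketch below assumes an initial delay of 1 second and no upper bound on a single delay; neither value is stated in this thread, so they should be verified against the linked config.c before relying on the numbers.

```python
def total_backoff_seconds(retries, initial_delay=1.0, factor=1.2):
    """Sum the sleeps of `retries` failed attempts with multiplicative backoff.

    initial_delay is an assumption for illustration; factor follows the
    1.2x growth mentioned above.
    """
    total, delay = 0.0, initial_delay
    for _ in range(retries):
        total += delay
        delay *= factor
    return total


for retries in (3, 5, 15):
    print(f'{retries:2d} retries: about {total_backoff_seconds(retries):.1f} s')
```

Under these assumptions, 15 retries add up to roughly 72 seconds, 5 retries to about 7.4 seconds, and 3 retries stay under 4 seconds.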

BTW, reset-wireguard.xml.in has been simplified following your tips; please check whether there is anything else I didn't get right.
That was AI-generated code, because my earlier attempts at completions failed.

@sarthurdev
Member

By default, wg retries 15 times, and the delay grows to 1.2x the previous delay after each failure.
This makes wg keep waiting for around 1 minute until DNS resolves or all attempts have failed.

That's what I was concerned about; we want to avoid long boot/commit times when DNS resolution is not available.

@sskaje
Contributor Author

sskaje commented Nov 20, 2024

By default, wg retries 15 times, and the delay grows to 1.2x the previous delay after each failure.
This makes wg keep waiting for around 1 minute until DNS resolves or all attempts have failed.

That's what I was concerned about; we want to avoid long boot/commit times when DNS resolution is not available.

Yes, that's why I set it to 5, and I still feel it costs too much time.

How about 3? Or 3 by default, and let the user customize it somewhere like set interfaces wireguard wg0 max-dns-retry?

@sskaje
Contributor Author

sskaje commented Nov 21, 2024

max-dns-retry added, limit 1-15.

vyos@vyos# set interfaces wireguard wg0
Possible completions:
+  address              IP address
   description          Description
   disable              Administratively disable interface
   domain-name          Endpoint Domain Name
   fwmark               A 32-bit fwmark value set on all outgoing packets (default: 0)
 > ip                   IPv4 routing parameters
 > ipv6                 IPv6 routing parameters
   max-dns-retry        Max retry when DNS resolves failed. (default: 3)
 > mirror               Mirror ingress/egress packets
   mtu                  Maximum Transmission Unit (MTU) (default: 1420)
+> peer                 peer alias
   per-client-thread    Process traffic from each client in a dedicated thread
   port                 Port number used by connection
   private-key          Base64 encoded private key
   redirect             Redirect incoming packet to destination
   vrf                  VRF instance name


[edit]
vyos@vyos# set interfaces wireguard wg0 max-dns-retry
Possible completions:
   <1-15>               Max retry times



[edit]
vyos@vyos# set interfaces wireguard wg0 max-dns-retry 3
[edit]
vyos@vyos#

@sskaje
Contributor Author

sskaje commented Nov 29, 2024

Hi reviewers,

Still need help:

  1. Exception handling in WireGuardIf.update(): should I use print(), or is there a best practice in VyOS?
  2. The vyos-domain-resolver part in interfaces_wireguard: I need help with the logic. As I read the nat and firewall code, the service keeps running as long as ip_fqdn or ip6_fqdn is not empty. Does that mean I should check all working peers, build a similar object storing the peers that use a domain, and create /run/use-vyos-domain-resolver-* only if this object is not empty?
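
The logic asked about in item 2 could look roughly like the sketch below: collect the peers whose address is not an IP literal, write a per-interface flag file when that set is non-empty, and remove it otherwise. The helper names and the `peers` mapping are hypothetical and do not reflect the real VyOS config dict; a temporary directory stands in for /run.

```python
import ipaddress
import tempfile
from pathlib import Path


def is_ip_address(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def sync_resolver_flag(run_dir, ifname, peers):
    """Create or remove the per-interface flag file; return peers needing DNS.

    `peers` maps a peer name to its configured endpoint address
    (hypothetical structure used only for this illustration).
    """
    need_resolve = sorted(name for name, addr in peers.items()
                          if not is_ip_address(addr))
    flag = Path(run_dir) / f'use-vyos-domain-resolver-interfaces-wireguard-{ifname}'
    if need_resolve:
        flag.write_text(''.join(f'{name}\n' for name in need_resolve))
    else:
        # No hostname-based peers left: drop the flag so the service can stop.
        flag.unlink(missing_ok=True)
    return need_resolve


with tempfile.TemporaryDirectory() as run_dir:  # stands in for /run
    print(sync_resolver_flag(run_dir, 'wg0',
                             {'a': 'peer.example.com', 'b': '192.0.2.1'}))
    print(sync_resolver_flag(run_dir, 'wg0', {'a': '192.0.2.1'}))
```

Anything that does not parse as an IP address is treated as a hostname here, which may be too permissive for real input validation.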

@dmbaturin
Member

The current design seems promising, we'll get back to you with comments about implementation if we find anything that we think we can improve.

@c-po
Member

c-po commented Dec 27, 2024

Hi @sskaje,

Thanks for the PR! I did a test drive with this implementation and in general it works for me.
Can you please address the open requested changes so we can get this merged?

Please also rebase your work on current, as it's pretty far behind.

@sskaje
Contributor Author

sskaje commented Dec 27, 2024

I'm trying to build and run the tests, but I see frr (>= 10.2) is now required because of T6746, and the latest 1.5-rolling-202412160007 doesn't have frr >= 10.2.

Any idea how I can build and run the tests? Revert the code from T6746, or just update debian/control?

=============

Updated: I modified debian/control on my VyOS VM.

@c-po
Member

c-po commented Dec 28, 2024

FRR 10.2 is part of the official VyOS repositories now. You do not need to build it on your own.

You could grab a fresh build from e.g. https://github.com/vyos/vyos-nightly-build/actions/runs/12521976266/artifacts/2367044381

It contains some minor smoketest issues but would be fine for your development. If you're uncertain and would rather NOT rebase, that would work, too - we can rebase at the end after all changes have been applied.

@c-po
Member

c-po commented Dec 29, 2024

@sskaje you can also force package installation and omit the FRR 10.2 dependency

dpkg --install --force-all *.deb

@c-po c-po requested a review from sarthurdev December 29, 2024 09:05
if 'peers_need_resolve' in wireguard and len(wireguard['peers_need_resolve']) > 0:
    text = f'# Automatically generated by interfaces_wireguard.py\nThis file indicates that vyos-domain-resolver service is used by the interfaces_wireguard.\n'
    text += "intefaces:\n" + "".join([f" - {peer}\n" for peer in wireguard['peers_need_resolve']])
    Path(domain_resolver_usage).write_text(text)
Member

@c-po c-po Dec 29, 2024

Please use our common vyos.utils.file.write_file() instead.

Other than that, it formally looks good to me - waiting for some more reviewers for this piece of code and another real-world test from me.

Member

@c-po c-po Dec 29, 2024

Dec 29 10:29:43 LR1.wue3 python3[2210]: Resetting wg1000 peer BMj7LgeZS1aoZRQ6rweeKpz+9+HdicT0F590f+7iDH4= endpoint to wg-dynamic.vyos.io:10000 ... done
Dec 29 10:29:43 LR1.wue3 python3[2210]: Wireguard: reset wg1000 peer BMj7LgeZS1aoZRQ6rweeKpz+9+HdicT0F590f+7iDH4=

Working

Contributor Author

Should I make this file-writing change?

@c-po
Member

c-po commented Dec 29, 2024

Bug:

When moving from a FQDN address back to an IP address, vyos-domain-resolver is not stopped.

@sskaje
Contributor Author

sskaje commented Dec 30, 2024

Bug:

When moving from a FQDN address back to an IP address, vyos-domain-resolver is not stopped.

I tested on my VM: when I move it back to an IP address, /run/use-vyos-domain-resolver-interfaces-wireguard-wgX is gone, and vyos-domain-resolver is stopped here.

-rw-r--r--  1 root     vyattacfg    184 Dec 30 10:17 use-vyos-domain-resolver-interfaces-wireguard-wg0
-rw-r--r--  1 root     vyattacfg    168 Dec 30 10:17 use-vyos-domain-resolver-interfaces-wireguard-wg2
-rw-r--r--  1 root     vyattacfg    182 Dec 30 10:17 use-vyos-domain-resolver-interfaces-wireguard-wg4

vyos@vyos# set interfaces wireguard wg2 peer wg2-test address 1.1.1.1
[edit]
vyos@vyos# commit
[edit]
vyos@vyos# ls -al /run/|grep use-vyos
-rw-r--r--  1 root     vyattacfg    184 Dec 30 10:17 use-vyos-domain-resolver-interfaces-wireguard-wg0
-rw-r--r--  1 root     vyattacfg    182 Dec 30 10:17 use-vyos-domain-resolver-interfaces-wireguard-wg4
[edit]
vyos@vyos# ps -ef|grep resolver
root       26900       1  0 10:19 ?        00:00:00 /usr/bin/python3 -u /usr/libexec/vyos/vyos-domain-resolver.py
vyos       27026   24131  0 10:19 pts/4    00:00:00 grep resolver
[edit]
vyos@vyos# delete interfaces wireguard wg0
[edit]
vyos@vyos# delete interfaces wireguard wg4
[edit]
vyos@vyos# commit
[edit]
vyos@vyos# ps -ef|grep resolver
vyos       27168   24131  0 10:20 pts/4    00:00:00 grep resolver
[edit]
vyos@vyos# set interfaces wireguard wg2 peer wg2-test address x.x.x.x
[edit]
vyos@vyos# commit
[edit]
vyos@vyos# ps -ef|grep resolver
root       27279       1  4 10:20 ?        00:00:00 /usr/bin/python3 -u /usr/libexec/vyos/vyos-domain-resolver.py
vyos       27350   24131  0 10:20 pts/4    00:00:00 grep resolver
[edit]
vyos@vyos# set interfaces wireguard wg2 peer wg2-test address 1.1.1.1
[edit]
vyos@vyos# commit
[edit]
vyos@vyos# ps -ef|grep resolver
vyos       27512   24131  0 10:20 pts/4    00:00:00 grep resolver
[edit]

Can you please check whether there are any other /run/use-vyos-domain-resolver-* files on your side?

@c-po
Member

c-po commented Dec 30, 2024

I tested on my VM: when I move it back to an IP address, /run/use-vyos-domain-resolver-interfaces-wireguard-wgX is gone, and vyos-domain-resolver is stopped here.

In your example, vyos-domain-resolver is started again after you set interfaces wireguard wg2 peer wg2-test address x.x.x.x - I think this is invalid.

@c-po
Member

c-po commented Dec 30, 2024

This branch has conflicts that must be resolved

  • src/helpers/vyos-domain-resolver.py

This pull request has conflicts, please resolve those before we can evaluate the pull request.

T4930: print previous/current endpoint when resetting peer
T4930: resolve code change requests
T4930: Fix ConfigTreeQuery usages
T4930: resolve code change requests
T4930: make wireguard domain resolver run flag files separated by interface; code style
T4930: Move dns resolution to vyos-domain-resolver
T4930: make wg dns retry configurable through `interfaces wireguard wgX max-dns-retry`
T4930: simplify reset-wireguard.xml.in
T4930: code style changes for python 3.11
T4930: code style changes
T4930: code style changes
T4930: limit wg retry times by using environment variable
T4930: Ensure peer is created even if dns not working
T4930: Allow WireGuard peers via DNS hostname + new script resetting peer
Conflicts have been resolved. A maintainer will review the pull request shortly.

CI integration 👍 passed!

Details

CI logs

  • CLI Smoketests (no interfaces) 👍 passed
  • CLI Smoketests (interfaces only) 👍 passed
  • Config tests 👍 passed
  • RAID1 tests 👍 passed
  • TPM tests 👍 passed

Labels
Development