681 Commits

Author SHA1 Message Date
Or Mergi
7c122fabb4 bridge: Add option to enable port isolation
Enable the bridge CNI plugin to set port-isolation [1] on the interface.
When port-isolation is enabled, containers connected to the network
cannot communicate with each other over the linux-bridge.
Communication will be enabled depending on the gateway appliance, according
to its restrictions / policies.

For example: in a scenario where the environment is connected to a smart switch,
enabling port-isolation ensures traffic goes outbound, allowing the
smart switch to route the traffic according to its policies.

Add "portIsolation" flag to bridge plugin.
When true, configure the node interface with port-isolation [1].
Default is false.
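
A minimal sketch of how the flag might be decoded from a network configuration. Only the "portIsolation" key comes from this message; the `bridgeConf` struct and `parseBridgeConf` helper are hypothetical and much smaller than the real plugin configuration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bridgeConf is a hypothetical, trimmed-down stand-in for the bridge
// plugin configuration. Only the "portIsolation" key is taken from the
// commit message.
type bridgeConf struct {
	Name          string `json:"name"`
	Type          string `json:"type"`
	PortIsolation bool   `json:"portIsolation"`
}

// parseBridgeConf decodes a network configuration. A missing
// "portIsolation" key leaves the field at its zero value, false.
func parseBridgeConf(data []byte) (bridgeConf, error) {
	var c bridgeConf
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	isolated, _ := parseBridgeConf([]byte(`{"name":"net1","type":"bridge","portIsolation":true}`))
	plain, _ := parseBridgeConf([]byte(`{"name":"net2","type":"bridge"}`))
	fmt.Println(isolated.PortIsolation, plain.PortIsolation)
}
```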

[1] https://man7.org/linux/man-pages/man8/bridge.8.html (see "isolated" option)

Signed-off-by: Or Mergi <ormergi@redhat.com>
2025-01-29 16:10:47 +01:00
Etienne Champetier
7f756b411e portmap: fix iptables conditions detection
As shown in the docs, iptables conditions can also start with '!'.
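
A rough illustration of the kind of check involved, assuming detection looks at the first token of the condition list. The `looksLikeIptables` helper is hypothetical and is not the plugin's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeIptables is a hypothetical helper: it reports whether a
// condition list appears to use iptables syntax, i.e. whether its first
// token starts with '-' (an option such as "-s") or '!' (a negation).
func looksLikeIptables(conditions []string) bool {
	if len(conditions) == 0 {
		return false
	}
	first := strings.TrimSpace(conditions[0])
	return strings.HasPrefix(first, "-") || strings.HasPrefix(first, "!")
}

func main() {
	fmt.Println(looksLikeIptables([]string{"!", "-d", "192.0.2.0/24"}))
	fmt.Println(looksLikeIptables([]string{"-s", "192.0.2.0/24"}))
	fmt.Println(looksLikeIptables([]string{"ip", "saddr", "192.0.2.0/24"}))
}
```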

Fixes 01a94e17c77e6ff8e5019e15c42d8d92cf87194f

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2024-12-02 17:06:11 +01:00
Etienne Champetier
6de8a9853c ipmasq: fix nftables backend
Rename
SetupIPMasqForNetwork -> SetupIPMasqForNetworks
TeardownIPMasqForNetwork -> TeardownIPMasqForNetworks
and have them take []*net.IPNet instead of *net.IPNet.

This allows the nftables backend to clean up stale rules and recreate all
needed rules in a single transaction; previously, the stale-rule
cleanup broke all but the last IPNet.

Fixes 61d078645a6d2a2391a1555ecda3d0a080a45831

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2024-11-21 20:23:25 +01:00
Etienne Champetier
9296c5f80a portmap: fix nftables backend
We can't use dnat from the input hook:
depending on the nftables (and kernel?) version, we get
"Error: Could not process rule: Operation not supported".
The iptables backend also uses prerouting.

Also, 'ip6 protocol tcp' is invalid, so rework / simplify the rules.

Fixes 01a94e17c77e6ff8e5019e15c42d8d92cf87194f

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2024-11-18 17:04:37 +01:00
Lionel Jouin
fec2d62676 Pass status along ipam update
Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-10-15 10:22:10 +02:00
Songmin Li
a4fc6f93c7 feat(dhcp): Cancel backoff retry on stop
Signed-off-by: Songmin Li <lisongmin@protonmail.com>
2024-10-14 17:42:30 +02:00
Songmin Li
d61e7e5e1f fix(dhcp): can not renew an ip address
The dhcp server is systemd-networkd, and the dhcp
plugin can request an IP but cannot renew it.
systemd-networkd simply ignores the renew request.

```
2024/09/14 21:46:00 no DHCP packet received within 10s
2024/09/14 21:46:00 retrying in 31.529038 seconds
2024/09/14 21:46:42 no DHCP packet received within 10s
2024/09/14 21:46:42 retrying in 63.150490 seconds
2024/09/14 21:47:45 98184616c91f15419f5cacd012697f85afaa2daeb5d3233e28b0ec21589fb45a/iot/eth1: no more tries
2024/09/14 21:47:45 98184616c91f15419f5cacd012697f85afaa2daeb5d3233e28b0ec21589fb45a/iot/eth1: renewal time expired, rebinding
2024/09/14 21:47:45 Link "eth1" down. Attempting to set up
2024/09/14 21:47:45 98184616c91f15419f5cacd012697f85afaa2daeb5d3233e28b0ec21589fb45a/iot/eth1: lease rebound, expiration is 2024-09-14 22:47:45.309270751 +0800 CST m=+11730.048516519
```

According to https://datatracker.ietf.org/doc/html/rfc2131#section-4.3.6,
the following options must not be sent in a renew request:

- Requested IP Address
- Server Identifier
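
A minimal sketch of the rule above, using the option codes from RFC 2132. The `requestOptions` helper and the `clientState` type are hypothetical and independent of any DHCP library.

```go
package main

import "fmt"

// DHCP option codes as assigned in RFC 2132.
const (
	optRequestedIP = 50
	optMessageType = 53
	optServerID    = 54
)

// clientState is a hypothetical, two-value model of the client state.
type clientState int

const (
	selecting clientState = iota
	renewing
)

// requestOptions returns the option codes to place in a DHCPREQUEST.
// In the RENEWING state the client identifies its address through
// ciaddr, so options 50 and 54 are left out.
func requestOptions(s clientState) []int {
	opts := []int{optMessageType}
	if s == selecting {
		opts = append(opts, optRequestedIP, optServerID)
	}
	return opts
}

func main() {
	fmt.Println(requestOptions(selecting))
	fmt.Println(requestOptions(renewing))
}
```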

Since the upstream code has been inactive for 6 years,
we should switch to another dhcpv4 library.
The newly selected one is https://github.com/insomniacslk/dhcp.

Signed-off-by: Songmin Li <lisongmin@protonmail.com>
2024-10-14 17:42:30 +02:00
Lionel Jouin
93d197c455 VRF: Wait for the local/host routes to be added
Without waiting for the kernel to add the local/host routes
after an IP address is added to an interface,
routes that require those local/host routes
may fail. This caused flaky e2e tests, but could also
happen during the execution of the VRF plugin when the
IPv6 addresses were being re-added to the interface and
when the routes were being moved to the VRF table.
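
The waiting step can be sketched as a generic poll-with-timeout loop. The `waitFor` helper is hypothetical, and the check used in `main` is only a stand-in for "the kernel has installed the local/host route".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitFor polls check every interval until it returns true, returns an
// error, or the timeout elapses.
func waitFor(check func() (bool, error), interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		ok, err := check()
		if err != nil {
			return err
		}
		if ok {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for condition")
		}
		time.Sleep(interval)
	}
}

func main() {
	// Stand-in condition: it becomes true on the third poll.
	polls := 0
	err := waitFor(func() (bool, error) {
		polls++
		return polls >= 3, nil
	}, 10*time.Millisecond, time.Second)
	fmt.Println(polls, err)
}
```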

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-10-14 11:49:25 +02:00
h0nIg
c52e02bccf add problem hint
Signed-off-by: h0nIg <h0nIg@users.noreply.github.com>
2024-10-14 11:47:24 +02:00
h0nIg
d44bbf28af Revert "Merge pull request #921 from oOraph/dev/exclude_subnets_from_traffic_shapping2"
This reverts commit ef076afac1af0b9a8446f72e3343666567bc04dc, reversing
changes made to 597408952e3e7247fb0deef26a3a935c405aa0cf.

Signed-off-by: h0nIg <h0nIg@users.noreply.github.com>
2024-10-14 11:47:24 +02:00
h0nIg
8ad0361964 resolve merge conflicts
Signed-off-by: h0nIg <h0nIg@users.noreply.github.com>
2024-10-14 11:47:24 +02:00
Etienne Champetier
a4b80cc634 host-device: use temp network namespace for rename
Using a temporary name / doing a fast rename causes
some race conditions with udev and NetworkManager:
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/1599

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2024-10-02 10:30:27 +02:00
Gudmundur Bjarni Olafsson
3a49cff1f6 Fix txqueuelen being accidentally set to zero
TxQLen was unintentionally set to 0 due to a struct literal.
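
A sketch of this class of bug, assuming a sentinel of -1 means "leave the kernel default". The `linkAttrs` type and `newLinkAttrs` constructor are hypothetical stand-ins, not the plugin's code.

```go
package main

import "fmt"

// linkAttrs is a hypothetical stand-in for a link attribute struct in
// which TxQLen == -1 means "leave the kernel default" (an assumption
// made for this sketch).
type linkAttrs struct {
	Name   string
	MTU    int
	TxQLen int
}

// newLinkAttrs returns attributes with the "unset" sentinel applied.
func newLinkAttrs() linkAttrs {
	return linkAttrs{TxQLen: -1}
}

func main() {
	// A bare literal silently zeroes every field it does not mention.
	literal := linkAttrs{Name: "veth0", MTU: 1500}

	// Starting from the constructor keeps the sentinel.
	fromCtor := newLinkAttrs()
	fromCtor.Name = "veth0"
	fromCtor.MTU = 1500

	fmt.Println(literal.TxQLen, fromCtor.TxQLen) // prints "0 -1"
}
```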

Signed-off-by: Gudmundur Bjarni Olafsson <gudmundur.bjarni@gmail.com>
2024-10-02 10:01:11 +02:00
Lionel Jouin
c11ed48733 Ignore link-local routes in SBR tests
The tests were flaky because a route with the link-local IP was
automatically added after the test saved the initial state
(the routes before the SBR plugin is run). When the SBR plugin is run,
the new state is compared with the old state. The new state
then contains the route with the link-local IP (added
after the old state was saved), while the old state does not
contain it, so the tests failed.

The solution here is to ignore routes with the link-local IP
for the tests.
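
One possible shape for such a filter, using only the Go standard library. The `route` type and `withoutLinkLocal` helper are hypothetical and simpler than the test code.

```go
package main

import (
	"fmt"
	"net"
)

// route is a hypothetical, minimal route record for this sketch.
type route struct {
	Dst *net.IPNet
}

// withoutLinkLocal returns the routes whose destination is not a
// link-local unicast prefix (169.254.0.0/16 or fe80::/10).
func withoutLinkLocal(routes []route) []route {
	var kept []route
	for _, r := range routes {
		if r.Dst != nil && r.Dst.IP.IsLinkLocalUnicast() {
			continue
		}
		kept = append(kept, r)
	}
	return kept
}

// mustCIDR parses a CIDR string and panics on malformed input.
func mustCIDR(s string) *net.IPNet {
	_, ipn, err := net.ParseCIDR(s)
	if err != nil {
		panic(err)
	}
	return ipn
}

func main() {
	routes := []route{
		{Dst: mustCIDR("192.168.1.0/24")},
		{Dst: mustCIDR("fe80::/64")},
		{Dst: nil}, // default route
	}
	for _, r := range withoutLinkLocal(routes) {
		fmt.Println(r.Dst)
	}
}
```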

fixes: #1096

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-10-01 00:36:30 +02:00
Casey Callendrello
e5df283ab3 ci, go.mod: bump to go 1.23 (#1094)
* ci, go.mod: bump to go 1.23

Now that go.mod matches our go version, we can stop setting go version
in CI separately.

Signed-off-by: Casey Callendrello <c1@caseyc.net>

* minor: fix lint errors

Bumping golangci-lint to v1.61 introduced some new reasonable checks;
fix the errors they found.

Signed-off-by: Casey Callendrello <c1@caseyc.net>

* ci: bump golangci-lint to v1.61.0

Also, fix some deprecated config directives.

Signed-off-by: Casey Callendrello <c1@caseyc.net>

---------

Signed-off-by: Casey Callendrello <c1@caseyc.net>
2024-09-17 12:28:55 +02:00
Songmin Li
cc8b1bd80c dhcp: Add priority option to dhcp.
Currently, we cannot set the metric of routes in dhcp.
That is fine if there is only one network interface.

But if there are multiple network interfaces and each has a default route,
we need to set the metric of the route so that traffic
goes through the correct network interface.

For host-local and static, we can set the metric with the route.priority option.
But there is no such option for dhcp.

Signed-off-by: Songmin Li <lisongmin@protonmail.com>
2024-09-17 11:47:37 +02:00
Dan Winship
01a94e17c7 Add nftables backend to portmap
Signed-off-by: Dan Winship <danwinship@redhat.com>
2024-09-16 21:17:49 +02:00
Dan Winship
3d1968c152 Fix portmap unit tests
Use `conditionsV4` and `conditionsV6` values that actually look like
valid iptables conditions.

Signed-off-by: Dan Winship <danwinship@redhat.com>
2024-09-16 21:17:49 +02:00
Dan Winship
a3ccebc6ec Add a backend abstraction to the portmap plugin
Signed-off-by: Dan Winship <danwinship@redhat.com>
2024-09-16 21:17:49 +02:00
Dan Winship
61d078645a Add nftables implementation of ipmasq
Signed-off-by: Dan Winship <danwinship@redhat.com>
2024-09-16 21:17:49 +02:00
Lionel Jouin
01b3db8e01 SBR: option to pass the table id (#1088)
* Use of Table ID in IPAM

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>

* SBR: option to pass the table id

Using the option to set the table number in the SBR meta plugin will
create a policy route for each IP added for the interface returned by
the main plugin.
Unlike the default behavior, the routes will not be moved to the table.
The default behavior of the SBR plugin is kept if the table id is not set.

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>

---------

Signed-off-by: Lionel Jouin <lionel.jouin@est.tech>
2024-09-09 17:07:23 +02:00
Dan Winship
06ba001d84 Update containernetworking/cni to v1.2.3 for GC
Signed-off-by: Dan Winship <danwinship@redhat.com>
2024-08-28 12:17:48 -04:00
Etienne Champetier
bdb6814fe2 macvlan: add bcqueuelen setting
This setting was introduced in Linux 5.11
d4bff72c84
42f5642a40

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2024-08-27 09:21:29 -04:00
Casey Callendrello
0d2780f0e7 Merge branch 'main' into main
2024-08-27 10:20:16 +02:00
Etienne Champetier
d924f05e12 build: update github.com/vishvananda/netlink to 1.3.0
This includes a breaking change:
acdc658b86
route.Dst is now a zero IPNet instead of nil
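
Code that previously tested `route.Dst == nil` to spot a default route may need to accept both forms. A minimal sketch, assuming "zero IPNet" means an all-zero prefix such as 0.0.0.0/0 or ::/0; the `isDefaultDst` helper is hypothetical.

```go
package main

import (
	"fmt"
	"net"
)

// isDefaultDst reports whether dst denotes a default route, accepting
// both the nil form and the all-zero prefix form (0.0.0.0/0 or ::/0).
func isDefaultDst(dst *net.IPNet) bool {
	if dst == nil {
		return true
	}
	// Size returns (0, 0) for a non-canonical mask, so require bits != 0.
	ones, bits := dst.Mask.Size()
	return bits != 0 && ones == 0 && dst.IP.IsUnspecified()
}

// mustCIDR parses a CIDR string and panics on malformed input.
func mustCIDR(s string) *net.IPNet {
	_, ipn, err := net.ParseCIDR(s)
	if err != nil {
		panic(err)
	}
	return ipn
}

func main() {
	fmt.Println(isDefaultDst(nil))
	fmt.Println(isDefaultDst(mustCIDR("0.0.0.0/0")))
	fmt.Println(isDefaultDst(mustCIDR("::/0")))
	fmt.Println(isDefaultDst(mustCIDR("10.0.0.0/8")))
}
```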

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2024-08-26 14:27:30 -04:00
Songmin Li
6269f399a5 Fix unnecessary retrying when the link is down in dhcp.
From the dhcp daemon log, we can see that dhcp fails to acquire
the lease when the link is down, and succeeds on retry.

```
2024/08/21 21:30:44 macvlan-dhcp/eth1: acquiring lease
2024/08/21 21:30:44 Link "eth1" down. Attempting to set up
2024/08/21 21:30:44 network is down
2024/08/21 21:30:44 retrying in 2.641696 seconds
2024/08/21 21:30:49 macvlan-dhcp/eth1: lease acquired, expiration is 2024-08-22 09:30:49.755367962 +0800 CST m=+43205.712107889
```

After moving the link setup code to the beginning of the function,
dhcp succeeds on the first attempt.

```
2024/08/21 22:04:02 macvlan-dhcp/eth1: acquiring lease
2024/08/21 22:04:02 Link "eth1" down. Attempting to set up
2024/08/21 22:04:05 macvlan-dhcp/eth1: lease acquired, expiration is 2024-08-22 10:04:05.297887726 +0800 CST m=+43203.081141304
```

Signed-off-by: Songmin Li <lisongmin@protonmail.com>
2024-08-24 19:54:34 +08:00
guangwu
ada798a3f7 fix: close resolv.conf
Signed-off-by: guoguangwu <guoguangwug@gmail.com>
2024-05-08 20:38:15 +08:00
Tomofumi Hayashi
ccc1cfaa58 Simplify unit test
Signed-off-by: Tomofumi Hayashi <tohayash@redhat.com>
2024-04-08 15:42:20 +09:00
Raphael
78ebd8bfb9 minor case change
even though JSON unmarshalling in Go with the standard library is case-insensitive regarding the keys

Signed-off-by: Raphael <oOraph@users.noreply.github.com>
2024-04-08 15:39:47 +09:00
Raphael
c666d1400d bandwidth plugin: split unit tests in several files
Signed-off-by: Raphael <oOraph@users.noreply.github.com>
2024-04-08 15:39:47 +09:00
Raphael
ab0b386b4e bandwidth: possibility to specify shaped subnets or to exclude some from shaping
Signed-off-by: Raphael <oOraph@users.noreply.github.com>
2024-04-08 15:39:47 +09:00
Raphael
52da39d3aa bandwidth: possibility to exclude some subnets from traffic shaping
what changed:

we had to refactor the bandwidth plugin and switch from a classless qdisc (tbf)
to a classful qdisc (htb).

subnets are to be provided in config or runtimeconfig just like other parameters

unit and integration tests were also adapted in consequence

unrelated changes:

test fixes: the most important tests were just silently skipped due to ginkgo Measure deprecation
(the ones actually checking the effectiveness of the traffic control)

Signed-off-by: Raphael <oOraph@users.noreply.github.com>
2024-04-08 15:39:46 +09:00
Tomofumi Hayashi
9f1bf2a848 Merge branch 'main' into support-sf
2024-03-12 20:51:56 +09:00
adrianc
ba5bdafe5d Use temporary name for netdevice when moving in/out of NS
Today, it is not possible to use host-device CNI to move a
host device to container namespace if a device already exists
in that namespace.

For example, when a delegate plugin (such as multus) is used to provide
multiple networks to a container, the CNI Add call will fail if
the targeted host device name already exists in the container network
namespace.

To overcome this, we use a temporary name for the interface before
moving it in/out of the container network namespace.
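
A sketch of how a collision-resistant temporary name could be generated. The `tempIfaceName` helper and the "temp_" prefix are hypothetical; the only hard constraint assumed here is the Linux limit of 15 characters for an interface name.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// maxIfaceNameLen is the longest interface name Linux accepts
// (IFNAMSIZ is 16, including the terminating NUL).
const maxIfaceNameLen = 15

// tempIfaceName is a hypothetical helper that builds a random name
// unlikely to collide with an interface already in the target namespace.
func tempIfaceName() (string, error) {
	buf := make([]byte, 5)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	// 5 prefix characters + 10 hex characters = 15 characters.
	return "temp_" + hex.EncodeToString(buf), nil
}

func main() {
	name, err := tempIfaceName()
	fmt.Println(name, len(name) <= maxIfaceNameLen, err)
}
```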

Signed-off-by: adrianc <adrianc@nvidia.com>
2024-03-12 12:25:23 +02:00
adrianc
d34720b531 Support DeviceID on Auxiliary Bus
Device plugins may allocate network device on a bus
different than PCI.

sriov-network-device-plugin supports the allocation
of network devices over Auxiliary bus[1][2][3].

extend host-device CNI to support such devices if provided
through runtime config.

- Check if device provided by DeviceID runtime config
  is present on either PCI bus or Auxiliary bus
- extend getLink method to support getting netdev link obj
  from auxiliary bus
- add unit-test to cover the new flow

[1] https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin/tree/master?tab=readme-ov-file#auxiliary-network-devices-selectors
[2] https://github.com/k8snetworkplumbingwg/sriov-network-device-plugin/tree/master/docs/subfunctions
[3] https://docs.kernel.org/networking/devlink/devlink-port.html

Signed-off-by: adrianc <adrianc@nvidia.com>
2024-03-12 12:09:29 +02:00
Austin Vazquez
9c016b5d12 Rename unused variables to resolve lint warnings
Signed-off-by: Austin Vazquez <macedonv@amazon.com>
2024-03-11 17:52:02 +01:00
Or Mergi
7e131a0076 bridge: Enable disabling bridge interface
The new `disableContainerInterface` parameter is added to the bridge plugin to
enable setting the container interface state down.

When the parameter is enabled, the container interface (the veth peer that is placed
in the container ns) remains down (i.e., disabled).
The bridge and host peer interface states are not affected by the parameter.

Since IPAM logic involves various configurations, including waiting for addresses
to be realized and setting the interface state UP, the new parameter cannot work
with IPAM.
In case both IPAM and DisableContainerInterface parameters are set, the bridge
plugin will raise an error.
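
The mutual exclusion can be sketched as a config validation step. The `netConf` struct, the `validate` helper and the error text are hypothetical; only the `disableContainerInterface` key comes from this message.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// netConf is a hypothetical, trimmed-down bridge configuration.
type netConf struct {
	DisableContainerInterface bool `json:"disableContainerInterface"`
	IPAM                      struct {
		Type string `json:"type"`
	} `json:"ipam"`
}

// validate rejects configurations that both disable the container
// interface and request IPAM, since IPAM needs the interface UP.
func validate(data []byte) error {
	var c netConf
	if err := json.Unmarshal(data, &c); err != nil {
		return err
	}
	if c.DisableContainerInterface && c.IPAM.Type != "" {
		return errors.New("cannot use IPAM when the container interface is disabled")
	}
	return nil
}

func main() {
	fmt.Println(validate([]byte(`{"disableContainerInterface":true}`)))
	fmt.Println(validate([]byte(`{"disableContainerInterface":true,"ipam":{"type":"host-local"}}`)))
}
```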

Signed-off-by: Or Mergi <ormergi@redhat.com>
2024-01-10 15:35:23 +02:00
Casey Callendrello
abee8ccc0d Merge pull request #954 from cyclinder/improve_cmd_del
macvlan cmdDel: replace the loadConf function with json.unmarshal
2023-11-16 19:06:11 +01:00
arthur-zhang
f90ac41ae4 revert some code in pr 962
Signed-off-by: arthur-zhang <zhangya_no1@qq.com>
2023-11-14 10:04:18 +08:00
Tomofumi Hayashi
00406f9d1e Merge branch 'main' into fix/ndisc_ipvlan
2023-11-14 08:18:07 +09:00
arthur-zhang
5280b4d582 bridge: fix spelling
Signed-off-by: arthur-zhang <zhangya_no1@qq.com>
2023-11-13 17:11:21 +01:00
arthur-zhang
495a2cbb0c bridge: remove useless firstV4Addr
Signed-off-by: arthur-zhang <zhangya_no1@qq.com>
2023-11-13 17:11:21 +01:00
arthur-zhang
8c59fc1eea bridge: remove useless check
gws.defaultRouteFound here is always false.

Signed-off-by: arthur-zhang <zhangya_no1@qq.com>
2023-11-13 17:11:21 +01:00
Tomofumi Hayashi
1079e113fe Add ndisc_notify in ipvlan for ipv6 ndp
Signed-off-by: Tomofumi Hayashi <tohayash@redhat.com>
2023-11-14 01:07:59 +09:00
Zenghui Shi
999ca15763 macvlan: enable ipv6 ndisc_notify
Signed-off-by: Zenghui Shi <zshi@redhat.com>
2023-11-07 19:43:50 +08:00
cyclinder
845ef62b74 macvlan cmdDel: replace the loadConf function with json.unmarshal
When the master interface on the node has been deleted and loadConf tries
to get its MTU, cmdDel returns a linkNotFound error to the
runtime. cmdDel only needs to unmarshal the netConf; there is no need to
get the MTU. So we replaced the loadConf function with
json.unmarshal in cmdDel.
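
A simplified model of the two paths. The `netConf` fields, the `lookupMTU` callback and both function bodies are hypothetical stand-ins for the plugin code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// netConf is a hypothetical, trimmed-down macvlan configuration.
type netConf struct {
	Name   string `json:"name"`
	Master string `json:"master"`
	MTU    int    `json:"mtu"`
}

// loadConf models the ADD-style path: it decodes the config and then
// queries the master interface, so it fails once the master is gone.
func loadConf(data []byte, lookupMTU func(master string) (int, error)) (*netConf, error) {
	n := &netConf{}
	if err := json.Unmarshal(data, n); err != nil {
		return nil, err
	}
	mtu, err := lookupMTU(n.Master)
	if err != nil {
		return nil, err
	}
	if n.MTU == 0 {
		n.MTU = mtu
	}
	return n, nil
}

// parseForDel models the DEL path after the change: decode only.
func parseForDel(data []byte) (*netConf, error) {
	n := &netConf{}
	if err := json.Unmarshal(data, n); err != nil {
		return nil, err
	}
	return n, nil
}

// masterGone simulates a lookup on a deleted master interface.
func masterGone(string) (int, error) {
	return 0, errors.New("link not found")
}

func main() {
	conf := []byte(`{"name":"net1","master":"eth1"}`)
	_, err := loadConf(conf, masterGone)
	fmt.Println("loadConf:", err)
	n, err := parseForDel(conf)
	fmt.Println("parseForDel:", n.Master, err)
}
```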

Signed-off-by: cyclinder <qifeng.guo@daocloud.io>
2023-10-17 10:26:18 +08:00
Riccardo Ravaioli
33ccedc66f Create IPAM files with 0600 permissions
Conform to CIS Benchmarks "1.1.9 Ensure that the Container Network Interface file permissions are set to 600 or more restrictive"
https://www.tenable.com/audits/items/CIS_Kubernetes_v1.20_v1.0.1_Level_1_Master.audit:f1717a5dd65d498074dd41c4a639e47d
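
A minimal sketch of writing a file with owner-only permissions. The `writeReservation` helper, the file layout and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeReservation is a hypothetical helper that stores a container ID
// in a file named after the reserved IP, readable and writable by the
// owner only. The mode applies when the file is created; an existing
// file keeps its current permissions.
func writeReservation(dir, ip, containerID string) (string, error) {
	path := filepath.Join(dir, ip)
	if err := os.WriteFile(path, []byte(containerID), 0o600); err != nil {
		return "", err
	}
	return path, nil
}

// demo writes one reservation into a temporary directory and returns
// the permission bits of the resulting file.
func demo() (os.FileMode, error) {
	dir, err := os.MkdirTemp("", "ipam-sketch")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path, err := writeReservation(dir, "10.1.0.5", "example-container-id")
	if err != nil {
		return 0, err
	}
	info, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Mode().Perm(), nil
}

func main() {
	perm, err := demo()
	fmt.Printf("%o %v\n", perm, err)
}
```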

Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>
2023-10-02 11:59:31 +02:00
Casey Callendrello
9d9ec6e3e1 Merge pull request #927 from sockmister/vrf_filter_fix
vrf: fix route filter to use output iface
2023-07-21 13:49:33 +02:00
Casey Callendrello
8fd63065a6 Merge pull request #913 from AlinaSecret/dhcp/fix-race-test
Fix race conditions in DHCP test
2023-07-21 12:55:01 +02:00
Poh Chiat Koh
c1a7948b19 vrf: fix route filter to use output iface
The current route filter uses RT_FILTER_IIF in conjunction with LinkIndex.
This combination is ignored by netlink, rendering the filter
ineffective.

Signed-off-by: Poh Chiat Koh <poh@inter.link>
2023-07-21 12:50:21 +02:00