107 Commits

Author SHA1 Message Date
selansen
8b8825bcd8 V2 API support for win-overlay CNI
This PR bring V2 API support into win-overlay CNI. With the current V1
API, only docker runtime works for win-overlay. By bringing new changes, we
should be able to use containerd as the runtime.Below are the key
points regarding this implementation.
	1. Clear seperation for V1 & V2 API support
	2. New cni.conf sample that works for win-overlay

Signed-off-by: selansen <esiva@redhat.com>
Signed-off-by: mansikulkarni96 <mankulka@redhat.com>
2022-04-14 12:44:49 -04:00
Sang Heon Lee
dca23ad451 portmap: fix bug that new udp connection deletes all existing conntrack entries
Calling AddPort before AddProtocol returns an error, which means ConntrackDeleteFilter has been called without port filter.

Signed-off-by: Sang Heon Lee <developistBV@gmail.com>
2022-02-19 14:34:43 +09:00
Akihiro Suda
22dd6c553d
firewall: support ingressPolicy=(open|same-bridge) for isolating bridges as in Docker
This commit adds a new parameter `ingressPolicy` (`string`) to the `firewall` plugin.
The supported values are `open` and `same-bridge`.

- `open` is the default and does NOP.

- `same-bridge` creates "CNI-ISOLATION-STAGE-1" and "CNI-ISOLATION-STAGE-2"
that are similar to Docker libnetwork's "DOCKER-ISOLATION-STAGE-1" and
"DOCKER-ISOLATION-STAGE-2" rules.

e.g., when `ns1` and `ns2` are connected to bridge `cni1`, and `ns3` is
connected to bridge `cni2`, the `same-bridge` ingress policy disallows
communications between `ns1` and `ns3`, while allowing communications
between `ns1` and `ns2`.

Please refer to the comment lines in `ingresspolicy.go` for the actual iptables rules.

The `same-bridge` ingress policy is expected to be used in conjunction
with `bridge` plugin. May not work as expected with other "main" plugins.

It should be also noted that the `same-bridge` ingress policy executes
raw `iptables` commands directly, even when the `backend` is set to `firewalld`.
We could potentially use the "direct" API of firewalld [1] to execute
iptables via firewalld, but it doesn't seem to have a clear benefit over just directly
executing raw iptables commands.
(Anyway, we have been already executing raw iptables commands in the `portmap` plugin)

[1] https://firewalld.org/documentation/direct/options.html

This commit replaces the `isolation` plugin proposal (issue 573, PR 574).
The design of `ingressPolicy` was discussed in the comments of the withdrawn PR 574 ,
but `same-network` was renamed to `same-bridge` then.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2022-02-03 15:49:43 +09:00
Tobias Klauser
9649ec14f5
pkg/ns: use file system magic numbers from golang.org/x/sys/unix
Use the constants already defined in the golang.org/x/sys/unix package
instead of open-coding them.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
2022-01-20 12:43:20 +01:00
Casey Callendrello
0c12d8a1c8 gofmt
Signed-off-by: Casey Callendrello <cdc@redhat.com>
2022-01-19 18:25:39 +01:00
Michael Zappa
5d073d690c plugins: replace arping package with arp_notify
this replaces the arping package with the linux arp_notify feature.

Resolves: #588
Signed-off-by: Michael Zappa <Michael.Zappa@stateless.net>
2022-01-06 20:53:54 -07:00
Matt Dupre
f1f128e3c9
Merge pull request #639 from EdDev/bridge-macspoofchk
bridge: Add macspoofchk support
2021-10-06 08:39:10 -07:00
Dan Williams
2a9114d1af
Merge pull request #665 from edef1c/filepath-clean
Don't redundantly filepath.Clean the output of filepath.Join
2021-09-29 10:35:48 -05:00
edef
ceb34eb2e6 Don't redundantly filepath.Clean the output of filepath.Join
filepath.Join is already specified to clean its output,
and the implementation indeed does so.

Signed-off-by: edef <edef@edef.eu>
2021-09-17 14:12:46 +00:00
edef
90c018566c Use crypto/rand.Read, not crypto.Reader.Read
The current code accidentally ignores partial reads, since it doesn't
check the return value of (io.Reader).Read.

What we actually want is io.ReadFull(rand.Reader, buf), which is
conveniently provided by rand.Read(buf).

Signed-off-by: edef <edef@edef.eu>
2021-09-17 13:30:14 +00:00
Edward Haas
081ed44a1d bridge: Add macspoofchk support
The new macspoofchk field is added to the bridge plugin to support
anti-mac-spoofing.
When the parameter is enabled, traffic is limited to the mac addresses
of the container interface (the veth peer that is placed in the
container ns).
Any traffic that exits the pod is checked against the source mac address
that is expected. If the mac address is different, the frames are
dropped.

The implementation is using nftables and should only be used on nodes
that support it.

Signed-off-by: Edward Haas <edwardh@redhat.com>
2021-09-14 12:46:15 +03:00
Dan Williams
a49f908168 ip: place veth peer in host namspace directly
Instead of moving the host side of the veth peer into the host
network namespace later, just create it in the host namespace
directly.

Signed-off-by: Dan Williams <dcbw@redhat.com>
2021-07-21 09:59:11 -05:00
Dan Williams
f14ff6687a
Merge pull request #636 from EdDev/bridge-mac-specification
bridge: Add mac field to specify container iface mac
2021-06-30 10:57:09 -05:00
Edward Haas
a3cde17fc0 bridge: Add mac field to specify container iface mac
Controlling the mac address of the interface (veth peer) in the
container is useful for functionalities that depend on the mac address.
Examples range from dynamic IP allocations based on an identifier (the
mac) and up to firewall rules (e.g. no-mac-spoofing).

Enforcing a mac address at an early stage and not through a chained
plugin assures the configuration does not have wrong intermediate
configuration. This is especially critical when a dynamic IP may be
provided already in this period.
But it also has implications for future abilities that may land on the
bridge plugin, e.g. supporting no-mac-spoofing.

The field name used (`mac`) fits with other plugins which control the
mac address of the container interface.

The mac address may be specified through the following methods:
- CNI_ARGS
- Args
- RuntimeConfig [1]

The list is ordered by priority, from lowest to higher. The higher
priority method overrides any previous settings.
(e.g. if the mac is specified in RuntimeConfig, it will override any
specifications of the mac mentioned in CNI_ARGS or Args)

[1] To use RuntimeConfig, the network configuration should include the
`capabilities` field with `mac` specified (`"capabilities": {"mac": true}`).

Signed-off-by: Edward Haas <edwardh@redhat.com>
2021-06-29 10:50:19 +03:00
Casey Callendrello
2876cd5476
Merge pull request #635 from EdDev/cleanup-hwaddr-util
Cleanup unused code
2021-06-16 17:42:31 +02:00
Edward Haas
2f9917ebed utils, hwaddr: Remove unused package
Signed-off-by: Edward Haas <edwardh@redhat.com>
2021-06-07 16:22:31 +03:00
Edward Haas
272f15420d ip, link_linux: Remove unused SetHWAddrByIP function
Signed-off-by: Edward Haas <edwardh@redhat.com>
2021-06-07 15:59:41 +03:00
thxcode
4b180a9d9c refactor(win-bridge): netconf
- support v2 api
- unify v1 and v2 api

BREAKING CHANGE:
- remove `HcnPolicyArgs` field
- merge `HcnPolicyArgs` into `Policies` field

Signed-off-by: thxcode <thxcode0824@gmail.com>
2021-05-27 23:49:16 +08:00
thxcode
9215e60986 refactor(win-bridge): hcn api processing
Signed-off-by: thxcode <thxcode0824@gmail.com>
2021-05-27 23:14:11 +08:00
thxcode
93a55036b1 refactor(win-bridge): hns api processing
Signed-off-by: thxcode <thxcode0824@gmail.com>
2021-05-27 23:14:11 +08:00
thxcode
aa8c8c1489 chore(win-bridge): location related
- group functions by HNS,HCN

Signed-off-by: thxcode <thxcode0824@gmail.com>
2021-05-27 23:14:11 +08:00
thxcode
ec75bb8587 chore(win-bridge): text related
- format function names
- add/remove comments
- adjust message of error

Signed-off-by: thxcode <thxcode0824@gmail.com>
2021-05-27 23:14:11 +08:00
Casey Callendrello
8de0287741
Merge pull request #615 from mccv1r0/pr602
Allow multiple routes to be added for the same prefix
2021-05-05 11:33:55 -04:00
Michael Cambria
76ef07ebc6 Allow multiple routes to be added for the same prefix.
Enables ECMP

Signed-off-by: Michael Cambria <mcambria@redhat.com>
2021-05-05 11:20:10 -04:00
Bruce Ma
7da1c84919 pkg/ip: introduce a new type IP to support formated <ip>[/<prefix>]
Signed-off-by: Bruce Ma <brucema19901024@gmail.com>
2021-04-13 17:53:43 +08:00
Dan Williams
d385120175
Merge pull request #537 from dcbw/100
Port plugins to CNI 1.0.0 and increase old verison test coverage
2021-03-03 10:51:56 -06:00
David Verbeiren
9b09f167bb pkg/ipam: use slash as sysctl separator so interface name can have dot
A dot is a valid character in interface names and is often used in the
names of VLAN interfaces. The sysctl net.ipv6.conf.<ifname>.disable_ipv6
key path cannot use dots both in the ifname and as path separator.
We switch to using / as key path separator so dots are allowed in the
ifname.

This works because sysctl.Sysctl() accepts key paths with either dots
or slashes as separators.

Also, print error message to stderr in case sysctl cannot be read
instead of silently hiding the error.

Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
2021-02-22 15:54:03 +01:00
Dan Williams
8c25db87b1 testutils: add test utilities for spec version features
Signed-off-by: Dan Williams <dcbw@redhat.com>
2021-02-11 23:27:08 -06:00
Dan Williams
7d8c767622 plugins: update to spec version 1.0.0
Signed-off-by: Dan Williams <dcbw@redhat.com>
2021-02-11 23:27:08 -06:00
Casey Callendrello
3ae85c1093 pkg/ns: fix test case to tolerate pids going away.
Signed-off-by: Casey Callendrello <cdc@redhat.com>
2020-12-09 17:46:29 +01:00
Antonio Ojea
108c2aebd4 portmap plugin should flush previous udp connections
conntrack does not have any way to track UDP connections, so
it relies on timers to delete a connection.
The problem is that UDP is connectionless, so a client will keep
sending traffic despite the server has gone, thus renewing the
conntrack entries.
Pods that use portmaps to expose UDP services need to flush the existing
conntrack entries on the port exposed when they are created,
otherwise conntrack will keep sending the traffic to the previous IP
until the connection age (the client stops sending traffic)

Signed-off-by: Antonio Ojea <aojea@redhat.com>
2020-11-23 16:29:52 +01:00
Federico Paolinelli
8a6e96b3f0 Replace nc with the local echo client.
This makes the behaviour more consistent across platforms.

Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
2020-10-07 20:13:24 +02:00
Federico Paolinelli
322790226b Add an echo client to be used instead of nc.
nc behaviour depends on the implementation version of what's on the current host.
Here we use our own client with stable behaviour.

Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
2020-10-02 15:56:27 +02:00
Bryan Boreham
1ea19f9213 Remove extraneous test file in Windows plugin
We already have a function to run all tests in the package, in netconf_suite_windows_test.go

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2020-09-09 16:12:54 +00:00
Quan Tian
799d3cbf4c Fix race condition in GetCurrentNS
In GetCurrentNS, If there is a context-switch between
getCurrentThreadNetNSPath and GetNS, another goroutine may execute in
the original thread and change its network namespace, then the original
goroutine would get the updated network namespace, which could lead to
unexpected behavior, especially when GetCurrentNS is used to get the
host network namespace in netNS.Do.

The added test has a chance to reproduce it with "-count=50".

The patch fixes it by locking the thread in GetCurrentNS.

Signed-off-by: Quan Tian <qtian@vmware.com>
2020-08-21 13:05:21 +08:00
Casey Callendrello
219eb9e046 ptp, bridge: disable accept_ra on the host-side interface
The interface plugins should have absolute control over their addressing
and routing.

Signed-off-by: Casey Callendrello <cdc@redhat.com>
2020-05-12 15:54:23 +02:00
Vincent Boulineau
2d2583ee33
win-bridge: add support for portMappings capability
If the pluging receives portMappings in runtimeConfig, the pluing will add a NAT policy for each port mapping on the generated endpoints.
It enables HostPort usage on Windows with win-bridge.

Signed-off-by: Vincent Boulineau <vincent.boulineau@datadoghq.com>
2020-04-15 15:01:32 +02:00
Bruce Ma
486ef96e6f [DO NOT REVIEW] vendor upate to remove useless dependencies
Signed-off-by: Bruce Ma <brucema19901024@gmail.com>
2020-03-17 14:30:28 +08:00
Bruce Ma
8a0e3fe10e build error utility package to replace juju/errors
Signed-off-by: Bruce Ma <brucema19901024@gmail.com>
2020-03-11 20:33:21 +08:00
Ihar Hrachyshka
112288ecb2 Unlock OS thread after netns is restored
The current ns package code is very careful about not leaving the calling
thread with the overridden namespace set, for example when origns.Set() fails.
This is achieved by starting a new green thread, locking its OS thread, and
never unlocking it. Which makes golang runtime to scrap the OS thread backing
the green thread after the go routine exits.

While this works, it's probably not as optimal: stopping and starting a new OS
thread is expensive and may be avoided if we unlock the thread after resetting
network namespace to the original. On the other hand, if resetting fails, it's
better to leave the thread locked and die.

While it won't work in all cases, we can still make an attempt to reuse the OS
thread when resetting the namespace succeeds. This can be achieved by unlocking
the thread conditionally to the namespace reset success.

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
2020-02-20 17:24:36 -05:00
Piotr Skamruk
f5c3d1b1ba
Merge pull request #443 from mars1024/bugfix/black_box_test
pkg/utils: sysctl package should use black-box testing
2020-01-29 17:26:04 +01:00
Bruce Ma
2ff84a481e pkg/ip: use type cast instead of untrusty error message
Signed-off-by: Bruce Ma <brucema19901024@gmail.com>
2020-01-29 20:03:15 +08:00
Bruce Ma
37207f05b4 pkg/utils: sysctl package should use black-box testing
Signed-off-by: Bruce Ma <brucema19901024@gmail.com>
2020-01-27 21:09:04 +08:00
Jaime Caamaño Ruiz
963aaf86e6 Format with gofmt
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
2020-01-13 19:44:40 +01:00
Jaime Caamaño Ruiz
cd9d6b28da Use Replace instead of ReplaceAll
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
2020-01-13 16:50:13 +01:00
Jaime Caamaño Ruiz
0452c1dd10 Fix copyrights
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
2020-01-13 14:56:58 +01:00
Jaime Caamaño Ruiz
d671d29ad5 Improve support of sysctl name seprators
Sysctl names can use dots or slashes as separator:

- if dots are used, dots and slashes are interchanged.
- if slashes are used, slashes and dots are left intact.

Separator in use is determined by firt ocurrence.

Reference: http://man7.org/linux/man-pages/man5/sysctl.d.5.html

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
2020-01-13 14:40:42 +01:00
Antonio Ojea
bf8f171041 iptables: add idempotent functions
Add the following idempotent functions to iptables utils:

DeleteRule: idempotently delete an iptables rule
DeleteChain: idempotently delete an iptables chain
ClearChain: idempotently flush an iptables chain

Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
2019-12-12 15:13:15 +01:00
Tim Gross
58dd90b996 ensure iptables chain creation is idempotent
Concurrent use of the `portmap` and `firewall` plugins can result in
errors during iptables chain creation:

- The `portmap` plugin has a time-of-check-time-of-use race where it
  checks for existence of the chain but the operation isn't atomic.
- The `firewall` plugin doesn't check for existing chains and just
  returns an error.

This commit makes both operations idempotent by creating the chain and
then discarding the error if it's caused by the chain already
existing. It also factors the chain creation out into `pkg/utils` as a
site for future refactoring work.

Signed-off-by: Tim Gross <tim@0x74696d.com>
2019-11-11 10:00:11 -05:00
Giuseppe Scrivano
85083ea434
testutils: newNS() works in a rootless user namespace
When running in a user namespace created by an unprivileged user the
owner of /var/run will be reported as the unknown user (as defined in
/proc/sys/kernel/overflowuid) so any access to the directory will
fail.

If the XDG_RUNTIME_DIR environment variable is set, check whether the
current user is also the owner of /var/run.  If the owner is different
than the current user, use the $XDG_RUNTIME_DIR/netns directory.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2019-10-19 12:04:53 +02:00