This commit would make host-device plugin as a placeholder
for DPDK device when applications wants to attach it with
a pod container through network attachment definition.
Signed-off-by: Periyasamy Palanisamy <periyasamy.palanisamy@est.tech>
Eventually the timeout value will become a CLI argument
The default timeout was nestled all the way in the lease constructor
This commit is the first step in making the timeout configurable by
moving it to the DHCPLease constructor
Signed-off-by: toby lorne <toby@toby.codes>
Closes #544
The above issue describes a situation where using the bridge plugin
with IPv6 addresses prevented `DEL` from working correctly.
`DEL` seems to be failing in the body of `TeardownIPMasq`
This arises because:
* twice delete postrouting rules: `ipn.String()` `ipn.IP.String()` #279
* we are using a version of go-iptables which is bugged for v6
PR github.com/coreos/go-iptables/pull/74 describes why this does
not work. The error message is not being checked correctly.
Using a later version of go-iptables means that
* when the second `ipt.Delete` fails (this is okay)
* we will correctly interpret this as an non-fatal error
* `TeardownIPMasq` will not prematurely exit the method
* `ipt.ClearChain` now can run
* `ipt.DeleteChain` now can run
This explains why this was working for v4 but not v6
This commit was amended to include v0.5.0 instead of a pseudo-version
v0.4.6-0.20200318170312-12696f5c9108
Signed-off-by: toby lorne <toby@toby.codes>
Values changed by Tuning plugin should be changed only for pod, therefore should be reverted when NIC is being moved from pod back to host.
Fixes: #493
Signed-off-by: Patryk Strusiewicz-Surmacki <patrykx.strusiewicz-surmacki@intel.com>
Instead of checking the total number of addresses, which might vary
depending on the IPv6 Privacy Address settings of the distro being
used, just check that we have the number of non-link-local addresses
we expect.
Signed-off-by: Dan Williams <dcbw@redhat.com>
conntrack does not have any way to track UDP connections, so
it relies on timers to delete a connection.
The problem is that UDP is connectionless, so a client will keep
sending traffic despite the server has gone, thus renewing the
conntrack entries.
Pods that use portmaps to expose UDP services need to flush the existing
conntrack entries on the port exposed when they are created,
otherwise conntrack will keep sending the traffic to the previous IP
until the connection age (the client stops sending traffic)
Signed-off-by: Antonio Ojea <aojea@redhat.com>
The current cni config has an extra comma and cannot be parsed normally, the kubelet will report an error as follows:
"Error loading CNI config file: error parsing configuration: invalid character '}' looking for beginning of object key string"
Signed-off-by: xieyanker <xjsisnice@gmail.com>
The e2e tests already covers both versions, and since the plugin is
meant to be used in chains, this will augment the scope of the plugins
it can be used with.
Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
When specified from the user, the VRF will get assigned to the given
tableid instead of having the CNI to choose for a free one.
Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
The new tests expand coverage, checking deletion, ip address handling,
0.4.0 compatibility, behaviour in case of multiple vrfs.
Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
This plugin allows to create a VRF with the given name (or use the existing
one if any) in the target namespace, and to allocate the interface
to it.
VRFs make it possible to use multiple routing tables on the same namespace and
allows isolation among interfaces within the same namespace. On top of that, this
allow different interfaces to have overlapping CIDRs (or even addresses).
This is only useful in addition to other plugins.
The configuration is pretty simple and looks like:
{
"type": "vrf",
"vrfname": "blue"
}
Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
nc behaviour depends on the implementation version of what's on the current host.
Here we use our own client with stable behaviour.
Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
VRF support was introduced in ubuntu bionic, while it's missing in Xenial.
This also introduces a change in the behaviour of nc command.
On one hand, it requires a new line to send the buffer on the other side,
on the other it hangs waiting for new input.
To address this, a timeout was introduced to avoid the tests to hang,
plus the buffer sent is terminated with a new line character.
Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
This change allows providing an 'ipam' section as part of the
input network configuration for flannel. It is then used as
basis to construct the ipam parameters provided to the delegate.
All parameters from the input ipam are preserved except:
* 'subnet' which is set to the flannel host subnet
* 'routes' which is complemented by a route to the flannel
network.
One use case of this feature is to allow adding back the routes
to the cluster services and/or to the hosts (HostPort) when
using isDefaultGateway=false. In that case, the bridge plugin
does not install a default route and, as a result, only pod-to-pod
connectivity would be available.
Example:
{
"name": "cbr0",
"cniVersion": "0.3.1",
"type": "flannel",
"ipam": {
"routes": [
{
"dst": "192.168.242.0/24"
},
{
"dst": "10.96.0.0/12"
}
],
"unknown-param": "value"
},
"delegate": {
"hairpinMode": true,
"isDefaultGateway": false
}
...
}
This results in the following 'ipam' being provided to the delegate:
{
"routes" : [
{
"dst": "192.168.242.0/24"
},
{
"dst": "10.96.0.0/12"
},
{
"dst" : "10.1.0.0/16"
}
],
"subnet" : "10.1.17.0/24",
"type" : "host-local"
"unknown-param": "value"
}
where "10.1.0.0/16" is the flannel network and "10.1.17.0/24" is
the host flannel subnet.
Note that this also allows setting a different ipam 'type' than
"host-local".
Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
This change makes ipvlan master parameter optional.
Default to default route interface as macvlan does.
Signed-off-by: Tomofumi Hayashi <tohayash@redhat.com>
In GetCurrentNS, If there is a context-switch between
getCurrentThreadNetNSPath and GetNS, another goroutine may execute in
the original thread and change its network namespace, then the original
goroutine would get the updated network namespace, which could lead to
unexpected behavior, especially when GetCurrentNS is used to get the
host network namespace in netNS.Do.
The added test has a chance to reproduce it with "-count=50".
The patch fixes it by locking the thread in GetCurrentNS.
Signed-off-by: Quan Tian <qtian@vmware.com>
if the runtime is not passing portMappings in the runtimeConfig,
then DEL is a noop.
This solves performance issues, when the portmap plugin is
executed multiple times, holding the iptables lock, despite
it does not have anything to delete.
Signed-off-by: Antonio Ojea <aojea@redhat.com>
It may happen that you want to map a port only in one IP family.
It can be achieved using the unspecified IP address of the
corresponding IP family as HostIP i.e.:
podman run --rm --name some-nginx -d -p 0.0.0.0:8080:80 nginx
The problem is that current implementation considers the
unspecified address valid and appends it to the iptables rule:
-A CNI-DN-60380cb3197c5457ed6ba -s 10.88.0.0/16
-d 0.0.0.0/32 -p tcp -m tcp --dport 8080 -j CNI-HOSTPORT-SETMARK
This rule is not forwarding the traffic to the mapped port.
We should use the unspecified address only to discriminate the IP
family of the port mapping, but not use it to filter the dst.
Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
This change sets the mac address if specified during the creation of the
macvlan interface. This is superior to setting it via the tuning plugin
because this ensures the mac address is set before an IP is set,
allowing a container to get a reserved IP address from DHCP.
Related #450
Signed-off-by: Clint Armstrong <clint@clintarmstrong.net>
When trying to move a master and slave interface into a container it is not
possible without first bringing the interfaces down. This change ensures
that the interface is set to down prior to trying to move the interface
into the container. This matches the behaviour on moving an interface out
of the container.
Signed-off-by: cns <christopher.swindle@metaswitch.com>
A /64 mask was used which routed an entire cidr based on source,
not only the bound address.
Fixes #478
Signed-off-by: Lars Ekman <lars.g.ekman@est.tech>
The DNAT hairpin rule only allow the container itself to access the
ports it is exposing thru the host IP. Other containers in the same
subnet might also want to access this service via the host IP, so
apply this rule to the whole subnet instead of just for the container.
This is particularly useful with setups using a reverse proxy for
https. With such a setup connections between containers (for ex.
oauth2) have to downgrade to http, or need complex dns setup to make
use of the internal IP of the reverse proxy. On the other hand going
thru the host IP is easy as that is probably what the service name
already resolve to.
Signed-off-by: Alban Bedel <albeu@free.fr>
--
v2: Fixed the tests
v3: Updated iptables rules documentation in README.md
v4: Fixed the network addresses in README.md to match iptables output
If the pluging receives portMappings in runtimeConfig, the pluing will add a NAT policy for each port mapping on the generated endpoints.
It enables HostPort usage on Windows with win-bridge.
Signed-off-by: Vincent Boulineau <vincent.boulineau@datadoghq.com>
fix #463
link host veth pair to bridge, the Initial state
of port is BR_STATE_DISABLED and change to
BR_STATE_FORWARDING async.
Signed-off-by: honglichang <honglichang@tencent.com>
The current ns package code is very careful about not leaving the calling
thread with the overridden namespace set, for example when origns.Set() fails.
This is achieved by starting a new green thread, locking its OS thread, and
never unlocking it. Which makes golang runtime to scrap the OS thread backing
the green thread after the go routine exits.
While this works, it's probably not as optimal: stopping and starting a new OS
thread is expensive and may be avoided if we unlock the thread after resetting
network namespace to the original. On the other hand, if resetting fails, it's
better to leave the thread locked and die.
While it won't work in all cases, we can still make an attempt to reuse the OS
thread when resetting the namespace succeeds. This can be achieved by unlocking
the thread conditionally to the namespace reset success.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
DEL can be called multiple times, a plugin should return no error if
the device is already removed, and other errors should be returned. It
was the opposite for vlan plugin. This PR fixes it.
Signed-off-by: Quan Tian <qtian@vmware.com>
Sysctl names can use dots or slashes as separator:
- if dots are used, dots and slashes are interchanged.
- if slashes are used, slashes and dots are left intact.
Separator in use is determined by firt ocurrence.
Reference: http://man7.org/linux/man-pages/man5/sysctl.d.5.html
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
If the Linux kernel is not built with the parameter
CONFIG_BRIDGE_VLAN_FILTERING, passing vlanFiltering in
the Bridge struct returns an error creating the bridge interface.
This happens even when no parameter is set on Vlan in the CNI config.
This change fixes the case where no Vlan parameter is configured on
CNI config file so the flag doesn't need to be included in the struct.
Signed-off-by: Carlos de Paula <me@carlosedp.com>
bump the go-iptables module to v0.4.5 to avoid
concurrency issues with the portmap plugin and
errors related to iptables not able to hold the
lock.
Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
Add the following idempotent functions to iptables utils:
DeleteRule: idempotently delete an iptables rule
DeleteChain: idempotently delete an iptables chain
ClearChain: idempotently flush an iptables chain
Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
It turns out that the portmap plugin is not idempotent if its
executed in parallel.
The errors are caused due to a race of different instantiations
deleting the chains.
This patch does that the portmap plugin doesn't fail if the
errors are because the chain doesn't exist on teardown.
Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
Use a Describe container for the It code block of the
portmap port forward integration test.
Signed-off-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
Concurrent use of the `portmap` and `firewall` plugins can result in
errors during iptables chain creation:
- The `portmap` plugin has a time-of-check-time-of-use race where it
checks for existence of the chain but the operation isn't atomic.
- The `firewall` plugin doesn't check for existing chains and just
returns an error.
This commit makes both operations idempotent by creating the chain and
then discarding the error if it's caused by the chain already
existing. It also factors the chain creation out into `pkg/utils` as a
site for future refactoring work.
Signed-off-by: Tim Gross <tim@0x74696d.com>
This change sends gratuitous ARP when MAC address is changed to
let other devices to know the MAC address update.
Signed-off-by: Tomofumi Hayashi <tohayash@redhat.com>
When running in a user namespace created by an unprivileged user the
owner of /var/run will be reported as the unknown user (as defined in
/proc/sys/kernel/overflowuid) so any access to the directory will
fail.
If the XDG_RUNTIME_DIR environment variable is set, check whether the
current user is also the owner of /var/run. If the owner is different
than the current user, use the $XDG_RUNTIME_DIR/netns directory.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
This change introduce priorities for IPs input among CNI_ARGS,
'args' and runtimeConfig. Fix #399.
Signed-off-by: Tomofumi Hayashi <tohayash@redhat.com>
The CNI spec states that for DEL implementations, "when CNI_NETNS and/or
prevResult are not provided, the plugin should clean up as many resources as
possible (e.g. releasing IPAM allocations) and return a successful response".
This change results in the firewall plugin conforming to the spec by not
returning an error whenever the del method is not provided a prevResult.
Signed-off-by: Erik Sipsma <sipsma@amazon.com>
Previously, if an IPAM plugin provided DNS settings in the result to the PTP
plugin, those settings were always lost because the PTP plugin would always
provide its own DNS settings in the result even if the PTP plugin was not
configured with any DNS settings.
This was especially problematic when trying to use, for example, the host-local
IPAM plugin's support for retrieving DNS settings from a resolv.conf file on
the host. Before this change, those DNS settings were always lost when using the
PTP plugin and couldn't be specified as part of PTP instead because PTP does not
support parsing a resolv.conf file.
This change checks to see if any fields were actually set in the PTP plugin's
DNS settings and only overrides any previous DNS results from an IPAM plugin in
the case that settings actually were provided to PTP. In the case where no
DNS settings are provided to PTP, the DNS results of the IPAM plugin (if any)
are used instead.
Signed-off-by: Erik Sipsma <sipsma@amazon.com>
Adds a bool to the cni config that will add a policy that allows for loopbackDSR on an interface. Updates relevant documentation. Allows L2Tunnel networks to be used for L2Bridge plugin.
* Increase entroy from 2 bytes to 7 bytes to prevent collisions
* Extract common library function for hash with prefix
* Refactor portmap plugin to use library function
fixes #347
Co-authored-by: Cameron Moreau <cmoreau@pivotal.io>
Co-authored-by: Mikael Manukyan <mmanukyan@pivotal.io>
Update github.com/safchain/ethtool to fix the compilation
error on 386. Also added 386 to the tarvis yaml.
Fixes #322
Signed-off-by: Moshe Levi <moshele@mellanox.com>
This PR add the option to configure an empty ipam for the macvlan cni plugin.
When using the macvlan cni plugin with an empty ipam the requeted pod will get the macvlan interface but without any ip address.
One of the use cases for this feature is for projects that runs a dhcp server inside the pod like KubeVirt.
In KubeVirt we need to let the vm running inside the pod to make the dhcp request so it will be able to make a release an renew request when needed.
We used to return error if no endpoint was found during delete. We now treat this as a success. If we fail during an add call, we now make a delete delegate call to the ipam to clean-up.
Now that libcni has the ability to print a version message, plumb it
through correctly.
While we're at it,
- fix import paths
- run gofmt
- add some more comments to sample
- add container runtime swappability for release
Example of usage, which uses flannel for allocating IP
addresses for containers and then registers them in `trusted`
zone in firewalld:
{
"cniVersion": "0.3.1",
"name": "flannel-firewalld",
"plugins": [
{
"name": "cbr0",
"type": "flannel",
"delegate": {
"isDefaultGateway": true
}
},
{
"type": "firewall",
"backend": "firewalld",
"zone": "trusted"
}
]
}
Fixes #114
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Signed-off-by: Michal Rostecki <mrostecki@suse.com>
Distros often have additional rules in the their iptabvles 'filter' table
that do things like:
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
docker, for example, gets around this by adding explicit rules to the filter
table's FORWARD chain to allow traffic from the docker0 interface. Do that
for a given host interface too, as a chained plugin.
With the VLAN filter, the Linux bridge acts more like a real switch, Allow to tag and untag
vlan id's on every interface connected to the bridge.
This PR also creates a veth interface for the bridge vlan interface on L3 configuration.
Related to https://developers.redhat.com/blog/2017/09/14/vlan-filter-support-on-bridge/ post.
Note: This feature was introduced in Linux kernel 3.8 and was added to RHEL in version 7.0.
host-local and static ipam plugins
tuning, bandwidth and portmap meta plugins
Utility functions created for common PrevResult checking
Fix windows build
DHCP REQUEST from DHCP plugin does not include Subnet Mask option parameter (1). Some DHCP servers need that option to be explicit in order to return it in a DHCPACK message.
If not, DHCP plugin returns "DHCP option Subnet Mask not found in DHCPACK" error msg in this type of scenario.
This adds the dns capability for supplying a runtime dnsConfig from a CRI. It also includes a bug fix for removing an endpoint when no IPAM is supplied. Adds version dependency of 0.3.0. Mild updates to windows READMEs.
Move the windows plugin to use the Host Compute (v2) APIs, as well
as clean-up the code. Allows win-bridge to use either the old API or Host Compute (v2) api
depending on a conf parameter. Fixes a leaked endpoint issue on windows for the v1 flow, and
removes the hns/pkg from the linux test run.
interfaces using dhcp ipam.
Vendor latest dhcp4server, dhcp4client, dhcp4
Added additional tests for new functionality in dhcp2_test.go
Wrap d2g dhcp4client calls with our own which add clientID to packet.
- Change variable name to camel style to fix golint warning
- Execute the IPAM to assign the IP address if it's inside in the config
- Test the IPAM module with static plugin
When building the windows plugin exe's (host-local, flannel, win-overlay, win-bridge),
it was necessary to use 'GOOS=windows go build path/to/plugin' rather than the build script.
This makes 'GOOS=windows GOARCH=amd64 ./build.sh' build all the windows plugin binaries.
Add two interfaces (e.g. eth0, eth1) to the same container.
Ensure each file now has ContainerID and ifname.
Delete one, ensure that the right file was deleted.
Add an interface using just ContainerID in the file.
Delete to verify we are still backwards compatible with any
files created using earlier verison of host-local plugin.
Running ginkgo tests in parallel causes problems with dhcp_test.go.
BeforeEach() is run once for each spec before any actual dhcp test starts.
This results in setting up two dhcp4servers that run concurrently.
Both try to Listen and use unix socketPath file /run/cni/dhcp.sock at the same time.
AfterEach() for one test runs when test completes, deleting /run/cni/dhcp.sock.
But other test still needs the file resulting in test failing. Often, the next dhcp
test hasn't started yet. When test does start it waits 15 seconds for dhcp4server to
create /run/cni/dhcp.sock (which has just been deleted) so test fails.
Other times dhcp tests fail because /run/cni/dhcp.sock is deleted while still being used.
Add two interfaces (e.g. eth0, eth1) to the same container.
Ensure each file now has ContainerID and ifname.
Delete one, ensure that the right file was deleted.
Add an interface using just ContainerID in the file.
Delete to verify we are still backwards compatible with any
files created using earlier verison of host-local plugin.
This change is to add CNI_ARGS support in static IPAM plugin.
When IP/SUBNET/GATEWAY are given in CNI_ARGS, static IPAM adds
these info in addition to config files.
To configure ip address only from CNI_ARGS, 'address' field in config
is changed to optional from required.
Recent CNI specification changes require the container ID on ADD/DEL,
which the testcases were not providing. Fix that up so things work
when this repo gets CNI revendored.
Namespace creation had an unergonomic interface and isn't used, except
for testing code. Remove it; downstream users should really be creating
their own namespaces
I need to build deb packages using golang-1.10 on Ubuntu Xenial which is
present in a non-default location while /usr/bin/go is at golang-1.6.
With this commit I can simply run:
`GO=/usr/lib/go-1.10/bin/go ./build.sh`
to use golang-1.10.
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
When chained with a plugin that returns multiple devices, the bandwidth
plugin chooses the host veth device.
Signed-off-by: Tyler Schultz <tschultz@pivotal.io>
If CNI is about to configure an IPv6 address, make sure IPv6 is not
disabled through the "disable_ipv6" sysctl setting.
This is a workaround for Docker 17.06 and later which sets
disable_ipv6=1 for all interfaces even if ipv6 is enabled in Docker.
Don't lock around the Stop() operation though, as that may take
a while and block other operations. That may mean we call Stop()
multiple times, but the Lease object should handle that correctly
itself.
Classless static routes (DHCP option 121) are now processed first.
If CSRs exist, static routes (DHCP option 33) and the gateway default
route are ignored as per RFC 3442.
This diff adds -hostprefix option in dhcp daemon. This option
could be used to run dhcp daemon as container because container
cannot touch host's netns directly. The diff changes dhcp daemon
to touch procfs mounted to another path, like '/hostfs/proc'.
On ADD save the host device's name into its IFLA_ALIAS property and
rename the device to the requested CNI_IFNAME inside the container
to conform to the CNI specification. On DEL rename the device to
the original name and move it back into the host namespace.
This change improves the performance of the ipam.ConfigureIface.
Some plugins are slow because of the ip.SettleAddress in ipam.ConfigureIface.
It seems to be only needed for IPv6, so should be skipped if only IPv4 is used.
For IP allocation schemes that cannot be interface agnostic, the
ipvlan plugin can be chained with an earlier plugin that handles this
logic. If "master" is omitted from the ipvlan configuration, then the
previous Result must contain a single interface name for the ipvlan
plugin to enslave. If "ipam" is omitted, then the previous Result is
used to configure the ipvlan interface.
- start list of linux_only plugins; ignore them when testing on Windows
- Isolate linux-only code by filename suffix
- Remove stub (NotImplemented) functions
- other misc. fixes for Windows compatibility
This change improves the performance of the portmap plugin and fixes
hairpin, when a container is mapped back to itself.
Performance is improved by using a multiport test to reduce rule
traversal, and by using a masquerade mark.
Hairpin is fixed by enabling masquerading for hairpin traffic.
For IP allocation schemes that cannot be interface agnostic, master can be set
to "ipam". In this configuration, the IPAM plugin is required to return a single
interface name for the ipvlan plugin to enslave.
There are at least two reasons why a lease is not present:
* The dhcp ipam daemon was restarted
* On add the IPAM plugin failed
Don't fail the IPAM invocation when the lease is not present, to allow
proper device cleanup on CNI delete invocations.
In real-world address allocations, disjoint address ranges are common.
Therefore, the host-local allocator should support them.
This change still allows for multiple IPs in a single configuration, but
also allows for a "set of subnets."
Fixes: #45
The source address selection was random, and sometimes we picked a
source address that the container didn't have a route to. Adding a
default route fixes that!
This updates the netlink dependency to include recent updates, including a fix when setting prosmic mode on a bridge and additions for creating qdisc/classes/filters. This is necessary for some upcoming additions to CNI
* Wait for addresses to leave tentative state before setting routes
* Enable forwarding correctly
* Set up masquerading according to the active protocol
This change adds support for IPv6 container/pod addresses to the CNI
bridge plugin, both for dual-stack (IPv4 + IPv6) and for IPv6-only
network configurations.
The proposed changes support multiple IPv6 addresses on a container
interface. If isGW is configured, the bridge will also be configured with
gateway addresses for each IPv6 subnet.
Please note that both the dual-stack functionality and support for multiple
IPv6 container/gateway addresses depends upon containernetworking/cni
PR 451 "ipam/host-local: support multiple IP ranges".
This change could potentially be committed independently from this host-local
plugin change, however the dual-stack and multiple IPv6 address
functionality that is enabled by this change can't be exercised/tested
until the host-local plugin change is committed.
There are some IPv6 unit test cases that are currently commented out
in the proposed changes because these test cases will fail without the
prior commits of the multiple IP range host-local change.
This pull request includes a temporary workaround for Kubernetes
Issue #32291 (Container IPv6 address is marked as duplicate, or dadfailed).
The problem is that kubelet enables hairpin mode on bridge veth
interfaces. Hairpin mode causes the container/pod to see echos of its
IPv6 neighbor solicitation packets, so that it declares duplicate address
detection (DAD) failure. The long-term fix is to use enhanced-DAD
when that feature is readily available in kernels. The short-term fix is
to disable IPv6 DAD in the container. Unfortunately, this has to be done
unconditionally (i.e. without a check for whether hairpin mode is enabled)
because hairpin mode is turned on by kubelet after the CNI bridge plugin
has completed cmdAdd processing. Disabling DAD should be okay if
IPv6 addresses are guaranteed to be unique (which is the case for
host-local IPAM plugin).
This change allows the host-local allocator to allocate multiple IPs.
This is intended to enable dual-stack, but is not limited to only two
subnets or separate address families.
To actually use CNI plugins in the given CNI_PATH, we need to add
CNI_PATH to PATH because CmdAddWithResult() does this:
os.Setenv("CNI_PATH", os.Getenv("PATH"))
Go's "..." syntax (eg, ./plugins/...) doesn't traverse symlinks, so
go test wasn't finding the vendor/ directory for imports. To get around
that we have to specify each testable package specifically rather
than use "...".
Checking error strings makes these tests flaky, especially if the
error string is changed in libcni. Have gone ahead an introduced a new
error type `NoConfigsFoundError` and the Match is against the error
type making it more deterministic.
CNI-Genie enables orchestrators (kubernetes, mesos) for seamless connectivity to choice of CNI plugins (calico, canal, romana, weave) configured on a Node
It's difficult to include this repository using bazel, because
the file named "build" conflicts with new_go_repository generation
on case-insensitive filesystems (ref
https://github.com/bazelbuild/rules_go/issues/234). This change
renames the file to something that doesn't conflict, and also
renames the test script for consistency.
Add a new CapabilityArgs member to the RuntimeConf struct which runtimes can
use to pass arbitrary capability-based keys to the plugin. Elements of this
member will be filtered against the plugin's advertised capabilities (from
its config JSON) and then added to a new "runtimeConfig" top-level map added
to the config JSON sent to the plugin on stdin.
Also "runtime_config"->"runtimeConfig" in CONVENTIONS.md to make
capitalization consistent with other CNI config keys like "cniVersion".
Add a new CapabilityArgs member to the RuntimeConf struct which runtimes can
use to pass arbitrary capability-based keys to the plugin. Elements of this
member will be filtered against the plugin's advertised capabilities (from
its config JSON) and then added to a new "runtimeConfig" top-level map added
to the config JSON sent to the plugin on stdin.
Also "runtime_config"->"runtimeConfig" in CONVENTIONS.md to make
capitalization consistent with other CNI config keys like "cniVersion".
moved functions that depend on linux packages (sys/unix) into a separate file
and added nop methods with a build tag for non-linux platforms in a new file.
moved functions that depend on linux packages (sys/unix) into a separate file
and added nop methods with a build tag for non-linux platforms in a new file.
After discussion with the CNI maintainers and representatives from Mesos and Kubernetes we've settled on this approach for passing the dynamic port mapping information from runtimes to plugins.
Including it in the plugin config allows "operators" to signal that they expect port mapping information to be passed to a specific plugin (or plugins). This follows the model that config passed in the plugin config should be acted upon by a plugin.
Runtimes can optionally return an error to users if they find no plugin in the chain that supports port mappings.
I'm sure we'd all like to see this merged soon so I'll try to turn any feedback quickly (and I'm happy to get any and all feedback - spellings, unclear working, wrong JSON terminology etc..)
A CNI network configuration file contains the plugin's executable file name.
Some platforms like Windows require a file name extension for executables.
This causes unnecessary burden on admins as they now have to maintain two
versions of each type of netconfig file, which differ only by the ".exe"
extension. A much simpler design is for invoke package to also look for
well-known extensions on platforms that require it. Existing tests are
improved and new tests are added to cover the new behavior.
Fixes #360
A CNI network configuration file contains the plugin's executable file name.
Some platforms like Windows require a file name extension for executables.
This causes unnecessary burden on admins as they now have to maintain two
versions of each type of netconfig file, which differ only by the ".exe"
extension. A much simpler design is for invoke package to also look for
well-known extensions on platforms that require it. Existing tests are
improved and new tests are added to cover the new behavior.
Fixes #360
Hardcoding the list separator character as ":" causes CNI to fail when parsing
CNI_PATH on other operating systems. For example, Windows uses ";" as list
separator because ":" can legally appear in paths such as "C:\path\to\file".
This change replaces use of ":" with OS-agnostic APIs or os.PathListSeparator.
Fixes #358
Hardcoding the list separator character as ":" causes CNI to fail when parsing
CNI_PATH on other operating systems. For example, Windows uses ";" as list
separator because ":" can legally appear in paths such as "C:\path\to\file".
This change replaces use of ":" with OS-agnostic APIs or os.PathListSeparator.
Fixes #358
Updates the spec and plugins to return an array of interfaces and IP details
to the runtime including:
- interface names and MAC addresses configured by the plugin
- whether the interfaces are sandboxed (container/VM) or host (bridge, veth, etc)
- multiple IP addresses configured by IPAM and which interface they
have been assigned to
Returning interface details is useful for runtimes, as well as allowing
more flexible chaining of CNI plugins themselves. For example, some
meta plugins may need to know the host-side interface to be able to
apply firewall or traffic shaping rules to the container.
Updates the spec and plugins to return an array of interfaces and IP details
to the runtime including:
- interface names and MAC addresses configured by the plugin
- whether the interfaces are sandboxed (container/VM) or host (bridge, veth, etc)
- multiple IP addresses configured by IPAM and which interface they
have been assigned to
Returning interface details is useful for runtimes, as well as allowing
more flexible chaining of CNI plugins themselves. For example, some
meta plugins may need to know the host-side interface to be able to
apply firewall or traffic shaping rules to the container.
Using a new ".configlist" file format that allows specifying
a list of CNI network configurations to run, add new libcni
helper functions to call each plugin in the list, injecting
the overall name, CNI version, and previous plugin's Result
structure into the configuration of the next plugin.
Using a new ".configlist" file format that allows specifying
a list of CNI network configurations to run, add new libcni
helper functions to call each plugin in the list, injecting
the overall name, CNI version, and previous plugin's Result
structure into the configuration of the next plugin.
Chaining sends different config JSON to each plugin, but the same
environment, and if we want to test multiple noop plugin runs in
the same chain we need a way of telling each run to use a different
debug file.
This adds the option `resolvConf` to the host-local IPAM configuration.
If specified, the plugin will try to parse the file as a resolv.conf(5)
type file and return it in the DNS response.
This adds the option `resolvConf` to the host-local IPAM configuration.
If specified, the plugin will try to parse the file as a resolv.conf(5)
type file and return it in the DNS response.
It doesn't seem like container IDs should really have whitespace or
newlines in them. As a complete edge-case, manipulating the host-local
store's IP reservations with 'echo' puts a newline at the end, which
caused matching to fail in ReleaseByID(). Don't ask...
Add an e2e host-local plugin testcase, which requires being able
to pass the datadir into the plugin so we can erase it later.
We're not always guaranteed to have access to the default data
dir location, plus it should probably be configurable anyway.
- Add optional 'stateDir' to flannel NetConf, if not present default to
/var/lib/cni/flannel
Signed-off-by: Jay Dunkelberger <ldunkelberger@pivotal.io>
highlights:
- NetConf struct finally includes cniVersion field
- improve test coverage of current version report behavior
- godoc a few key functions
- allow tests to control version list reported by no-op plugin
highlights:
- NetConf struct finally includes cniVersion field
- improve test coverage of current version report behavior
- godoc a few key functions
- allow tests to control version list reported by no-op plugin
When RangeEnd is given, a.end = RangeEnd+1.
If when getSearchRange() is called and lastReservedIP equals
RangeEnd, a.nextIP() only compares lastReservedIP (which in this
example is RangeEnd) against a.end (which in this example is
RangeEnd+1) and they clearly don't match, so a.nextIP() returns
start=RangeEnd+1 and end=RangeEnd.
Get() happily allocates RangeEnd+1 because it only compares 'cur'
to the end returned by getSearchRange(), not to a.end, and thus
allocates past RangeEnd.
Since a.end is inclusive (eg, host-local will allocate a.end) the
fix is to simply set a.end equal to RangeEnd.
The expectation on older kernels (< 3.19) was to have the network
namespace always be a directory. This is not true if the network
namespace is bind mounted to a file, and will make the plugin fail
erroneously in such cases. The fix is to remove this assumption
completely and just do a basic check on the file system types being
returned.
Fixes #288
The expectation on older kernels (< 3.19) was to have the network
namespace always be a directory. This is not true if the network
namespace is bind mounted to a file, and will make the plugin fail
erroneously in such cases. The fix is to remove this assumption
completely and just do a basic check on the file system types being
returned.
Fixes #288
The veth is moved from the container NS to the host NS.
This is handled by the code that sets the link to UP but the wrong
hostVeth is returned to the calling code.
The veth is moved from the container NS to the host NS.
This is handled by the code that sets the link to UP but the wrong
hostVeth is returned to the calling code.
Cross-compile cni for arm, arm64 and ppc64le with go1.6 only
Allow go tip to fail
Set fast_finish to true, which means travis will instantly return build failure when any of the required builds fail
ref #209
The current vendor of sys/unix is really old, and doesn't work on arm64 and ppc64le
Updating to the latest version might also fix other issues
ref #209
If interface name for a container provided by a user is already present,
Veth creation fails with incorrect error.
If os.IsExist error is returned by makeVethPair:
* Check for peer name, if exists generate another random peer name,
* else, IsExist error is due to container interface present, return error.
Fixes #155
If interface name for a container provided by a user is already present,
Veth creation fails with incorrect error.
If os.IsExist error is returned by makeVethPair:
* Check for peer name, if exists generate another random peer name,
* else, IsExist error is due to container interface present, return error.
Fixes #155
Add possibility to reconfigure bridge IP address when there is a new value.
New boolean flag added to net configuration to force IP change if it is need.
Otherwise code behaves as previously and throws error
* bridge: Test the following interface's hardware address for the CNI specific
prefix:
- bridge with IP address
- container veth
* plugins/macvlan test: ensure hardware addr
This will give deterministic MAC addresses for all interfaces CNI
creates and manages the IP for:
* bridge: container veth and host bridge
* macvlan: container veth
* ptp: container veth and host veth
This will give deterministic MAC addresses for all interfaces CNI
creates and manages the IP for:
* bridge: container veth and host bridge
* macvlan: container veth
* ptp: container veth and host veth
* _suite.go and _test.go file should be in the same package, using the
_test package for that, which requires some fields and methods to be
exported
* Introduce error type for cleaner error handling
* test adaptions for error type checking
* _suite.go and _test.go file should be in the same package, using the
_test package for that, which requires some fields and methods to be
exported
* Introduce error type for cleaner error handling
* test adaptions for error type checking
Previously, the log lines appeared in stdout before the JSON encoding of
the error message. That would break JSON parsing of stdout. Instead, we use
stderr for these unstructured logs, consistent with the CNI spec.
Previously, the log lines appeared in stdout before the JSON encoding of
the error message. That would break JSON parsing of stdout. Instead, we use
stderr for these unstructured logs, consistent with the CNI spec.
Based on previous discussions on the CNI maintainers calls, the spec is
unclear on 1) when CNI_ARGS should be used and 2) the fact the dynamic
config can be passed in through the network JSON.
This PR makes it clear that per-container config can be passed in
through the network JSON, adding a top level `args` field into
which orchestrators can add additional metadata without worrying that
plugins might reject the additional data. It also allows for plugins to
reject unknown fields passed in at the top level.
Using JSON is preferable to CNI_ARGS since it allows namespaced and
structured data. CNI_ARGS is a flat list of KV pairs which has reserved
characters with no escaping rules defined.
CNI_ARGS may still be used by orchestrators that want the simplicity of
passing the network config JSON as specified by the user, unchanged
through to the CNI plugin. But for any kind of structured data, it's
recommended that the `args` field in the JSON is used instead.
Allow strings to be unmarshalled for CNI_ARGS
CNI_ARGS uses types.LoadArgs to populate a struct.
The fields in the struct must meet the TextUnmarshaler interface.
This code adds a UnmarshallableString type to assist with this.
Allow strings to be unmarshalled for CNI_ARGS
CNI_ARGS uses types.LoadArgs to populate a struct.
The fields in the struct must meet the TextUnmarshaler interface.
This code adds a UnmarshallableString type to assist with this.
This changes the ip allocation logic to round robin. Before this, host-local IPAM searched for available IPs from start of subnet. Hence it tends to allocate IPs that had been used recently. This is not ideal since it may cause collisions.
Thank you Zach for all the great work done on CNI, farewell!
At the same time we are happy to welcome Dan amongst us who has already
contributed lots of valuable work!
When isDefaultGateway is true it automatically sets isGateway to true.
The default route will be added via the (bridge's) gateway IP.
If a default gateway has been configured via IPAM in the same
configuration file, the plugin will error out.
When isDefaultGateway is true it automatically sets isGateway to true.
The default route will be added via the (bridge's) gateway IP.
If a default gateway has been configured via IPAM in the same
configuration file, the plugin will error out.
Add a namespace object interface for somewhat cleaner code when
creating and switching between network namespaces. All created
namespaces are now mounted in /var/run/netns to ensure they
have persistent inodes and paths that can be passed around
between plugin components without relying on the current namespace
being correct.
Also remove the thread-locking arguments from the ns package
per https://github.com/appc/cni/issues/183 by doing all the namespace
changes in a separate goroutine that locks/unlocks itself, instead of
the caller having to track OS thread locking.
Add a namespace object interface for somewhat cleaner code when
creating and switching between network namespaces. All created
namespaces are now mounted in /var/run/netns to ensure they
have persistent inodes and paths that can be passed around
between plugin components without relying on the current namespace
being correct.
Also remove the thread-locking arguments from the ns package
per https://github.com/appc/cni/issues/183 by doing all the namespace
changes in a separate goroutine that locks/unlocks itself, instead of
the caller having to track OS thread locking.
Previously this code used a run-time map lookup keyed by
runtime.GOOS/GOARCH. This version uses conditional compilation to make
this choice at compile time, giving immediate feedback for unsupported
platforms.
Previously this code used a run-time map lookup keyed by
runtime.GOOS/GOARCH. This version uses conditional compilation to make
this choice at compile time, giving immediate feedback for unsupported
platforms.
Resolves CNI part of https://github.com/coreos/rkt/issues/1765
Second part would be adding similar lines into kvm flavored macvlan
support (in time of creating macvtap device).
/proc/self/ns/net gives the main thread's namespace, not necessarily
the namespace of the thread that's running the testcases. This causes
sporadic failures of the tests.
For example, with a testcase reading inodes after switching netns:
/proc/27686/task/27689/ns/net 4026532565
/proc/self/ns/net 4026531969
/proc/27686/task/27689/ns/net 4026532565
See also:
008d17ae00
Running Suite: pkg/ns Suite
===========================
Random Seed: 1459953577
Will run 6 of 6 specs
• Failure [0.028 seconds]
Linux namespace operations
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:167
WithNetNS
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:166
executes the callback within the target network namespace [It]
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:97
Expected
<uint64>: 4026531969
to equal
<uint64>: 4026532565
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:96
------------------------------
•••••
Summarizing 1 Failure:
[Fail] Linux namespace operations WithNetNS [It] executes the callback within the target network namespace
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:96
Ran 6 of 6 Specs in 0.564 seconds
FAIL! -- 5 Passed | 1 Failed | 0 Pending | 0 Skipped --- FAIL: TestNs (0.56s)
FAIL
/proc/self/ns/net gives the main thread's namespace, not necessarily
the namespace of the thread that's running the testcases. This causes
sporadic failures of the tests.
For example, with a testcase reading inodes after switching netns:
/proc/27686/task/27689/ns/net 4026532565
/proc/self/ns/net 4026531969
/proc/27686/task/27689/ns/net 4026532565
See also:
008d17ae00
Running Suite: pkg/ns Suite
===========================
Random Seed: 1459953577
Will run 6 of 6 specs
• Failure [0.028 seconds]
Linux namespace operations
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:167
WithNetNS
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:166
executes the callback within the target network namespace [It]
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:97
Expected
<uint64>: 4026531969
to equal
<uint64>: 4026532565
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:96
------------------------------
•••••
Summarizing 1 Failure:
[Fail] Linux namespace operations WithNetNS [It] executes the callback within the target network namespace
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:96
Ran 6 of 6 Specs in 0.564 seconds
FAIL! -- 5 Passed | 1 Failed | 0 Pending | 0 Skipped --- FAIL: TestNs (0.56s)
FAIL
Previously this code used a run-time map lookup keyed by
runtime.GOOS/GOARCH. This version uses conditional compilation to make
this choice at compile time, giving immediate feedback for unsupported
platforms.
Not using NewLinkAttrs() or not initializing TxQLen leaves
the value as 0, which tells the kernel to set a zero-length
tx_queue_len. That messes up FIFO traffic shapers (like pfifo)
that use the device TX queue length as the default packet
limit. This leads to a default packet limit of 0, which drops
all packets.
/proc/self/ns/net gives the main thread's namespace, not necessarily
the namespace of the thread that's running the testcases. This causes
sporadic failures of the tests.
For example, with a testcase reading inodes after switching netns:
/proc/27686/task/27689/ns/net 4026532565
/proc/self/ns/net 4026531969
/proc/27686/task/27689/ns/net 4026532565
See also:
008d17ae00
Running Suite: pkg/ns Suite
===========================
Random Seed: 1459953577
Will run 6 of 6 specs
• Failure [0.028 seconds]
Linux namespace operations
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:167
WithNetNS
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:166
executes the callback within the target network namespace [It]
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:97
Expected
<uint64>: 4026531969
to equal
<uint64>: 4026532565
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:96
------------------------------
•••••
Summarizing 1 Failure:
[Fail] Linux namespace operations WithNetNS [It] executes the callback within the target network namespace
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:96
Ran 6 of 6 Specs in 0.564 seconds
FAIL! -- 5 Passed | 1 Failed | 0 Pending | 0 Skipped --- FAIL: TestNs (0.56s)
FAIL
/proc/self/ns/net gives the main thread's namespace, not necessarily
the namespace of the thread that's running the testcases. This causes
sporadic failures of the tests.
For example, with a testcase reading inodes after switching netns:
/proc/27686/task/27689/ns/net 4026532565
/proc/self/ns/net 4026531969
/proc/27686/task/27689/ns/net 4026532565
See also:
008d17ae00
Running Suite: pkg/ns Suite
===========================
Random Seed: 1459953577
Will run 6 of 6 specs
• Failure [0.028 seconds]
Linux namespace operations
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:167
WithNetNS
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:166
executes the callback within the target network namespace [It]
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:97
Expected
<uint64>: 4026531969
to equal
<uint64>: 4026532565
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:96
------------------------------
•••••
Summarizing 1 Failure:
[Fail] Linux namespace operations WithNetNS [It] executes the callback within the target network namespace
/cni/gopath/src/github.com/appc/cni/pkg/ns/ns_test.go:96
Ran 6 of 6 Specs in 0.564 seconds
FAIL! -- 5 Passed | 1 Failed | 0 Pending | 0 Skipped --- FAIL: TestNs (0.56s)
FAIL
Resolves CNI part of https://github.com/coreos/rkt/issues/1765
Second part would be adding similar lines into kvm flavored macvlan
support (in time of creating macvtap device).
This commit adds a struct type CommonArgs that is to be embedded in
every plugin's argument struct. It contains a field named
"IgnoreUnknown" which will be parsed as a boolean and can be provided to
ignore unknown arguments passed to the plugin.
This is an attempt to testing the PluginMain() function of the skel pkg.
We should be able to do better by using a mockable interface for the
plugins, but this is a start.
The 'flannel' meta plugin delegates to other plugins to do the actual
OS-level work. It used the ipam.Exec{Add,Del} procedures for this
delegation, since those do precisely what's needed.
However this is a bit misleading, since the flannel plugin _isn't_
doing this for IPAM, and the ipam.Exec* procedures aren't doing
something specific to IPAM plugins.
So: anticipating that there may be more meta plugins that want to
delegate in the same way, this commit moves generic delegation
procedures to `pkg/invoke`, and makes the `pkg/ipam` procedures (still
used, accurately, in the non-meta plugins) shims.
Allow users to tune net network parameters such as somaxconn.
With this patch, users can add a new network configuration:
> {
> "name": "mytuning",
> "type": "tuning",
> "sysctl": {
> "net.core.somaxconn": "500"
> }
> }
The value /proc/sys/net/core/somaxconn will be set to 500 in the network
namespace but will remain unchanged on the host.
Only sysctl parameters that belong to the network subsystem can be
modified.
Related to: https://github.com/coreos/rkt/pull/2140
appc/cni#76 added a "dns" field in the result JSON. But before this
patch, the plugins had no way of knowing which name server to return.
There could be two ways of knowing which name server to return:
1. add it as an extra argument ("CNI_ARGS")
2. add it in the network configuration as a convenience (received via
stdin)
I chose the second way because it is easier. In the case of rkt, it
means the user could just add the DNS name servers in
/etc/rkt/net.d/mynetwork.conf.
appc/cni#76 added a "dns" field in the result JSON. But before this
patch, the plugins had no way of knowing which name server to return.
There could be two ways of knowing which name server to return:
1. add it as an extra argument ("CNI_ARGS")
2. add it in the network configuration as a convenience (received via
stdin)
I chose the second way because it is easier. In the case of rkt, it
means the user could just add the DNS name servers in
/etc/rkt/net.d/mynetwork.conf.
When plugin is executed with a DEL command, it does not
print result to stdout unless there is an error. Therefore
it stdout bytes should not be passed to json.Unmarshal.
This takes some of the machinery from CNI and from the rkt networking
code, and turns it into a library that can be linked into go apps.
Included is an example command-line application that uses the library,
called `cnitool`.
Other headline changes:
* Plugin exec'ing is factored out
The motivation here is to factor out the protocol for invoking
plugins. To that end, a generalisation of the code from api.go and
pkg/plugin/ipam.go goes into pkg/invoke/exec.go.
* Move argument-handling and conf-loading into public API
The fact that the arguments get turned into an environment for the
plugin is incidental to the API; so, provide a way of supplying them
as a struct or saying "just use the same arguments as I got" (the
latter is for IPAM plugins).
A specific IP can now be requested via the environment variable CNI_ARGS, e.g.
`CNI_ARGS=ip=1.2.3.4`.
The plugin will try to reserve the specified IP.
If this is not successful the execution will fail.
A specific IP can now be requested via the environment variable CNI_ARGS, e.g.
`CNI_ARGS=ip=1.2.3.4`.
The plugin will try to reserve the specified IP.
If this is not successful the execution will fail.
Path rewriting causes too many problems when vendoring
vendored code. When CNI code is vendored into rkt,
godep has problems code already vendored by CNI.
This introduces a notion of a "meta" plugin. A meta plugin
is one that delegates the actual work of setting up the interface
to the main plugin. The meta plugin is used to select and dynamically
configure the main plugin. The sequence of events, is as follows:
Given netconf like:
{
"name": "mynet",
"type": "flannel",
"delegate": {
"type": "bridge"
}
}
flannel fills in values like "mtu", "ipam.subnet" and delegates to
"bridge" main plugin. "bridge" plugin will operate as usual, calling
into ipam module for IP assignment.
Delegate dictionary should not contain "name" field as it will be
filled in by the flannel plugin.
When plugin errors out, it prints out a JSON object to stdout
describing the failure. This object needs to be propagated out
through the plugins and to the container runtime. This change
also adds Print method to both the result and error structs
for easy serialization to stdout.
The plugin binary actually functions in two modes. The first mode
is a regular CNI plugin. The second mode (when stared with "daemon" arg)
runs a DHCP client daemon. When executed as a CNI plugin, it issues
an RPC request to the daemon for actual processing. The daemon is
required since a DHCP lease needs to be maintained by periodically
renewing it. One instance of the daemon can server arbitrary number
of containers/leases.
This adds basic plugins.
"main" types: veth, bridge, macvlan
"ipam" type: host-local
The code has been ported over from github.com/coreos/rkt project
and adapted to fit the CNI spec.
2015-04-27 14:14:29 -07:00
1456 changed files with 409819 additions and 35320 deletions
With bridge plugin, all containers (on the same host) are plugged into a bridge (virtual switch) that resides in the host network namespace.
The containers receive one end of the veth pair with the other end connected to the bridge.
An IP address is only assigned to one end of the veth pair -- one residing in the container.
The bridge itself can also be assigned an IP address, turning it into a gateway for the containers.
Alternatively, the bridge can function purely in L2 mode and would need to be bridged to the host network interface (if other than container-to-container communication on the same host is desired).
The network configuration specifies the name of the bridge to be used.
If the bridge is missing, the plugin will create one on first use and, if gateway mode is used, assign it an IP that was returned by IPAM plugin via the gateway field.
## Example configuration
```
{
"name": "mynet",
"type": "bridge",
"bridge": "mynet0",
"isGateway": true,
"ipMasq": true,
"ipam": {
"type": "host-local",
"subnet": "10.10.0.0/16"
}
}
```
## Network configuration reference
*`name` (string, required): the name of the network.
*`type` (string, required): "bridge".
*`bridge` (string, optional): name of the bridge to use/create. Defaults to "cni0".
*`isGateway` (boolean, optional): assign an IP address to the bridge. Defaults to false.
*`ipMasq` (boolean, optional): set up IP Masquerade on the host for traffic originating from this network and destined outside of it. Defaults to false.
*`mtu` (integer, optional): explicitly set MTU to the specified value. Defaults to the value chosen by the kernel.
*`ipam` (dictionary, required): IPAM configuration to be used for this network.
host-local IPAM plugin allocates IPv4 addresses out of a specified address range.
It stores the state locally on the host filesystem, therefore ensuring uniqueness of IP addresses on a single host.
## Example configuration
```
{
"ipam": {
"type": "host-local",
"subnet": "10.10.0.0/16",
"rangeStart": "10.10.1.20",
"rangeEnd": "10.10.3.50",
"gateway": "10.10.0.254",
"routes": [
{ "dst": "0.0.0.0/0" },
{ "dst": "192.168.0.0/16", "gw": "10.10.5.1" }
]
}
}
```
## Network configuration reference
*`type` (string, required): "host-local".
*`subnet` (string, required): CIDR block to allocate out of.
*`rangeStart` (string, optional): IP inside of "subnet" from which to start allocating addresses. Defaults to ".2" IP inside of the "subnet" block.
*`rangeEnd` (string, optional): IP inside of "subnet" with which to end allocating addresses. Defaults to ".254" IP inside of the "subnet" block.
*`gateway` (string, optional): IP inside of "subnet" to designate as the gateway. Defaults to ".1" IP inside of the "subnet" block.
*`routes` (string, optional): list of routes to add to the container namespace. Each route is a dictionary with "dst" and optional "gw" fields. If "gw" is omitted, value of "gateway" will be used.
## Supported arguments
The following [CNI_ARGS](https://github.com/appc/cni/blob/master/SPEC.md#parameters) are supported:
*`ip`: request a specific IP address from the subnet. If it's not available, the plugin will exit with an error
## Files
Allocated IP addresses are stored as files in /var/lib/cni/networks/$NETWORK_NAME.
ipvlan is a new [addition](https://lwn.net/Articles/620087/) to the Linux kernel.
Like its cousin macvlan, it virtualizes the host interface.
However unlike macvlan which generates a new MAC address for each interface, ipvlan devices all share the same MAC.
The kernel driver inspects the IP address of each packet when making a decision about which virtual interface should process the packet.
Because all ipvlan interfaces share the MAC address with the host interface, DHCP can only be used in conjunction with ClientID (currently not supported by DHCP plugin).
## Example configuration
```
{
"name": "mynet",
"type": "ipvlan",
"master": "eth0",
"ipam": {
"type": "host-local",
"subnet": "10.1.2.0/24",
}
}
```
## Network configuration reference
*`name` (string, required): the name of the network.
*`type` (string, required): "ipvlan".
*`master` (string, required): name of the host interface to enslave.
*`mode` (string, optional): one of "l2", "l3". Defaults to "l2".
*`mtu` (integer, optional): explicitly set MTU to the specified value. Defaults to the value chosen by the kernel.
*`ipam` (dictionary, required): IPAM configuration to be used for this network.
## Notes
*`ipvlan` does not allow virtual interfaces to communicate with the master interface.
Therefore the container will not be able to reach the host via `ipvlan` interface.
Be sure to also have container join a network that provides connectivity to the host (e.g. `ptp`).
* A single master interface can not be enslaved by both `macvlan` and `ipvlan`.
The ptp plugin creates a point-to-point link between a container and the host by using a veth device.
One end of the veth pair is placed inside a container and the other end resides on the host.
The host-local IPAM plugin can be used to allocate an IP address to the container.
The traffic of the container interface will be routed through the interface of the host.
## Example network configuration
```
{
"name": "mynet",
"type": "ptp",
"ipam": {
"type": "host-local",
"subnet": "10.1.1.0/24"
},
"dns": {
"nameservers": [ "10.1.1.1", "8.8.8.8" ]
}
}
## Network configuration reference
* `name` (string, required): the name of the network
* `type` (string, required): "ptp"
* `ipMasq` (boolean, optional): set up IP Masquerade on the host for traffic originating from this network and destined outside of it. Defaults to false.
* `mtu` (integer, optional): explicitly set MTU to the specified value. Defaults to value chosen by the kernel.
* `ipam` (dictionary, required): IPAM configuration to be used for this network.
* `dns` (dictionary, optional): DNS information to return as described in the [Result](/SPEC.md#result).
Some CNI network plugins, maintained by the containernetworking team. For more information, see the [CNI website](https://www.cni.dev).
## What is CNI?
Read [CONTRIBUTING](CONTRIBUTING.md) for build and test instructions.
CNI, the _Container Network Interface_, is a proposed standard for configuring network interfaces for Linux application containers.
The standard consists of a simple specification for how executable plugins can be used to configure network namespaces; this repository also contains a go library implementing that specification.
## Plugins supplied:
### Main: interface-creating
*`bridge`: Creates a bridge, adds the host and the container to it.
*`ipvlan`: Adds an [ipvlan](https://www.kernel.org/doc/Documentation/networking/ipvlan.txt) interface in the container.
*`loopback`: Set the state of loopback interface to up.
*`macvlan`: Creates a new MAC address, forwards all traffic to that to the container.
*`ptp`: Creates a veth pair.
*`vlan`: Allocates a vlan device.
*`host-device`: Move an already-existing device into a container.
#### Windows: windows specific
*`win-bridge`: Creates a bridge, adds the host and the container to it.
*`win-overlay`: Creates an overlay interface to the container.
### IPAM: IP address allocation
*`dhcp`: Runs a daemon on the host to make DHCP requests on behalf of the container
*`host-local`: Maintains a local database of allocated IPs
*`static`: Allocate a static IPv4/IPv6 addresses to container and it's useful in debugging purpose.
The specification itself is contained in [SPEC.md](SPEC.md).
### Meta: other plugins
*`flannel`: Generates an interface corresponding to a flannel config file
*`tuning`: Tweaks sysctl parameters of an existing interface
*`portmap`: An iptables-based portmapping plugin. Maps ports from the host's address space to the container.
*`bandwidth`: Allows bandwidth-limiting through use of traffic control tbf (ingress/egress).
*`sbr`: A plugin that configures source based routing for an interface (from which it is chained).
*`firewall`: A firewall plugin which uses iptables or firewalld to add rules to allow traffic to/from the container.
## Why develop CNI?
Application containers on Linux are a rapidly evolving area, and within this space networking is a particularly unsolved problem, as it is highly environment-specific.
We believe that every container runtime will seek to solve the same problem of making the network layer pluggable.
To avoid duplication, we think it is prudent to define a common interface between the network plugins and container execution.
Hence we are proposing this specification, along with an initial set of plugins that can be used by different container runtime systems.
- [Kubernetes - a system to simplify container operations](http://kubernetes.io/docs/admin/network-plugins/)
- [Cloud Foundry - a platform for cloud applications](https://github.com/cloudfoundry-incubator/guardian-cni-adapter)
- [Weave - a multi-host Docker network](https://github.com/weaveworks/weave)
- [Project Calico - a layer 3 virtual network](https://github.com/projectcalico/calico-cni)
## Contributing to CNI
We welcome contributions, including [bug reports](https://github.com/appc/cni/issues), and code and documentation improvements.
If you intend to contribute to code or documentation, please read [CONTRIBUTING.md](CONTRIBUTING.md). Also see the [contact section](#contact) in this README.
## How do I use CNI?
### Requirements
CNI requires Go 1.5+ to build.
Go 1.5 users will need to set GO15VENDOREXPERIMENT=1 to get vendored
dependencies. This flag is set by default in 1.6.
### Included Plugins
This repository includes a number of common plugins in the `plugins/` directory.
Please see the [Documentation/](Documentation/) directory for documentation about particular plugins.
### Running the plugins
The scripts/ directory contains two scripts, `priv-net-run.sh` and `docker-run.sh`, that can be used to exercise the plugins.
**note - priv-net-run.sh depends on `jq`**
Start out by creating a netconf file to describe a network:
```bash
$ mkdir -p /etc/cni/net.d
$ cat >/etc/cni/net.d/10-mynet.conf <<EOF
{
"name": "mynet",
"type": "bridge",
"bridge": "cni0",
"isGateway": true,
"ipMasq": true,
"ipam": {
"type": "host-local",
"subnet": "10.22.0.0/16",
"routes": [
{ "dst": "0.0.0.0/0" }
]
}
}
EOF
$ cat >/etc/cni/net.d/99-loopback.conf <<EOF
{
"type": "loopback"
}
EOF
```
The directory `/etc/cni/net.d` is the default location in which the scripts will look for net configurations.
Next, build the plugins:
```bash
$ ./build
```
Finally, execute a command (`ifconfig` in this example) in a private network namespace that has joined the `mynet` network:
Creating a new release produces the following artifacts:
- Binaries (stored in the `release-<TAG>` directory) :
-`cni-plugins-<PLATFORM>-<VERSION>.tgz` binaries
-`cni-plugins-<VERSION>.tgz` binary (copy of amd64 platform binary)
-`sha1`, `sha256` and `sha512` files for the above files.
## Preparing for a release
1. Releases are performed by maintainers and should usually be discussed and planned at a maintainer meeting.
- Choose the version number. It should be prefixed with `v`, e.g. `v1.2.3`
- Take a quick scan through the PRs and issues to make sure there isn't anything crucial that _must_ be in the next release.
- Create a draft of the release note
- Discuss the level of testing that's needed and create a test plan if sensible
- Check what version of `go` is used in the build container, updating it if there's a new stable release.
- Update the vendor directory and Godeps to pin to the corresponding containernetworking/cni release. Create a PR, makes sure it passes CI and get it merged.
## Creating the release artifacts
1. Make sure you are on the master branch and don't have any local uncommitted changes.
1. Create a signed tag for the release `git tag -s $VERSION` (Ensure that GPG keys are created and added to GitHub)
1. Run the release script from the root of the repository
-`scripts/release.sh`
- The script requires Docker and ensures that a consistent environment is used.
- The artifacts will now be present in the `release-<TAG>` directory.
1. Test these binaries according to the test plan.
## Publishing the release
1. Push the tag to git `git push origin <TAG>`
1. Create a release on Github, using the tag which was just pushed.
1. Attach all the artifacts from the release directory.
1. Add the release note to the release.
1. Announce the release on at least the CNI mailing, IRC and Slack.
This document proposes a generic plugin-based networking solution for application containers on Linux, the _Container Networking Interface_, or _CNI_.
It is derived from the [rkt Networking Proposal][rkt-networking-proposal], which aimed to satisfy many of the [design considerations][rkt-networking-design] for networking in [rkt][rkt-github].
For the purposes of this proposal, we define two terms very specifically:
- _container_ can be considered synonymous with a [Linux _network namespace_][namespaces]. What unit this corresponds to depends on a particular container runtime implementation: for example, in implementations of the [App Container Spec][appc-github] like rkt, each _pod_ runs in a unique network namespace. In [Docker][docker], on the other hand, network namespaces generally exist for each separate Docker container.
- _network_ refers to a group of entities that are uniquely addressable that can communicate amongst each other. This could be either an individual container (as specified above), a machine, or some other network device (e.g. a router). Containers can be conceptually _added to_ or _removed from_ one or more networks.
The intention is for the container runtime to first create a new network namespace for the container.
It then determines which networks this container should belong to and for each network, which plugin must be executed.
The network configuration is in JSON format and can easily be stored in a file.
The network configuration includes mandatory fields such as "name" and "type" as well as plugin (type) specific ones.
The container runtime sequentially sets up the networks by executing the corresponding plugin for each network.
Upon completion of the container lifecycle, the runtime executes the plugins in reverse order (relative to the order in which they were added) to disconnect them from the networks.
## CNI Plugin
### Overview
Each CNI plugin is implemented as an executable that is invoked by the container management system (e.g. rkt or Docker).
A CNI plugin is responsible for inserting a network interface into the container network namespace (e.g. one end of a veth pair) and making any necessary changes on the host (e.g. attaching other end of veth into a bridge).
It should then assign the IP to the interface and setup the routes consistent with IP Address Management section by invoking appropriate IPAM plugin.
### Parameters
The operations that the CNI plugin needs to support are:
- Add container to network
- Parameters:
- **Version**. The version of CNI spec that the caller is using (container management system or the invoking plugin).
- **Container ID**. This is optional but recommended, and should be unique across an administrative domain while the container is live (it may be reused in the future). For example, an environment with an IPAM system may require that each container is allocated a unique ID and that each IP allocation can thus be correlated back to a particular container. As another example, in appc implementations this would be the _pod ID_.
- **Network namespace path**. This represents the path to the network namespace to be added, i.e. /proc/[pid]/ns/net or a bind-mount/link to it.
- **Network configuration**. This is a JSON document describing a network to which a container can be joined. The schema is described below.
- **Extra arguments**. This allows granular configuration of CNI plugins on a per-container basis.
- **Name of the interface inside the container**. This is the name that should be assigned to the interface created inside the container (network namespace); consequently it must comply with the standard Linux restrictions on interface names.
- Result:
- **IPs assigned to the interface**. This is either an IPv4 address, an IPv6 address, or both.
- **DNS information**. Dictionary that includes DNS information for nameservers, domain, search domains and options.
- Delete container from network
- Parameters:
- **Version**. The version of CNI spec that the caller is using (container management system or the invoking plugin).
- **Container ID**, as defined above.
- **Network namespace path**, as defined above.
- **Network configuration**, as defined above.
- **Extra arguments**, as defined above.
- **Name of the interface inside the container**, as defined above.
The executable command-line API uses the type of network (see [Network Configuration](#network-configuration) below) as the name of the executable to invoke.
It will then look for this executable in a list of predefined directories. Once found, it will invoke the executable using the following environment variables for argument passing:
-`CNI_VERSION`: [Semantic Version 2.0](http://semver.org) of CNI specification. This effectively versions the CNI_XXX environment variables.
-`CNI_COMMAND`: indicates the desired operation; either `ADD` or `DEL`
-`CNI_CONTAINERID`: Container ID
-`CNI_NETNS`: Path to network namespace file
-`CNI_IFNAME`: Interface name to set up
-`CNI_ARGS`: Extra arguments passed in by the user at invocation time. Alphanumeric key-value pairs separated by semicolons; for example, "FOO=BAR;ABC=123"
-`CNI_PATH`: Colon-separated list of paths to search for CNI plugin executables
Network configuration in JSON format is streamed through stdin.
### Result
Success is indicated by a return code of zero and the following JSON printed to stdout in the case of the ADD command. This should be the same output as was returned by the IPAM plugin (see [IP Allocation](#ip-allocation) for details).
`cniVersion` specifies a [Semantic Version 2.0](http://semver.org) of CNI specification used by the plugin.
`dns` field contains a dictionary consisting of common DNS information that this network is aware of.
The result is returned in the same format as specified in the [configuration](#network-configuration).
The specification does not declare how this information must be processed by CNI consumers.
Examples include generating an `/etc/resolv.conf` file to be injected into the container filesystem or running a DNS forwarder on the host.
Errors are indicated by a non-zero return code and the following JSON being printed to stdout:
```
{
"cniVersion": "0.1.0",
"code": <numeric-error-code>,
"msg": <short-error-message>,
"details": <long-error-message> (optional)
}
```
`cniVersion` specifies a [Semantic Version 2.0](http://semver.org) of CNI specification used by the plugin.
Error codes 0-99 are reserved for well-known errors (see [Well-known Error Codes](#well-known-error-codes) section).
Values of 100+ can be freely used for plugin specific errors.
In addition, stderr can be used for unstructured output such as logs.
### Network Configuration
The network configuration is described in JSON form. The configuration can be stored on disk or generated from other sources by the container runtime. The following fields are well-known and have the following meaning:
-`cniVersion` (string): [Semantic Version 2.0](http://semver.org) of CNI specification to which this configuration conforms.
-`name` (string): Network name. This should be unique across all containers on the host (or other administrative domain).
-`type` (string): Refers to the filename of the CNI plugin executable.
-`ipMasq` (boolean): Optional (if supported by the plugin). Set up an IP masquerade on the host for this network. This is necessary if the host will act as a gateway to subnets that are not able to route to the IP assigned to the container.
-`ipam`: Dictionary with IPAM specific values:
-`type` (string): Refers to the filename of the IPAM plugin executable.
-`routes` (list): List of subnets (in CIDR notation) that the CNI plugin should ensure are reachable by routing them through the network. Each entry is a dictionary containing:
-`dst` (string): subnet in CIDR notation
-`gw` (string): IP address of the gateway to use. If not specified, the default gateway for the subnet is assumed (as determined by the IPAM plugin).
-`dns`: Dictionary with DNS specific values:
-`nameservers` (list of strings): list of a priority-ordered list of DNS nameservers that this network is aware of. Each entry in the list is a string containing either an IPv4 or an IPv6 address.
-`domain` (string): the local domain used for short hostname lookups.
-`search` (list of strings): list of priority ordered search domains for short hostname lookups. Will be preferred over `domain` by most resolvers.
-`options` (list of strings): list of options that can be passed to the resolver
As part of its operation, a CNI plugin is expected to assign (and maintain) an IP address to the interface and install any necessary routes relevant for that interface. This gives the CNI plugin great flexibility but also places a large burden on it. Many CNI plugins would need to have the same code to support several IP management schemes that users may desire (e.g. dhcp, host-local).
To lessen the burden and make IP management strategy be orthogonal to the type of CNI plugin, we define a second type of plugin -- IP Address Management Plugin (IPAM plugin). It is however the responsibility of the CNI plugin to invoke the IPAM plugin at the proper moment in its execution. The IPAM plugin is expected to determine the interface IP/subnet, Gateway and Routes and return this information to the "main" plugin to apply. The IPAM plugin may obtain the information via a protocol (e.g. dhcp), data stored on a local filesystem, the "ipam" section of the Network Configuration file or a combination of the above.
#### IP Address Management (IPAM) Interface
Like CNI plugins, the IPAM plugins are invoked by running an executable. The executable is searched for in a predefined list of paths, indicated to the CNI plugin via `CNI_PATH`. The IPAM Plugin receives all the same environment variables that were passed in to the CNI plugin. Just like the CNI plugin, IPAM receives the network configuration file via stdin.
Success is indicated by a zero return code and the following JSON being printed to stdout (in the case of the ADD command):
```
{
"cniVersion": "0.1.0",
"ip4": {
"ip": <ipv4-and-subnet-in-CIDR>,
"gateway": <ipv4-of-the-gateway>, (optional)
"routes": <list-of-ipv4-routes> (optional)
},
"ip6": {
"ip": <ipv6-and-subnet-in-CIDR>,
"gateway": <ipv6-of-the-gateway>, (optional)
"routes": <list-of-ipv6-routes> (optional)
},
"dns": {
"nameservers": <list-of-nameservers> (optional)
"domain": <name-of-local-domain> (optional)
"search": <list-of-search-domains> (optional)
"options": <list-of-options> (optional)
}
}
```
`cniVersion` specifies a [Semantic Version 2.0](http://semver.org) of CNI specification used by the plugin.
`gateway` is the default gateway for this subnet, if one exists.
It does not instruct the CNI plugin to add any routes with this gateway: routes to add are specified separately via the `routes` field.
An example use of this value is for the CNI plugin to add this IP address to the linux-bridge to make it a gateway.
Each route entry is a dictionary with the following fields:
-`dst` (string): Destination subnet specified in CIDR notation.
-`gw` (string): IP of the gateway. If omitted, a default gateway is assumed (as determined by the CNI plugin).
The "dns" field contains a dictionary consisting of common DNS information.
-`nameservers` (list of strings): list of a priority-ordered list of DNS nameservers that this network is aware of. Each entry in the list is a string containing either an IPv4 or an IPv6 address.
-`domain` (string): the local domain used for short hostname lookups.
-`search` (list of strings): list of priority ordered search domains for short hostname lookups. Will be preferred over `domain` by most resolvers.
-`options` (list of strings): list of options that can be passed to the resolver
See [CNI Plugin Result](#result) section for more information.
Errors and logs are communicated in the same way as the CNI plugin. See [CNI Plugin Result](#result) section for details.
IPAM plugin examples:
- **host-local**: Select an unused (by other containers on the same host) IP within the specified range.
- **dhcp**: Use DHCP protocol to acquire and maintain a lease. The DHCP requests will be sent via the created container interface; therefore, the associated network must support broadcast.
#### Notes
- Routes are expected to be added with a 0 metric.
- A default route may be specified via "0.0.0.0/0". Since another network might have already configured the default route, the CNI plugin should be prepared to skip over its default route definition.
packetInBytes:=20000// The shaper needs to 'warm'. Send enough to cause it to throttle,
// balanced by run time.
By(fmt.Sprintf("sending tcp traffic to the chained, bridged, traffic shaped container on ip address '%s:%d'\n\n",chainedBridgeIP,chainedBridgeBandwidthPort))
runtimeWithLimit:=b.Time("with chained bridge and bandwidth plugins",func(){
On Linux each OS thread can have a different network namespace. Go's thread scheduling model switches goroutines between OS threads based on OS thread load and whether the goroutine would block other goroutines. This can result in a goroutine switching network namespaces without notice and lead to errors in your code.
### Namespace Switching
Switching namespaces with the `ns.Set()` method is not recommended without additional strategies to prevent unexpected namespace changes when your goroutines switch OS threads.
Go provides the `runtime.LockOSThread()` function to ensure a specific goroutine executes on its current OS thread and prevents any other goroutine from running in that thread until the locked one exits. Careful usage of `LockOSThread()` and goroutines can provide good control over which network namespace a given goroutine executes in.
For example, you cannot rely on the `ns.Set()` namespace being the current namespace after the `Set()` call unless you do two things. First, the goroutine calling `Set()` must have previously called `LockOSThread()`. Second, you must ensure `runtime.UnlockOSThread()` is not called somewhere in-between. You also cannot rely on the initial network namespace remaining the current network namespace if any other code in your program switches namespaces, unless you have already called `LockOSThread()` in that goroutine. Note that `LockOSThread()` prevents the Go scheduler from optimally scheduling goroutines for best performance, so `LockOSThread()` should only be used in small, isolated goroutines that release the lock quickly.
### Do() The Recommended Thing
The `ns.Do()` method provides **partial** control over network namespaces for you by implementing these strategies. All code dependent on a particular network namespace (including the root namespace) should be wrapped in the `ns.Do()` method to ensure the correct namespace is selected for the duration of your code. For example:
```go
err=targetNs.Do(func(hostNsns.NetNS)error{
dummy:=&netlink.Dummy{
LinkAttrs:netlink.LinkAttrs{
Name:"dummy0",
},
}
returnnetlink.LinkAdd(dummy)
})
```
Note this requirement to wrap every network call is very onerous - any libraries you call might call out to network services such as DNS, and all such calls need to be protected after you call `ns.Do()`. All goroutines spawned from within the `ns.Do` will not inherit the new namespace. The CNI plugins all exit very soon after calling `ns.Do()` which helps to minimize the problem.
When a new thread is spawned in Linux, it inherits the namespace of its parent. In versions of go **prior to 1.10**, if the runtime spawns a new OS thread, it picks the parent randomly. If the chosen parent thread has been moved to a new namespace (even temporarily), the new OS thread will be permanently "stuck in the wrong namespace", and goroutines will non-deterministically switch namespaces as they are rescheduled.
In short, **there was no safe way to change network namespaces, even temporarily, from within a long-lived, multithreaded Go process**. If you wish to do this, you must use go 1.10 or greater.
### Creating network namespaces
Earlier versions of this library managed namespace creation, but as CNI does not actually utilize this feature (and it was essentially unmaintained), it was removed. If you're writing a container runtime, you should implement namespace management yourself. However, there are some gotchas when doing so, especially around handling `/var/run/netns`. A reasonably correct reference implementation, borrowed from `rkt`, can be found in `pkg/testutils/netns_linux.go` if you're in need of a source of inspiration.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.