kernel_samsung_a53x/kernel
Connor O'Brien f1c11f7790 bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode
From: Daniel Borkmann <daniel@iogearbox.net>

commit 8520e224f547cd070c7c8f97b1fc6d58cff7ccaa upstream.

Fix cgroup v1 interference when non-root cgroup v2 BPF programs are used.
Back in the days, commit bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
embedded per-socket cgroup information into sock->sk_cgrp_data and in order
to save 8 bytes in struct sock made both mutually exclusive, that is, when
cgroup v1 socket tagging (e.g. net_cls/net_prio) is used, then cgroup v2
falls back to the root cgroup in sock_cgroup_ptr() (&cgrp_dfl_root.cgrp).

The assumption made was "there is no reason to mix the two and this is in line
with how legacy and v2 compatibility is handled" as stated in bd1060a1d671.
However, with Kubernetes more widely supporting cgroups v2 as well nowadays,
this assumption no longer holds, and the possibility of the v1/v2 mixed mode
with the v2 root fallback being hit becomes a real security issue.

Many of the cgroup v2 BPF programs are also used for policy enforcement, just
to pick _one_ example, that is, to programmatically deny socket related system
calls like connect(2) or bind(2). A v2 root fallback would implicitly cause
a policy bypass for the affected Pods.

In production environments, we have recently seen this case due to various
circumstances: i) a different 3rd party agent and/or ii) a container runtime
such as [0] in the user's environment configuring legacy cgroup v1 net_cls
tags, which triggered implicitly mentioned root fallback. Another case is
Kubernetes projects like kind [1] which create Kubernetes nodes in a container
and also add cgroup namespaces to the mix, meaning programs which are attached
to the cgroup v2 root of the cgroup namespace get attached to a non-root
cgroup v2 path from init namespace point of view. And the latter's root is
out of reach for agents on a kind Kubernetes node to configure. Meaning, any
entity on the node setting cgroup v1 net_cls tag will trigger the bypass
despite cgroup v2 BPF programs attached to the namespace root.

Generally, this mutual exclusiveness does not hold anymore in today's user
environments and makes cgroup v2 usage from BPF side fragile and unreliable.
This fix adds proper struct cgroup pointer for the cgroup v2 case to struct
sock_cgroup_data in order to address these issues; this implicitly also fixes
the tradeoffs being made back then with regards to races and refcount leaks
as stated in bd1060a1d671, and removes the fallback, so that cgroup v2 BPF
programs always operate as expected.

  [0] https://github.com/nestybox/sysbox/
  [1] https://kind.sigs.k8s.io/

Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/bpf/20210913230759.2313-1-daniel@iogearbox.net
[resolve trivial conflicts]
Signed-off-by: Connor O'Brien <connor.obrien@crowdstrike.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-23 23:20:59 +01:00
..
bpf bpf: Eliminate remaining "make W=1" warnings in kernel/bpf/btf.o 2024-11-23 23:20:08 +01:00
cgroup bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode 2024-11-23 23:20:59 +01:00
configs Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
debug kdb: Use the passed prompt in kdb_position_cursor() 2024-11-23 23:20:16 +01:00
dma dma-debug: avoid deadlock between dma debug vs printk and netconsole 2024-11-23 23:20:56 +01:00
entry Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
events perf: Prevent passing zero nr_pages to rb_alloc_aux() 2024-11-23 23:20:08 +01:00
futex futex: Don't include process MM in futex key on no-MMU 2024-11-18 11:42:47 +01:00
gcov gcov: add support for GCC 14 2024-11-19 14:19:10 +01:00
irq genirq/irqdesc: Honor caller provided affinity in alloc_desc() 2024-11-23 23:20:29 +01:00
kcsan Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
livepatch kallsyms: refactor {,module_}kallsyms_on_each_symbol 2024-11-19 12:27:32 +01:00
locking locking/rwlocks: introduce write_lock_nested 2024-11-19 17:44:05 +01:00
power power/wakelock: Add a timeout to wakelocks globally 2024-11-19 18:06:07 +01:00
printk printk: Silence useless system log spam 2024-11-19 18:04:40 +01:00
rcu rcutorture: Fix rcu_torture_fwd_cb_cr() data race 2024-11-23 23:20:22 +01:00
sched sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime 2024-11-23 23:20:24 +01:00
time hrtimer: Prevent queuing of hrtimer without a function callback 2024-11-23 23:20:47 +01:00
trace tracing: Fix overflow in get_free_elt() 2024-11-23 23:20:29 +01:00
acct.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
async.c async: Introduce async_schedule_dev_nocall() 2024-11-18 12:12:56 +01:00
audit.c audit: Send netlink ACK before setting connection in auditd_set 2024-11-18 12:13:09 +01:00
audit.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
audit_fsnotify.c fsnotify: make allow_dups a property of the group 2024-11-19 12:27:55 +01:00
audit_tree.c fsnotify: pass flags argument to fsnotify_alloc_group() 2024-11-19 12:27:55 +01:00
audit_watch.c fsnotify: pass flags argument to fsnotify_alloc_group() 2024-11-19 12:27:55 +01:00
auditfilter.c ima: Avoid blocking in RCU read-side critical section 2024-11-19 14:19:42 +01:00
auditsc.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
backtracetest.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
bounds.c bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS 2024-11-19 11:32:40 +01:00
capability.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
cfi.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
compat.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
configs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
context_tracking.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
cpu.c cpu: Silence log spam when a CPU is brought up 2024-11-19 18:04:55 +01:00
cpu_pm.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
crash_core.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
crash_dump.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
cred.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
delayacct.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
dma.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
exec_domain.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
exit.c mm: optimize the redundant loop of mm_update_owner_next() 2024-11-19 14:19:41 +01:00
extable.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
fail_function.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
fork.c exec: Simplify unshare_files 2024-11-19 12:27:27 +01:00
freezer.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
gen_kheaders.sh kheaders: explicitly define file modes for archived headers 2024-11-19 14:19:30 +01:00
groups.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
hung_task.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
iomem.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
irq_work.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
jump_label.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kallsyms.c kallsyms: Improve the performance of kallsyms_lookup_name() 2024-11-19 18:06:35 +01:00
kcmp.c kcmp: In get_file_raw_ptr use task_lookup_fd_rcu 2024-11-19 12:27:27 +01:00
Kconfig.freezer Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Kconfig.hz Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Kconfig.locks Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Kconfig.preempt Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kcov.c kcov: don't lose track of remote references during softirqs 2024-11-19 14:19:10 +01:00
kexec.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kexec_core.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kexec_elf.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kexec_file.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kexec_internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kheaders.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kmod.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kprobes.c kprobes: Fix to check symbol prefixes correctly 2024-11-23 23:20:27 +01:00
ksysfs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
kthread.c exit: Implement kthread_exit 2024-11-19 12:27:49 +01:00
latencytop.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
Makefile kernel: Use the stock defconfig for /proc/config.gz 2024-06-15 16:20:14 -03:00
module-internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
module.c NFSD: Remove svc_serv_ops::svo_module 2024-11-19 12:27:54 +01:00
module_signature.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
module_signing.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
notifier.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
nsproxy.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
padata.c padata: Fix possible divide-by-0 panic in padata_mt_helper() 2024-11-23 23:20:29 +01:00
panic.c panic: Flush kernel log buffer at the end 2024-11-19 09:23:11 +01:00
params.c params: lift param_set_uint_minmax to common code 2024-11-19 12:27:09 +01:00
pid.c kernel/pid.c: implement additional checks upon pidfd_create() parameters 2024-11-19 12:27:43 +01:00
pid_namespace.c zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING 2024-11-19 14:19:05 +01:00
profile.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
ptrace.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
range.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
reboot.c kernel/reboot: emergency_restart: Set correct system_state 2024-11-18 11:43:25 +01:00
regset.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
relay.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
resource.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
rseq.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
scftorture.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
scs.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
seccomp.c seccomp: Invalidate seccomp mode to catch death failures 2024-11-18 22:25:35 +01:00
signal.c kernel: rerun task_work while freezing in get_signal() 2024-11-23 23:20:16 +01:00
smp.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
smpboot.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
smpboot.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
softirq.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
stackleak.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
stacktrace.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
static_call.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
stop_machine.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
sys.c kernel/sys.c: implement custom uname override 2024-11-19 17:55:01 +01:00
sys_ni.c syscalls: fix compat_sys_io_pgetevents_time64 usage 2024-11-19 14:19:34 +01:00
sysctl-test.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
sysctl.c Revert "kernel: sysctl: add init protection to common mm-related nodes" 2024-11-19 18:13:49 +01:00
task_work.c task_work: Introduce task_work_cancel() again 2024-11-23 23:20:13 +01:00
taskstats.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
test_kprobes.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
torture.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
tracepoint.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
tsacct.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
ucount.c fanotify: configurable limits via sysfs 2024-11-19 12:27:37 +01:00
uid16.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
uid16.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
umh.c security: samsung: defex_lsm: nuke 2024-06-15 16:20:49 -03:00
up.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
user-return-notifier.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
user.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
user_namespace.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
usermode_driver.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
utsname.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
utsname_sysctl.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
watch_queue.c Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00
watchdog.c watchdog: move softlockup_panic back to early_param 2024-11-18 11:43:21 +01:00
watchdog_hld.c watchdog/perf: properly initialize the turbo mode timestamp and rearm counter 2024-11-23 23:20:15 +01:00
workqueue.c Revert "workqueue: Make queue_rcu_work() use call_rcu_flush()" 2024-11-19 18:15:40 +01:00
workqueue_internal.h Import A536BXXU9EXDC 2024-06-15 16:02:09 -03:00