[ Upstream commit f2d5bccaca3e8c09c9b9c8485375f7bdbb2631d2 ]
simple_realloc() frees the original buffer (ptr) even if the
reallocation failed.
Fix it to behave like standard realloc() and only free the original
buffer if the reallocation succeeded.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240229115149.749264-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 69b0194ccec033c208b071e019032c1919c2822d ]
simple_malloc() will return NULL when there is not enough memory left.
Check pointer 'new' before using it to copy the old data.
Signed-off-by: Li zeming <zeming@nfschina.com>
[mpe: Reword subject, use change log from Christophe]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20221219021816.3012-1-zeming@nfschina.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 45b1ba7e5d1f6881050d558baf9bc74a2ae13930 ]
kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure. Ensure the allocation was successful
by checking the pointer validity.
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231122030651.3818-1-chentao@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 0db880fc865ffb522141ced4bfa66c12ab1fbb70 upstream.
nmi_enter()/nmi_exit() touches per cpu variables which can lead to kernel
crash when invoked during real mode interrupt handling (e.g. early HMI/MCE
interrupt handler) if percpu allocation comes from vmalloc area.
Early HMI/MCE handlers are called through DEFINE_INTERRUPT_HANDLER_NMI()
wrapper which invokes nmi_enter/nmi_exit calls. We don't see any issue when
percpu allocation is from the embedded first chunk. However with
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there are chances where percpu
allocation can come from the vmalloc area.
With kernel command line "percpu_alloc=page" we can force percpu allocation
to come from vmalloc area and can see kernel crash in machine_check_early:
[ 1.215714] NIP [c000000000e49eb4] rcu_nmi_enter+0x24/0x110
[ 1.215717] LR [c0000000000461a0] machine_check_early+0xf0/0x2c0
[ 1.215719] --- interrupt: 200
[ 1.215720] [c000000fffd73180] [0000000000000000] 0x0 (unreliable)
[ 1.215722] [c000000fffd731b0] [0000000000000000] 0x0
[ 1.215724] [c000000fffd73210] [c000000000008364] machine_check_early_common+0x134/0x1f8
Fix this by avoiding use of nmi_enter()/nmi_exit() in real mode if percpu
first chunk is not embedded.
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Shirisha Ganta <shirisha@linux.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240410043006.81577-1-mahesh@linux.ibm.com
[ Conflicts in arch/powerpc/include/asm/interrupt.h
because machine_check_early() and machine_check_exception()
has been refactored. ]
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45547a0a93d85f704b49788cde2e1d9ab9cd363b upstream.
With CONFIG_FSL_IFC now being user-visible, and thus changed from a select
to depends in CONFIG_MTD_NAND_FSL_IFC, the dependencies needs to be
selected in defconfigs.
Depends-on: 9ba0cae3cac0 ("memory: fsl_ifc: Make FSL_IFC config visible and selectable")
Signed-off-by: Esben Haabendal <esben@geanix.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240530-fsl-ifc-config-v3-2-1fd2c3d233dd@geanix.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b4cf5fc01ce83e5c0bcf3dbb9f929428646b9098 ]
missing fdput() on one of the failure exits
Fixes: eacc56bb9de3e # v5.2
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 14196e47c5ffe32af7ed5a51c9e421c5ea5bccce ]
In the xmon disassembly code there are several CPU feature checks to
determine what dialects should be passed to the disassembler. The
dialect controls which instructions the disassembler will recognise.
Unfortunately the checks are incorrect, because instead of passing a
single CPU feature they are passing a mask of feature bits.
For example the code:
if (cpu_has_feature(CPU_FTRS_POWER5))
dialect |= PPC_OPCODE_POWER5;
Is trying to check if the system is running on a Power5 CPU. But
CPU_FTRS_POWER5 is a mask of *all* the feature bits that are enabled on
a Power5.
In practice the test will always return true for any 64-bit CPU, because
at least one bit in the mask will be present in the CPU_FTRS_ALWAYS
mask.
Similarly for all the other checks against CPU_FTRS_xx masks.
Rather than trying to match the disassembly behaviour exactly to the
current CPU, just differentiate between 32-bit and 64-bit, and Altivec,
VSX and HTM.
That will cause some instructions to be shown in disassembly even
on a CPU that doesn't support them, but that's OK, objdump -d output
has the same behaviour, and if anything it's less confusing than some
instructions not being disassembled.
Fixes: 897f112bb42e ("[POWERPC] Import updated version of ppc disassembly code for xmon")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240509121248.270878-2-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a1216e62d039bf63a539bbe718536ec789a853dd ]
If a PCI device is removed during eeh_pe_report_edev(), edev->pdev
will change and can cause a crash, hold the PCI rescan/remove lock
while taking a copy of edev->pdev->bus.
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240617140240.580453-1-ganeshgr@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a986fa57fd81a1430e00b3c6cf8a325d6f894a63 ]
Al reported a possible use-after-free (UAF) in kvm_spapr_tce_attach_iommu_group().
It looks up `stt` from tablefd, but then continues to use it after doing
fdput() on the returned fd. After the fdput() the tablefd is free to be
closed by another thread. The close calls kvm_spapr_tce_release() and
then release_spapr_tce_table() (via call_rcu()) which frees `stt`.
Although there are calls to rcu_read_lock() in
kvm_spapr_tce_attach_iommu_group() they are not sufficient to prevent
the UAF, because `stt` is used outside the locked regions.
With an artifcial delay after the fdput() and a userspace program which
triggers the race, KASAN detects the UAF:
BUG: KASAN: slab-use-after-free in kvm_spapr_tce_attach_iommu_group+0x298/0x720 [kvm]
Read of size 4 at addr c000200027552c30 by task kvm-vfio/2505
CPU: 54 PID: 2505 Comm: kvm-vfio Not tainted 6.10.0-rc3-next-20240612-dirty #1
Hardware name: 8335-GTH POWER9 0x4e1202 opal:skiboot-v6.5.3-35-g1851b2a06 PowerNV
Call Trace:
dump_stack_lvl+0xb4/0x108 (unreliable)
print_report+0x2b4/0x6ec
kasan_report+0x118/0x2b0
__asan_load4+0xb8/0xd0
kvm_spapr_tce_attach_iommu_group+0x298/0x720 [kvm]
kvm_vfio_set_attr+0x524/0xac0 [kvm]
kvm_device_ioctl+0x144/0x240 [kvm]
sys_ioctl+0x62c/0x1810
system_call_exception+0x190/0x440
system_call_vectored_common+0x15c/0x2ec
...
Freed by task 0:
...
kfree+0xec/0x3e0
release_spapr_tce_table+0xd4/0x11c [kvm]
rcu_core+0x568/0x16a0
handle_softirqs+0x23c/0x920
do_softirq_own_stack+0x6c/0x90
do_softirq_own_stack+0x58/0x90
__irq_exit_rcu+0x218/0x2d0
irq_exit+0x30/0x80
arch_local_irq_restore+0x128/0x230
arch_local_irq_enable+0x1c/0x30
cpuidle_enter_state+0x134/0x5cc
cpuidle_enter+0x6c/0xb0
call_cpuidle+0x7c/0x100
do_idle+0x394/0x410
cpu_startup_entry+0x60/0x70
start_secondary+0x3fc/0x410
start_secondary_prolog+0x10/0x14
Fix it by delaying the fdput() until `stt` is no longer in use, which
is effectively the entire function. To keep the patch minimal add a call
to fdput() at each of the existing return paths. Future work can convert
the function to goto or __cleanup style cleanup.
With the fix in place the test case no longer triggers the UAF.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Closes: https://lore.kernel.org/all/20240610024437.GA1464458@ZenIV/
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240614122910.3499489-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8873aab8646194a4446117bb617cc71bddda2dee ]
All these commands end up peeking into the PACA using the user
originated cpu id as an index. Check the cpu id is valid in order
to prevent xmon to crash. Instead of printing an error, this follows
the same behavior as the "lp s #" command : ignore the buggy cpu id
parameter and fall back to the #-less version of the command.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/161531347060.252863.10490063933688958044.stgit@bahia.lan
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit be140f1732b523947425aaafbe2e37b41b622d96 ]
There is code that builds with calls to IO accessors even when
CONFIG_PCI=n, but the actual calls are guarded by runtime checks.
If not those calls would be faulting, because the page at virtual
address zero is (usually) not mapped into the kernel. As Arnd pointed
out, it is possible a large port value could cause the address to be
above mmap_min_addr which would then access userspace, which would be
a bug.
To avoid any such issues, set _IO_BASE to POISON_POINTER_DELTA. That
is a value chosen to point into unmapped space between the kernel and
userspace, so any access will always fault.
Note that on 32-bit POISON_POINTER_DELTA is 0, so the patch only has an
effect on 64-bit.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240503075619.394467-2-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit d3882564a77c21eb746ba5364f3fa89b88de3d61 upstream.
Using sys_io_pgetevents() as the entry point for compat mode tasks
works almost correctly, but misses the sign extension for the min_nr
and nr arguments.
This was addressed on parisc by switching to
compat_sys_io_pgetevents_time64() in commit 6431e92fc827 ("parisc:
io_pgetevents_time64() needs compat syscall in 32-bit compat mode"),
as well as by using more sophisticated system call wrappers on x86 and
s390. However, arm64, mips, powerpc, sparc and riscv still have the
same bug.
Change all of them over to use compat_sys_io_pgetevents_time64()
like parisc already does. This was clearly the intention when the
function was originally added, but it got hooked up incorrectly in
the tables.
Cc: stable@vger.kernel.org
Fixes: 48166e6ea47d ("y2038: add 64-bit time_t syscalls to all 32-bit architectures")
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 03c0f2c2b2220fc9cf8785cd7b61d3e71e24a366 ]
With -Wextra clang warns about pointer arithmetic using a null pointer.
When building with CONFIG_PCI=n, that triggers a warning in the IO
accessors, eg:
In file included from linux/arch/powerpc/include/asm/io.h:672:
linux/arch/powerpc/include/asm/io-defs.h:23:1: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
23 | DEF_PCI_AC_RET(inb, u8, (unsigned long port), (port), pio, port)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
...
linux/arch/powerpc/include/asm/io.h:591:53: note: expanded from macro '__do_inb'
591 | #define __do_inb(port) readb((PCI_IO_ADDR)_IO_BASE + port);
| ~~~~~~~~~~~~~~~~~~~~~ ^
That is because when CONFIG_PCI=n, _IO_BASE is defined as 0.
Although _IO_BASE is defined as plain 0, the cast (PCI_IO_ADDR) converts
it to void * before the addition with port happens.
Instead the addition can be done first, and then the cast. The resulting
value will be the same, but avoids the warning, and also avoids void
pointer arithmetic which is apparently non-standard.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Closes: https://lore.kernel.org/all/CA+G9fYtEh8zmq8k8wE-8RZwW-Qr927RLTn+KqGnq1F=ptaaNsA@mail.gmail.com
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240503075619.394467-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ff2e185cf73df480ec69675936c4ee75a445c3e4 ]
plpar_hcall(), plpar_hcall9(), and related functions expect callers to
provide valid result buffers of certain minimum size. Currently this
is communicated only through comments in the code and the compiler has
no idea.
For example, if I write a bug like this:
long retbuf[PLPAR_HCALL_BUFSIZE]; // should be PLPAR_HCALL9_BUFSIZE
plpar_hcall9(H_ALLOCATE_VAS_WINDOW, retbuf, ...);
This compiles with no diagnostics emitted, but likely results in stack
corruption at runtime when plpar_hcall9() stores results past the end
of the array. (To be clear this is a contrived example and I have not
found a real instance yet.)
To make this class of error less likely, we can use explicitly-sized
array parameters instead of pointers in the declarations for the hcall
APIs. When compiled with -Warray-bounds[1], the code above now
provokes a diagnostic like this:
error: array argument is too small;
is of size 32, callee requires at least 72 [-Werror,-Warray-bounds]
60 | plpar_hcall9(H_ALLOCATE_VAS_WINDOW, retbuf,
| ^ ~~~~~~
[1] Enabled for LLVM builds but not GCC for now. See commit
0da6e5fd6c37 ("gcc: disable '-Warray-bounds' for gcc-13 too") and
related changes.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240408-pseries-hvcall-retbuf-v1-1-ebc73d7253cf@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2d43cc701b96f910f50915ac4c2a0cae5deb734c upstream.
Building ppc64le_defconfig with GCC 14 fails with assembler errors:
CC fs/readdir.o
/tmp/ccdQn0mD.s: Assembler messages:
/tmp/ccdQn0mD.s:212: Error: operand out of domain (18 is not a multiple of 4)
/tmp/ccdQn0mD.s:226: Error: operand out of domain (18 is not a multiple of 4)
... [6 lines]
/tmp/ccdQn0mD.s:1699: Error: operand out of domain (18 is not a multiple of 4)
A snippet of the asm shows:
# ../fs/readdir.c:210: unsafe_copy_dirent_name(dirent->d_name, name, namlen, efault_end);
ld 9,0(29) # MEM[(u64 *)name_38(D) + _88 * 1], MEM[(u64 *)name_38(D) + _88 * 1]
# 210 "../fs/readdir.c" 1
1: std 9,18(8) # put_user # *__pus_addr_52, MEM[(u64 *)name_38(D) + _88 * 1]
The 'std' instruction requires a 4-byte aligned displacement because
it is a DS-form instruction, and as the assembler says, 18 is not a
multiple of 4.
A similar error is seen with GCC 13 and CONFIG_UBSAN_SIGNED_WRAP=y.
The fix is to change the constraint on the memory operand to put_user(),
from "m" which is a general memory reference to "YZ".
The "Z" constraint is documented in the GCC manual PowerPC machine
constraints, and specifies a "memory operand accessed with indexed or
indirect addressing". "Y" is not documented in the manual but specifies
a "memory operand for a DS-form instruction". Using both allows the
compiler to generate a DS-form "std" or X-form "stdx" as appropriate.
The change has to be conditional on CONFIG_PPC_KERNEL_PREFIXED because
the "Y" constraint does not guarantee 4-byte alignment when prefixed
instructions are enabled.
Unfortunately clang doesn't support the "Y" constraint so that has to be
behind an ifdef.
Although the build error is only seen with GCC 13/14, that appears
to just be luck. The constraint has been incorrect since it was first
added.
Fixes: c20beffeec3c ("powerpc/uaccess: Use flexible addressing with __put_user()/__get_user()")
Cc: stable@vger.kernel.org # v5.10+
Suggested-by: Kewen Lin <linkw@gcc.gnu.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240529123029.146953-1-mpe@ellerman.id.au
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 460b4f812a9d473d4b39d87d37844f9fc30a9eb3 ]
Also remove the confusing comment about checking if a fd exists. I
could not find one instance in the entire kernel that still matches
the description or the reason for the name fcheck.
The need for better names became apparent in the last round of
discussion of this set of changes[1].
[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201120231441.29911-10-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6d4341638516bf97b9a34947e0bd95035a8230a5 ]
Couple of Minor fixes:
- hcall return values are long. Fix that for h_get_mpp, h_get_ppp and
parse_ppp_data
- If hcall fails, values set should be at-least zero. It shouldn't be
uninitialized values. Fix that for h_get_mpp and h_get_ppp
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240412092047.455483-3-sshegde@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 01acaf3aa75e1641442cc23d8fe0a7bb4226efb1 ]
vmpic_msi_feature is only used conditionally, which triggers a rare
-Werror=unused-const-variable= warning with gcc:
arch/powerpc/sysdev/fsl_msi.c:567:37: error: 'vmpic_msi_feature' defined but not used [-Werror=unused-const-variable=]
567 | static const struct fsl_msi_feature vmpic_msi_feature =
Hide this one in the same #ifdef as the reference so we can turn on
the warning by default.
Fixes: 305bcf26128e ("powerpc/fsl-soc: use CONFIG_EPAPR_PARAVIRT for hcalls")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240403080702.3509288-2-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 35f20786c481d5ced9283ff42de5c69b65e5ed13 upstream.
arch/powerpc/lib/xor_vmx.o is built with '-msoft-float' (from the main
powerpc Makefile) and '-maltivec' (from its CFLAGS), which causes an
error when building with clang after a recent change in main:
error: option '-msoft-float' cannot be specified with '-maltivec'
make[6]: *** [scripts/Makefile.build:243: arch/powerpc/lib/xor_vmx.o] Error 1
Explicitly add '-mhard-float' before '-maltivec' in xor_vmx.o's CFLAGS
to override the previous inclusion of '-msoft-float' (as the last option
wins), which matches how other areas of the kernel use '-maltivec', such
as AMDGPU.
Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/1986
Link: 4792f912b2
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240127-ppc-xor_vmx-drop-msoft-float-v1-1-f24140e81376@kernel.org
[nathan: Fixed conflicts due to lack of 04e85bbf71c9 in older trees]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5f491356b7149564ab22323ccce79c8d595bfd0c ]
Binutils 2.38 complains about the use of mfpmr when building
ppc6xx_defconfig:
CC arch/powerpc/kernel/pmc.o
{standard input}: Assembler messages:
{standard input}:45: Error: unrecognized opcode: `mfpmr'
{standard input}:56: Error: unrecognized opcode: `mtpmr'
This is because by default the kernel is built with -mcpu=powerpc, and
the mt/mfpmr instructions are not defined.
It can be avoided by enabling CONFIG_E300C3_CPU, but just adding that to
the defconfig will leave open the possibility of randconfig failures.
So add machine directives around the mt/mfpmr instructions to tell
binutils how to assemble them.
Cc: stable@vger.kernel.org
Reported-by: Jan-Benedict Glaw <jbglaw@lug-owl.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240229122521.762431-3-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 20933531be0577cdd782216858c26150dbc7936f ]
Move the prototypes into mpc10x.h which is included by all the relevant
C files, fixes:
arch/powerpc/platforms/embedded6xx/ls_uart.c:59:6: error: no previous prototype for 'avr_uart_configure'
arch/powerpc/platforms/embedded6xx/ls_uart.c:82:6: error: no previous prototype for 'avr_uart_send'
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240305123410.3306253-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ad86d7ee43b22aa2ed60fb982ae94b285c1be671 ]
Running event hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/
in one of the system throws below error:
---Logs---
# perf list | grep hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles
hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=?/[Kernel PMU event]
# perf stat -v -e hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ sleep 2
Using CPUID 00800200
Control descriptor is not initialized
Warning:
hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ event is not supported by the kernel.
failed to read counter hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/
Performance counter stats for 'system wide':
<not supported> hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/
2.000700771 seconds time elapsed
The above error is because of the hcall failure as required
permission "Enable Performance Information Collection" is not set.
Based on current code, single_gpci_request function did not check the
error type incase hcall fails and by default returns EINVAL. But we can
have other reasons for hcall failures like H_AUTHORITY/H_PARAMETER with
detail_rc as GEN_BUF_TOO_SMALL, for which we need to act accordingly.
Fix this issue by adding new checks in the single_gpci_request and
h_gpci_event_init functions.
Result after fix patch changes:
# perf stat -e hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ sleep 2
Error:
No permission to enable hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ event.
Fixes: 220a0c609ad1 ("powerpc/perf: Add support for the hv gpci (get performance counter info) interface")
Reported-by: Akanksha J N <akanksha@linux.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240229122847.101162-1-kjain@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 27646b2e02b096a6936b3e3b6ba334ae20763eab ]
It can be easy to miss that the notifier mechanism invokes the callbacks
in an atomic context, so add some comments to that effect on the two
handlers we register here.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230829063457.54157-4-bgray@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3d2ffcdd2a982e8bbe65fa0f94fb21bf304c281e ]
POWER10 DD1 has an issue where it generates watchpoint exceptions when
it shouldn't. The conditions where this occur are:
- octword op
- ending address of DAWR range is less than starting address of op
- those addresses need to be in the same or in two consecutive 512B
blocks
- 'op address + 64B' generates an address that has a carry into bit
52 (crosses 2K boundary)
Handle such spurious exception by considering them as extraneous and
emulating/single-steeping instruction without generating an event.
[ravi: Fixed build warning reported by lkp@intel.com]
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201106045650.278987-1-ravi.bangoria@linux.ibm.com
Stable-dep-of: 27646b2e02b0 ("powerpc/watchpoints: Annotate atomic context in more places")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4a7aee96200ad281a5cc4cf5c7a2e2a49d2b97b0 ]
In kasan_init_region, when k_start is not page aligned, at the begin of
for loop, k_cur = k_start & PAGE_MASK is less than k_start, and then
`va = block + k_cur - k_start` is less than block, the addr va is invalid,
because the memory address space from va to block is not alloced by
memblock_alloc, which will not be reserved by memblock_reserve later, it
will be used by other places.
As a result, memory overwriting occurs.
for example:
int __init __weak kasan_init_region(void *start, size_t size)
{
[...]
/* if say block(dcd97000) k_start(feef7400) k_end(feeff3fe) */
block = memblock_alloc(k_end - k_start, PAGE_SIZE);
[...]
for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
/* at the begin of for loop
* block(dcd97000) va(dcd96c00) k_cur(feef7000) k_start(feef7400)
* va(dcd96c00) is less than block(dcd97000), va is invalid
*/
void *va = block + k_cur - k_start;
[...]
}
[...]
}
Therefore, page alignment is performed on k_start before
memblock_alloc() to ensure the validity of the VA address.
Fixes: 663c0c9496a6 ("powerpc/kasan: Fix shadow area set up for modules.")
Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/1705974359-43790-1-git-send-email-xiaojiangfeng@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8f9abaa6d7de0a70fc68acaedce290c1f96e2e59 ]
Some of the fp/vmx code in sstep.c assume a certain maximum size for the
instructions being emulated. The size of those operations however is
determined separately in analyse_instr().
Add a check to validate the assumption on the maximum size of the
operations, so as to prevent any unintended kernel stack corruption.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Build-tested-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231123071705.397625-1-naveen@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0d555b57ee660d8a871781c0eebf006e855e918d ]
The linux-next build of powerpc64 allnoconfig fails with:
arch/powerpc/mm/book3s64/pgtable.c:557:5: error: no previous prototype for 'pmd_move_must_withdraw'
557 | int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
| ^~~~~~~~~~~~~~~~~~~~~~
Caused by commit:
c6345dfa6e3e ("Makefile.extrawarn: turn on missing-prototypes globally")
Fix it by moving the function definition under
CONFIG_TRANSPARENT_HUGEPAGE like the prototype. The function is only
called when CONFIG_TRANSPARENT_HUGEPAGE=y.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
[mpe: Flesh out change log from linux-next patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231127132809.45c2b398@canb.auug.org.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d8c3f243d4db24675b653f0568bb65dae34e6455 ]
With NUMA=n and FA_DUMP=y or PRESERVE_FA_DUMP=y the build fails with:
arch/powerpc/kernel/fadump.c:1739:22: error: no previous prototype for ‘arch_reserved_kernel_pages’ [-Werror=missing-prototypes]
1739 | unsigned long __init arch_reserved_kernel_pages(void)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
The prototype for arch_reserved_kernel_pages() is in include/linux/mm.h,
but it's guarded by __HAVE_ARCH_RESERVED_KERNEL_PAGES. The powerpc
headers define __HAVE_ARCH_RESERVED_KERNEL_PAGES in asm/mmzone.h, which
is not included into the generic headers when NUMA=n.
Move the definition of __HAVE_ARCH_RESERVED_KERNEL_PAGES into asm/mmu.h
which is included regardless of NUMA=n.
Additionally the ifdef around __HAVE_ARCH_RESERVED_KERNEL_PAGES needs to
also check for CONFIG_PRESERVE_FA_DUMP.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231130114433.3053544-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f8d3555355653848082c351fa90775214fb8a4fa ]
With CONFIG_GENERIC_BUG=n the build fails with:
arch/powerpc/kernel/traps.c:1442:5: error: no previous prototype for ‘is_valid_bugaddr’ [-Werror=missing-prototypes]
1442 | int is_valid_bugaddr(unsigned long addr)
| ^~~~~~~~~~~~~~~~
The prototype is only defined, and the function is only needed, when
CONFIG_GENERIC_BUG=y, so move the implementation under that.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231130114433.3053544-2-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f46c8a75263f97bda13c739ba1c90aced0d3b071 ]
kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure. Ensure the allocation was successful
by checking the pointer validity.
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231204023223.2447523-1-chentao@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 37b2a6510a48ca361ced679f92682b7b7d7d0330 upstream.
Allocations whose size is related to the memslot size can be arbitrarily
large. Do not use kvzalloc/kvcalloc, as those are limited to "not crazy"
sizes that fit in 32 bits.
Cc: stable@vger.kernel.org
Fixes: 7661809d493b ("mm: don't allow oversized kvmalloc() calls")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Ofitserov <oficerovas@altlinux.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e123015c0ba859cf48aa7f89c5016cc6e98e018d ]
kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure.
Fixes: b9ef7b4b867f ("powerpc: Convert to using %pOFn instead of device_node.name")
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231126095739.1501990-1-chentao@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8649829a1dd25199bbf557b2621cedb4bf9b3050 ]
kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure.
Fixes: 2717a33d6074 ("powerpc/opal-irqchip: Use interrupt names if present")
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231127030755.1546750-1-chentao@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9a260f2dd827bbc82cc60eb4f4d8c22707d80742 ]
kasprintf() returns a pointer to dynamically allocated memory
which can be NULL upon failure.
Add a null pointer check, and release 'ent' to avoid memory leaks.
Fixes: bfd2f0d49aef ("powerpc/powernv: Get rid of old scom_controller abstraction")
Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231208085937.107210-1-chentao@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bd68ffce69f6cf8ddd3a3c32549d1d2275e49fc5 ]
dlpar_memory_remove_by_index() may access beyond the bounds of the
drmem lmb array when the LMB lookup fails to match an entry with the
given DRC index. When the search fails, the cursor is left pointing to
&drmem_info->lmbs[drmem_info->n_lmbs], which is one element past the
last valid entry in the array. The debug message at the end of the
function then dereferences this pointer:
pr_debug("Failed to hot-remove memory at %llx\n",
lmb->base_addr);
This was found by inspection and confirmed with KASAN:
pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 1234
==================================================================
BUG: KASAN: slab-out-of-bounds in dlpar_memory+0x298/0x1658
Read of size 8 at addr c000000364e97fd0 by task bash/949
dump_stack_lvl+0xa4/0xfc (unreliable)
print_report+0x214/0x63c
kasan_report+0x140/0x2e0
__asan_load8+0xa8/0xe0
dlpar_memory+0x298/0x1658
handle_dlpar_errorlog+0x130/0x1d0
dlpar_store+0x18c/0x3e0
kobj_attr_store+0x68/0xa0
sysfs_kf_write+0xc4/0x110
kernfs_fop_write_iter+0x26c/0x390
vfs_write+0x2d4/0x4e0
ksys_write+0xac/0x1a0
system_call_exception+0x268/0x530
system_call_vectored_common+0x15c/0x2ec
Allocated by task 1:
kasan_save_stack+0x48/0x80
kasan_set_track+0x34/0x50
kasan_save_alloc_info+0x34/0x50
__kasan_kmalloc+0xd0/0x120
__kmalloc+0x8c/0x320
kmalloc_array.constprop.0+0x48/0x5c
drmem_init+0x2a0/0x41c
do_one_initcall+0xe0/0x5c0
kernel_init_freeable+0x4ec/0x5a0
kernel_init+0x30/0x1e0
ret_from_kernel_user_thread+0x14/0x1c
The buggy address belongs to the object at c000000364e80000
which belongs to the cache kmalloc-128k of size 131072
The buggy address is located 0 bytes to the right of
allocated 98256-byte region [c000000364e80000, c000000364e97fd0)
==================================================================
pseries-hotplug-mem: Failed to hot-remove memory at 0
Log failed lookups with a separate message and dereference the
cursor only when it points to a valid entry.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: 51925fb3c5c9 ("powerpc/pseries: Implement memory hotplug remove in the kernel")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231114-pseries-memhp-fixes-v1-1-fb8f2bb7c557@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 20e9de85edae3a5866f29b6cce87c9ec66d62a1b ]
When attempting to remove by index a set of LMBs a lot of messages are
displayed on the console, even when everything goes fine:
pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000002d
Offlined Pages 4096
pseries-hotplug-mem: Memory at 2d0000000 was hot-removed
The 2 messages prefixed by "pseries-hotplug-mem" are not really
helpful for the end user, they should be debug outputs.
In case of error, because some of the LMB's pages couldn't be
offlined, the following is displayed on the console:
pseries-hotplug-mem: Attempting to hot-remove LMB, drc index 8000003e
pseries-hotplug-mem: Failed to hot-remove memory at 3e0000000
dlpar: Could not handle DLPAR request "memory remove index 0x8000003e"
Again, the 2 messages prefixed by "pseries-hotplug-mem" are useless,
and the generic DLPAR prefixed message should be enough.
These 2 first changes are mainly triggered by the changes introduced
in drmgr:
https://groups.google.com/g/powerpc-utils-devel/c/Y6ef4NB3EzM/m/9cu5JHRxAQAJ
Also, when adding a bunch of LMBs, a message is displayed in the console per LMB
like these ones:
pseries-hotplug-mem: Memory at 7e0000000 (drc index 8000007e) was hot-added
pseries-hotplug-mem: Memory at 7f0000000 (drc index 8000007f) was hot-added
pseries-hotplug-mem: Memory at 800000000 (drc index 80000080) was hot-added
pseries-hotplug-mem: Memory at 810000000 (drc index 80000081) was hot-added
When adding 1TB of memory and LMB size is 256MB, this leads to 4096
messages to be displayed on the console. These messages are not really
helpful for the end user, so moving them to the DEBUG level.
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
[mpe: Tweak change log wording]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201211145954.90143-1-ldufour@linux.ibm.com
Stable-dep-of: bd68ffce69f6 ("powerpc/pseries/memhp: Fix access beyond end of drmem array")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 4a74197b65e69c46fe6e53f7df2f4d6ce9ffe012 ]
Fix build errors when CURRITUCK=y and I2C is not builtin (=m or is
not set). Fixes these build errors:
powerpc-linux-ld: arch/powerpc/platforms/44x/ppc476.o: in function `avr_halt_system':
ppc476.c:(.text+0x58): undefined reference to `i2c_smbus_write_byte_data'
powerpc-linux-ld: arch/powerpc/platforms/44x/ppc476.o: in function `ppc47x_device_probe':
ppc476.c:(.init.text+0x18): undefined reference to `i2c_register_driver'
Fixes: 2a2c74b2efcb ("IBM Akebono: Add the Akebono platform")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: lore.kernel.org/r/202312010820.cmdwF5X9-lkp@intel.com
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231201055159.8371-1-rdunlap@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09ca497528dac12cbbceab8197011c875a96d053 ]
Last user of in_kernel_text() stopped using in with
commit 549e8152de80 ("powerpc: Make the 64-bit kernel as a
position-independent executable").
Generic function is_kernel_text() does the same.
So remote it.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/2a3a5b6f8cc0ef4e854d7b764f66aa8d2ee270d2.1624813698.git.christophe.leroy@csgroup.eu
Stable-dep-of: 1b1e38002648 ("powerpc: add crtsavres.o to always-y instead of extra-y")
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1b1e38002648819c04773647d5242990e2824264 ]
crtsavres.o is linked to modules. However, as explained in commit
d0e628cd817f ("kbuild: doc: clarify the difference between extra-y
and always-y"), 'make modules' does not build extra-y.
For example, the following command fails:
$ make ARCH=powerpc LLVM=1 KBUILD_MODPOST_WARN=1 mrproper ps3_defconfig modules
[snip]
LD [M] arch/powerpc/platforms/cell/spufs/spufs.ko
ld.lld: error: cannot open arch/powerpc/lib/crtsavres.o: No such file or directory
make[3]: *** [scripts/Makefile.modfinal:56: arch/powerpc/platforms/cell/spufs/spufs.ko] Error 1
make[2]: *** [Makefile:1844: modules] Error 2
make[1]: *** [/home/masahiro/workspace/linux-kbuild/Makefile:350: __build_one_by_one] Error 2
make: *** [Makefile:234: __sub-make] Error 2
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Fixes: baa25b571a16 ("powerpc/64: Do not link crtsavres.o in vmlinux")
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231120232332.4100288-1-masahiroy@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit b684c09f09e7a6af3794d4233ef785819e72db79 upstream.
ppc_save_regs() skips one stack frame while saving the CPU register states.
Instead of saving current R1, it pulls the previous stack frame pointer.
When vmcores caused by direct panic call (such as `echo c >
/proc/sysrq-trigger`), are debugged with gdb, gdb fails to show the
backtrace correctly. On further analysis, it was found that it was because
of mismatch between r1 and NIP.
GDB uses NIP to get current function symbol and uses corresponding debug
info of that function to unwind previous frames, but due to the
mismatching r1 and NIP, the unwinding does not work, and it fails to
unwind to the 2nd frame and hence does not show the backtrace.
GDB backtrace with vmcore of kernel without this patch:
---------
(gdb) bt
#0 0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>,
newregs=0xc000000004f8f8d8) at ./arch/powerpc/include/asm/kexec.h:69
#1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
#2 0x0000000000000063 in ?? ()
#3 0xc000000003579320 in ?? ()
---------
Further analysis revealed that the mismatch occurred because
"ppc_save_regs" was saving the previous stack's SP instead of the current
r1. This patch fixes this by storing current r1 in the saved pt_regs.
GDB backtrace with vmcore of patched kernel:
--------
(gdb) bt
#0 0xc0000000002a53e8 in crash_setup_regs (oldregs=0x0, newregs=0xc00000000670b8d8)
at ./arch/powerpc/include/asm/kexec.h:69
#1 __crash_kexec (regs=regs@entry=0x0) at kernel/kexec_core.c:974
#2 0xc000000000168918 in panic (fmt=fmt@entry=0xc000000001654a60 "sysrq triggered crash\n")
at kernel/panic.c:358
#3 0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
#4 0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false)
at drivers/tty/sysrq.c:602
#5 0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>,
count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
#6 0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>,
buf=<optimized out>, file=<optimized out>, pde=0xc00000000362cb40) at fs/proc/inode.c:340
#7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>,
ppos=<optimized out>) at fs/proc/inode.c:352
#8 0xc0000000005b3bbc in vfs_write (file=file@entry=0xc000000006aa6b00,
buf=buf@entry=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>,
count=count@entry=2, pos=pos@entry=0xc00000000670bda0) at fs/read_write.c:582
#9 0xc0000000005b4264 in ksys_write (fd=<optimized out>,
buf=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=2)
at fs/read_write.c:637
#10 0xc00000000002ea2c in system_call_exception (regs=0xc00000000670be80, r0=<optimized out>)
at arch/powerpc/kernel/syscall.c:171
#11 0xc00000000000c270 in system_call_vectored_common ()
at arch/powerpc/kernel/interrupt_64.S:192
--------
Nick adds:
So this now saves regs as though it was an interrupt taken in the
caller, at the instruction after the call to ppc_save_regs, whereas
previously the NIP was there, but R1 came from the caller's caller and
that mismatch is what causes gdb's dwarf unwinder to go haywire.
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Fixes: d16a58f8854b1 ("powerpc: Improve ppc_save_regs()")
Reivewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230615091047.90433-1-adityag@linux.ibm.com
Cc: stable@vger.kernel.org
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b3338aaa74d7d4ec5b6734dc298f0db94ec83d2 upstream.
Commit 41a506ef71eb ("powerpc/ftrace: Create a dummy stackframe to fix
stack unwind") added use of a new stack frame on ftrace entry to fix
stack unwind. However, the commit missed updating the offset used while
tearing down the ftrace stack when ftrace is disabled. Fix the same.
In addition, the commit missed saving the correct stack pointer in
pt_regs. Update the same.
Fixes: 41a506ef71eb ("powerpc/ftrace: Create a dummy stackframe to fix stack unwind")
Cc: stable@vger.kernel.org # v6.5+
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231130065947.2188860-1-naveen@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e1d824f9a283cbf90f25241b66d1f69adb3835b upstream.
During floating point and vector save to thread data f0/vs0 are
clobbered by the FPSCR/VSCR store routine. This has been obvserved to
lead to userspace register corruption and application data corruption
with io-uring.
Fix it by restoring f0/vs0 after FPSCR/VSCR store has completed for
all the FP, altivec, VMX register save paths.
Tested under QEMU in kvm mode, running on a Talos II workstation with
dual POWER9 DD2.2 CPUs.
Additional detail (mpe):
Typically save_fpu() is called from __giveup_fpu() which saves the FP
regs and also *turns off FP* in the tasks MSR, meaning the kernel will
reload the FP regs from the thread struct before letting the task use FP
again. So in that case save_fpu() is free to clobber f0 because the FP
regs no longer hold live values for the task.
There is another case though, which is the path via:
sys_clone()
...
copy_process()
dup_task_struct()
arch_dup_task_struct()
flush_all_to_thread()
save_all()
That path saves the FP regs but leaves them live. That's meant as an
optimisation for a process that's using FP/VSX and then calls fork(),
leaving the regs live means the parent process doesn't have to take a
fault after the fork to get its FP regs back. The optimisation was added
in commit 8792468da5e1 ("powerpc: Add the ability to save FPU without
giving it up").
That path does clobber f0, but f0 is volatile across function calls,
and typically programs reach copy_process() from userspace via a syscall
wrapper function. So in normal usage f0 being clobbered across a
syscall doesn't cause visible data corruption.
But there is now a new path, because io-uring can call copy_process()
via create_io_thread() from the signal handling path. That's OK if the
signal is handled as part of syscall return, but it's not OK if the
signal is handled due to some other interrupt.
That path is:
interrupt_return_srr_user()
interrupt_exit_user_prepare()
interrupt_exit_user_prepare_main()
do_notify_resume()
get_signal()
task_work_run()
create_worker_cb()
create_io_worker()
copy_process()
dup_task_struct()
arch_dup_task_struct()
flush_all_to_thread()
save_all()
if (tsk->thread.regs->msr & MSR_FP)
save_fpu()
# f0 is clobbered and potentially live in userspace
Note the above discussion applies equally to save_altivec().
Fixes: 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up")
Cc: stable@vger.kernel.org # v4.6+
Closes: https://lore.kernel.org/all/480932026.45576726.1699374859845.JavaMail.zimbra@raptorengineeringinc.com/
Closes: https://lore.kernel.org/linuxppc-dev/480221078.47953493.1700206777956.JavaMail.zimbra@raptorengineeringinc.com/
Tested-by: Timothy Pearson <tpearson@raptorengineering.com>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
[mpe: Reword change log to describe exact path of corruption & other minor tweaks]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/1921539696.48534988.1700407082933.JavaMail.zimbra@raptorengineeringinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea142e590aec55ba40c5affb4d49e68c713c63dc upstream.
When the PMU is disabled, MMCRA is not updated to disable BHRB and
instruction sampling. This can lead to those features remaining enabled,
which can slow down a real or emulated CPU.
Fixes: 1cade527f6e9 ("powerpc/perf: BHRB control to disable BHRB logic when not used")
Cc: stable@vger.kernel.org # v5.9+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231018153423.298373-1-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 95f1a128cd728a7257d78e868f1f5a145fc43736 ]
If the vcpu_associativity alloc memory successfully but the
pcpu_associativity fails to alloc memory, the vcpu_associativity
memory leaks.
Fixes: d62c8deeb6e6 ("powerpc/pseries: Provide vcpu dispatch statistics")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Reviewed-by: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/1671003983-10794-1-git-send-email-wangyufen@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 007240d59c11f87ac4f6cfc6a1d116630b6b634c ]
The macro __SPIN_LOCK_INITIALIZER() is implementation specific. Users
that desire to initialize a spinlock in a struct must use
__SPIN_LOCK_UNLOCKED().
Use __SPIN_LOCK_UNLOCKED() for the spinlock_t in imc_global_refc.
Fixes: 76d588dddc459 ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230309134831.Nz12nqsU@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit ff7a60ab1e065257a0e467c13b519f4debcd7fcf ]
Sparse reports a size mismatch in the endian swap. The Opal
implementation[1] passes the value as a __be64, and the receiving
variable out_qsize is a u64, so the use of be32_to_cpu() appears to be
an error.
[1]: https://github.com/open-power/skiboot/blob/80e2b1dc73/hw/xive.c#L3854
Fixes: 88ec6b93c8e7 ("powerpc/xive: add OPAL extensions for the XIVE native exploitation support")
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-2-bgray@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>