> Date: Sat, 10 Jan 2026 23:33:30 +0100 > From: Manuel Bouyer <bouyer%antioche.eu.org@localhost> > > And this assembly work may have allowed me to find the problem. > Basically, the CLI and STI macros are not atomic on Xen PV, and if preemption > happens at a bad time we may end up updating the upcall_mask of our previous > CPU. The attached patch should close this race condition; hopefully it's > the last one. Nice! I fixed some of this a few years ago but I must have neglected to search further for other uses like the assembly stubs using EVTCHN_UPCALL_MASK and EVTCHN_UPCALL_PENDING: https://mail-index.netbsd.org/source-changes/2023/02/25/msg143510.html There are still a lot of suspicious cases that turn up when I grep -i for evtchn_upcall: - i386/spl.S -- might be fine because they're in the spllower and doreti paths, so maybe preemption is already blocked by virtue of having raised ipl, but needs another set of eyeballs to be sure: https://nxr.netbsd.org/xref/src/sys/arch/i386/i386/spl.S?r=1.58#218 https://nxr.netbsd.org/xref/src/sys/arch/i386/i386/spl.S?r=1.58#291 - i386 CLI/STI macros: https://nxr.netbsd.org/xref/src/sys/arch/i386/include/frameasm.h?r=1.35#12 (Maybe we can just nix XEN_BLOCK/UNBLOCK_EVENTS and XEN_TEST_PENDING and merge those into the CLI/STI macros?) - amd64 profile stubs: https://nxr.netbsd.org/xref/src/sys/arch/amd64/include/profile.h?r=1.21#82 (Not sure why there's no obvious i386 analogue of that? The i386 code has no XENPV conditionals.) - xen_init_amd64/i386_vcpuctxt -- guessing this runs before interrupts or preemption can happen, but unclear, needs more eyeballs: https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/cpu.c?r=1.145#850 https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/cpu.c?r=1.145#947 - do_hypervisor_callback -- does this run with ipl raised so preemption is not possible? If so, can we assert the ipl is raised? https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/hypervisor_machdep.c?r=1.46#301 - hypervisor_send_event -- this is #if 0 code, can we delete it and the one #if 0 caller in sys/arch/xen/xen/evtchn.c so we don't have to think about it? If not, I think we should either KASSERT(kpreempt_disabled()) or assert the ipl is raised. https://nxr.netbsd.org/xref/src/sys/arch/xen/x86/hypervisor_machdep.c?r=1.46#336 https://nxr.netbsd.org/xref/src/sys/arch/xen/xen/evtchn.c?r=1.100#383 - xen_debug_handler -- this runs with ipl raised, right? Maybe not worth asserting because this is a debug vector so dumping info is better than crashing. https://nxr.netbsd.org/xref/src/sys/arch/xen/xen/evtchn.c?r=1.100#1142 Also, I think we should either: (a) delete the PUSHF/POPF macros altogether, if they're unused, or (b) make sure they too block preemption in the critical path.