Nested virtualization, the ability to run a virtual machine inside another virtual machine, is increasingly important because of the need to deploy virtual machines running software stacks on top of virtualized cloud infrastructure. As ARM servers make inroads in cloud infrastructure deployments, supporting nested virtualization on ARM is a key requirement, which has been met recently with the introduction of nested virtualization support to the ARM architecture. We build the first hypervisor to use ARM nested virtualization support and show that despite similarities between ARM and x86 nested virtualization support, performance on ARM is much worse than on x86. This is due to excessive traps to the hypervisor caused by differences in non-nested virtualization support. To address this problem, we introduce a novel paravirtualization technique to rapidly prototype architectural changes for virtualization and evaluate their performance impact using existing hardware. Using this technique, we propose Nested Virtualization Extensions for ARM (NEVE), a set of simple architectural changes to ARM that can be used by software to coalesce and defer traps by logging the results of hypervisor instructions until the results are actually needed by the hypervisor or virtual machines. We show that NEVE allows hypervisors running real application workloads to provide an order of magnitude better performance than current ARM nested virtualization support and up to three times less overhead than x86 nested virtualization. NEVE will be included in ARMv8.4, the next version of the ARM architecture.