/*
 * De-multiplexing posted interrupts is on the performance path, the code
 * below is written to optimize the cache performance based on the following
 * considerations:
 * 1. Posted interrupt descriptor (PID) fits in a cache line that is frequently
 *    accessed by both CPU and IOMMU.
 * 2. During software processing of posted interrupts, the CPU needs to do
 *    natural width read and xchg for checking and clearing posted interrupt
 *    request (PIR), a 256 bit field within the PID.
 * 3. On the other side, the IOMMU does atomic swaps of the entire PID cache
 *    line when posting interrupts and setting control bits.
 * 4. The CPU can access the cache line a magnitude faster than the IOMMU.
 * 5. Each time the IOMMU does interrupt posting to the PIR will evict the PID
 *    cache line. The cache line states after each operation are as follows,
 *    assuming a 64-bit kernel:
 *    CPU             IOMMU                   PID Cache line state
 *    ---------------------------------------------------------------
 *...read64                                   exclusive
 *...lock xchg64                              modified
 *...                 post/atomic swap        invalid
 *...-------------------------------------------------------------
 *
 * To reduce L1 data cache miss, it is important to avoid contention with
 * IOMMU's interrupt posting/atomic swap. Therefore, a copy of PIR is used
 * when processing posted interrupts in software, e.g. to dispatch interrupt
 * handlers for posted MSIs, or to move interrupts from the PIR to the vIRR
 * in KVM.
 *
 * In addition, the code is trying to keep the cache line state consistent
 * as much as possible. e.g. when making a copy and clearing the PIR
 * (assuming non-zero PIR bits are present in the entire PIR), it does:
 *		read, read, read, read, xchg, xchg, xchg, xchg
 * instead of:
 *		read, xchg, read, xchg, read, xchg, read, xchg
 */
static __always_inline bool pi_harvest_pir(unsigned long *pir,
					   unsigned long *pir_vals)
{
	unsigned long pending = 0;
	int i;

	for (i = 0; i < NR_PIR_WORDS; i++) {
		pir_vals[i] = READ_ONCE(pir[i]);
		pending |= pir_vals[i];
	}

	if (!pending)
		return false;

	for (i = 0; i < NR_PIR_WORDS; i++) {
		if (!pir_vals[i])
			continue;

#ifdef CONFIG_X86_POSTED_MSI
/*
 * Not all external vectors are subject to interrupt remapping, e.g. IOMMU's
 * own interrupts. Here we do not distinguish them since those vector bits in
 * PIR will always be zero.
 */
static inline bool pi_pending_this_cpu(unsigned int vector)
{
	struct pi_desc *pid = this_cpu_ptr(&posted_msi_pi_desc);

	if (WARN_ON_ONCE(vector > NR_VECTORS || vector < FIRST_EXTERNAL_VECTOR))
		return false;