abseil-cpp/absl/crc
J. Neuschäfer 55a99fb37a PR #1944: Use same element-width for non-temporal loads and stores on Arm
Imported from GitHub PR https://github.com/abseil/abseil-cpp/pull/1944

Make the Arm implementations of _mm_loadu_si128 and _mm_stream_si128 consistent by using vector loads/stores of 64-bit elements in both. This should have no impact on existing users. On aarch64 (release build, GCC 15.2), crc_non_temporal_memcpy.cc.o stays effectively the same; the only change is the operand order of one cmp instruction:

```
--- crc_non_temporal_memcpy.cc.o (original)
+++ crc_non_temporal_memcpy.cc.o (patched)
├── objdump --line-numbers --disassemble --demangle --reloc --no-show-raw-insn --section=.text {}
│ @@ -255,15 +255,15 @@
│       add     x2, x21, x2
│       mov     x0, x21
│       ldp     q31, q30, [x0, #32]
│       add     x1, x1, #0x40
│       ldp     q29, q28, [x0], #64
│       stp     q31, q30, [x1, #-32]
│       stp     q29, q28, [x1, #-64]
│ -     cmp     x0, x2
│ +     cmp     x2, x0
│       b.ne    3b0 <absl::crc_internal::CrcNonTemporalMemcpyEngine::Compute(void*, void const*, unsigned long, absl::crc32c_t) const+0x270>  // b.any
│       and     x0, x3, #0xffffffffffffffc0
│       and     x23, x23, #0x3f
│       dmb     ish
│       add     x22, x22, x0
│       add     x21, x21, x0
│       b       380 <absl::crc_internal::CrcNonTemporalMemcpyEngine::Compute(void*, void const*, unsigned long, absl::crc32c_t) const+0x240>
```

On big-endian Arm (aarch64_be), this fixes a bug in non_temporal_store_memcpy in which the two 32-bit halves of each 64-bit chunk of memory were swapped. For example, the byte sequence 218edf0b 13c68753 would be copied as 13c68753 218edf0b.
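
The swap can be reproduced with a small portable model. This is only a sketch, not the Abseil code: it assumes that a big-endian element-wise vector load or store reverses the bytes within each element, and that the old code loaded with 32-bit elements and stored with 64-bit elements (the reverse pairing gives the same result in this model). The names Bytes16, LoadBigEndianElements, StoreBigEndianElements and SimulatedCopy are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// 16 bytes, used both for memory and for a modelled 128-bit register.
// In the register view, index k holds bits [8k+7:8k].
using Bytes16 = std::array<std::uint8_t, 16>;

// Models a big-endian element-wise vector load with W-byte elements:
// the bytes of each element are reversed on their way into the register.
template <std::size_t W>
Bytes16 LoadBigEndianElements(const Bytes16& mem) {
  static_assert(16 % W == 0, "element width must divide 16");
  Bytes16 reg{};
  for (std::size_t i = 0; i < 16 / W; ++i) {
    for (std::size_t b = 0; b < W; ++b) {
      reg[i * W + (W - 1 - b)] = mem[i * W + b];
    }
  }
  return reg;
}

// Models the matching store: each W-byte lane is written back to memory
// with its bytes reversed again.
template <std::size_t W>
Bytes16 StoreBigEndianElements(const Bytes16& reg) {
  static_assert(16 % W == 0, "element width must divide 16");
  Bytes16 mem{};
  for (std::size_t i = 0; i < 16 / W; ++i) {
    for (std::size_t b = 0; b < W; ++b) {
      mem[i * W + b] = reg[i * W + (W - 1 - b)];
    }
  }
  return mem;
}

// Copies 16 bytes through the modelled register using the given
// load and store element widths (in bytes).
template <std::size_t LoadW, std::size_t StoreW>
Bytes16 SimulatedCopy(const Bytes16& src) {
  return StoreBigEndianElements<StoreW>(LoadBigEndianElements<LoadW>(src));
}
```

In this model, SimulatedCopy<4, 8> turns a source that begins 21 8e df 0b 13 c6 87 53 into 13 c6 87 53 21 8e df 0b, while SimulatedCopy<8, 8> (matching widths, as after this change) returns the bytes unchanged.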

Merge 8f08d4c792 into e5c6ccbc96

Merging this change closes #1944

COPYBARA_INTEGRATE_REVIEW=https://github.com/abseil/abseil-cpp/pull/1944 from neuschaefer:nontemp 8f08d4c792
PiperOrigin-RevId: 819779377
Change-Id: I46c8c5540fb4786948c5f16d25630fbbab892602
2025-10-15 09:03:00 -07:00