Optimize multiply() (renamed to MultiplyWithExtraX33()) to eliminate
several instructions that were present only to avoid introducing an
extra factor of x^33 into the multiplication. It's actually fine to
introduce the extra factor of x^33 as long as it's canceled out with an
extra factor of x^-33 in all the kCRC32CPowers[] entries.
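
The cancellation can be checked with a small model of polynomial arithmetic over GF(2) modulo the CRC-32C polynomial. This is only an illustrative sketch: it uses plain (non-reflected) bit order with bit i as the coefficient of x^i, and a slow shift-and-xor multiply, neither of which is claimed to match the real implementation. The helper names are hypothetical.

```python
# CRC-32C (Castagnoli) generator polynomial including the x^32 term.
# Assumption: bit i holds the coefficient of x^i (non-reflected order).
POLY = 0x11EDC6F41


def mul_x(a):
    """Multiply a by x modulo POLY."""
    a <<= 1
    if a >> 32:
        a ^= POLY
    return a


def div_x(a):
    """Multiply a by x^-1 modulo POLY (valid because POLY has constant term 1)."""
    if a & 1:
        a ^= POLY
    return a >> 1


def mulmod(a, b):
    """Carry-less product of a and b, reduced modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = mul_x(a)
        b >>= 1
    return result


def x_pow(n):
    """x^n modulo POLY; n may be negative."""
    step = mul_x if n >= 0 else div_x
    result = 1
    for _ in range(abs(n)):
        result = step(result)
    return result


def multiply_with_extra_x33(a, b):
    """Model of a multiply that leaves an extra factor of x^33 in its result."""
    return mulmod(mulmod(a, b), x_pow(33))


# Folding x^-33 into the stored constant cancels the extra x^33.
crc = 0xDEADBEEF
wanted = x_pow(95)                         # constant we actually want to apply
stored = mulmod(wanted, x_pow(-33))        # what the table would hold instead
assert multiply_with_extra_x33(crc, stored) == mulmod(crc, wanted)
```

Because the ring is commutative and x is invertible modulo the polynomial, the same identity holds for any input value and any table exponent.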
To make this work, the number of bits dropped by ComputeZeroConstant()
had to be increased from 2 to at least 3, since the exponent 2^(i + 3 +
kNumDroppedBits) - 33 must be >= 0 for all i, including i=0; otherwise
kCRC32CPowers[0] would need a negative power of x (with 2 dropped bits,
i=0 gives 2^5 - 33 = -1). Dropping more bits is fine, since it's more
efficient to handle length bits 2 and 3 with CRC32_u32() and CRC32_u64()
anyway. So, increase kNumDroppedBits to 4.
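
The exponent constraint is simple arithmetic and can be tabulated directly. The sketch below only evaluates the formula quoted above for i=0; the function names are hypothetical and nothing here is taken from the actual source file.

```python
EXTRA_POWER = 33  # the extra factor x^33 that each table entry must cancel


def table_exponent(i, num_dropped_bits):
    """Exponent of x for entry i: 2^(i + 3 + num_dropped_bits) - 33."""
    return 2 ** (i + 3 + num_dropped_bits) - EXTRA_POWER


def min_dropped_bits():
    """Smallest number of dropped bits that keeps entry 0 non-negative."""
    k = 0
    while table_exponent(0, k) < 0:
        k += 1
    return k


for k in (2, 3, 4):
    print(f"kNumDroppedBits={k}: exponent for i=0 is {table_exponent(0, k)}")
# kNumDroppedBits=2: exponent for i=0 is -1
# kNumDroppedBits=3: exponent for i=0 is 31
# kNumDroppedBits=4: exponent for i=0 is 95
```

Since the exponent grows with i, checking i=0 is enough: if entry 0 is non-negative, every later entry is too.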
Add a Python script that generates the updated kCRC32CPowers[]. It
isn't wired up to the build system, but rather is just added so that
kCRC32CPowers[] can be reproduced.
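
For illustration only, a generator along these lines could compute x^(2^(i + 3 + kNumDroppedBits) - 33) modulo the CRC-32C polynomial for each entry. This is not the script added by this change: the plain (non-reflected) bit order, the entry count, and all names below are assumptions, so the printed values are not expected to match the real kCRC32CPowers[].

```python
# Assumptions (not from the real script): non-reflected bit order with
# bit i = coefficient of x^i, and a hypothetical entry count.
POLY = 0x11EDC6F41      # CRC-32C polynomial including the x^32 term
EXTRA_POWER = 33
NUM_DROPPED_BITS = 4
NUM_ENTRIES = 8         # hypothetical; the real table size is not stated here


def mulmod(a, b):
    """Carry-less product of a and b, reduced modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> 32:
            a ^= POLY
        b >>= 1
    return result


def x_pow(n):
    """x^n modulo POLY for n >= 0, by square-and-multiply."""
    result, base = 1, 2  # 2 is the polynomial x
    while n:
        if n & 1:
            result = mulmod(result, base)
        base = mulmod(base, base)
        n >>= 1
    return result


def generate_powers(num_entries):
    """Entry i is x^(2^(i + 3 + NUM_DROPPED_BITS) - 33) modulo POLY."""
    return [
        x_pow(2 ** (i + 3 + NUM_DROPPED_BITS) - EXTRA_POWER)
        for i in range(num_entries)
    ]


for i, value in enumerate(generate_powers(NUM_ENTRIES)):
    print(f"powers[{i}] = 0x{value:08x}")
```

A useful self-check for any such generator is that multiplying entry i by x^33 gives back x^(2^(i + 3 + kNumDroppedBits)).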
Also add a test that exercises ExtendCrc32cByZeroes() with all the
length bits, thus covering every entry of kCRC32CPowers[].
Note that the kCRC32CPowers[] generation script and new test case are
things we should have had anyway, regardless of the x^33 optimization.
This change slightly improves the performance of Extend() for lengths
greater than or equal to 2048 bytes, and also the performance of
ExtendByZeroes(). It also slightly reduces the binary code size.
Before:
Benchmark                       Time       CPU  Iterations
BM_Calculate/2048            84.3 ns   84.3 ns     8307735
BM_Calculate/10000            376 ns    375 ns     1865976
BM_Calculate/500000         18538 ns  18531 ns       37813
BM_ExtendByZeroes/1          3.55 ns   3.55 ns   197111095
BM_ExtendByZeroes/10         3.90 ns   3.89 ns   179773877
BM_ExtendByZeroes/100        6.06 ns   6.06 ns   115242160
BM_ExtendByZeroes/1000       12.0 ns   12.0 ns    58078004
BM_ExtendByZeroes/10000      9.97 ns   9.97 ns    70335772
BM_ExtendByZeroes/100000     12.1 ns   12.1 ns    58157829
BM_ExtendByZeroes/1000000    14.4 ns   14.4 ns    48527365
After:
Benchmark                       Time       CPU  Iterations
BM_Calculate/2048            82.8 ns   82.7 ns     8478296
BM_Calculate/10000            375 ns    375 ns     1869663
BM_Calculate/500000         18547 ns  18538 ns       37846
BM_ExtendByZeroes/1          2.96 ns   2.96 ns   236772500
BM_ExtendByZeroes/10         3.85 ns   3.85 ns   182059238
BM_ExtendByZeroes/100        5.42 ns   5.42 ns   129077546
BM_ExtendByZeroes/1000       9.43 ns   9.42 ns    74232457
BM_ExtendByZeroes/10000      8.14 ns   8.14 ns    86244218
BM_ExtendByZeroes/100000     10.7 ns   10.7 ns    65467391
BM_ExtendByZeroes/1000000    11.0 ns   11.0 ns    63575936
PiperOrigin-RevId: 786828855
Change-Id: I6208625fd1c35c2c137e756cf5fadc1adccfdd5d