mirror of
https://github.com/abseil/abseil-cpp.git
synced 2026-06-04 12:07:05 +08:00
We have AVX-encoded vector PCLMULQDQ on Milan, so use it to make crc32c
computations ~10% faster on larger inputs. In the benchmarks below the gain
shows up from BM_Calculate/2048 and BM_Extend/2048 upward, while
BM_Calculate/100 and BM_Extend/100 regress by about 6%.

We need to use inline asm, since building this twice with different compiler
flags for dynamic dispatch performed worse due to missing inlining.

name                         old time/op    new time/op    delta
BM_Calculate/0               1.136n ± 0%    1.136n ± 1%    ~        (p=0.968 n=6)
BM_Calculate/1               1.420n ± 0%    1.421n ± 1%    ~        (p=0.870 n=6)
BM_Calculate/100             9.089n ± 0%    9.660n ± 1%    +6.29%   (p=0.002 n=6)
BM_Calculate/2048            75.30n ± 1%    67.67n ± 1%    -10.13%  (p=0.002 n=6)
BM_Calculate/10000           313.1n ± 0%    286.1n ± 0%    -8.63%   (p=0.002 n=6)
BM_Calculate/500000          14.91µ ± 4%    13.49µ ± 1%    -9.48%   (p=0.002 n=6)
BM_Extend/0                  1.136n ± 1%    1.136n ± 1%    ~        (p=0.636 n=6)
BM_Extend/1                  1.420n ± 0%    1.420n ± 1%    ~        (p=0.636 n=6)
BM_Extend/100                9.247n ± 2%    9.800n ± 2%    +5.99%   (p=0.002 n=6)
BM_Extend/2048               75.73n ± 1%    67.37n ± 1%    -11.04%  (p=0.002 n=6)
BM_Extend/10000              313.2n ± 1%    286.2n ± 0%    -8.62%   (p=0.002 n=6)
BM_Extend/500000             14.87µ ± 1%    13.57µ ± 1%    -8.74%   (p=0.002 n=6)
BM_Extend/100000000          3.185m ± 2%    2.816m ± 3%    -11.60%  (p=0.002 n=6)
BM_ExtendCacheMiss/10        26.07m ± 1%    26.06m ± 1%    ~        (p=1.000 n=6)
BM_ExtendCacheMiss/100       13.86m ± 4%    14.36m ± 2%    +3.61%   (p=0.026 n=6)
BM_ExtendCacheMiss/1000      27.02m ± 4%    27.28m ± 4%    ~        (p=0.699 n=6)
BM_ExtendCacheMiss/100000    5.114m ± 5%    4.600m ± 8%    -10.07%  (p=0.002 n=6)
BM_ExtendByZeroes/1          1.420n ± 0%    1.420n ± 0%    ~        (p=0.670 n=12)
BM_ExtendByZeroes/10         1.704n ± 1%    1.704n ± 0%    ~        (p=1.000 n=6)
BM_ExtendByZeroes/100        3.128n ± 0%    3.128n ± 0%    ~        (p=1.000 n=6)
BM_ExtendByZeroes/1000       6.758n ± 0%    6.638n ± 1%    -1.78%   (p=0.002 n=6)
BM_ExtendByZeroes/10000      6.619n ± 1%    6.503n ± 0%    -1.75%   (p=0.002 n=6)
BM_ExtendByZeroes/100000     8.537n ± 1%    8.479n ± 0%    -0.67%   (p=0.019 n=6)
BM_ExtendByZeroes/1000000    9.766n ± 1%    9.692n ± 1%    -0.75%   (p=0.002 n=6)

PiperOrigin-RevId: 900897540
Change-Id: I57d8df2bf10690afc07009d61f8c4ea61e88ce50
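
Illustrative note (not part of the change itself): the inline-asm approach described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in this commit. It uses the 128-bit VEX-encoded form of the carry-less multiply for simplicity, and the names U128, ClmulPortable, ClmulVex, HaveVexClmul and Clmul are hypothetical. The point it shows is that an inline asm statement can emit an AVX-encoded instruction from a function compiled with baseline flags, so the helper can still be inlined into its caller, with a runtime CPU check choosing the path.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#include <emmintrin.h>  // SSE2 intrinsics, baseline on x86-64
#define SKETCH_HAVE_X86_ASM 1
#endif

// Hypothetical 128-bit result type for a 64x64 carry-less product.
struct U128 {
  uint64_t lo;
  uint64_t hi;
};

// Portable bit-by-bit carry-less multiply, used as reference and fallback.
inline U128 ClmulPortable(uint64_t a, uint64_t b) {
  U128 r{0, 0};
  for (int i = 0; i < 64; ++i) {
    if ((b >> i) & 1) {
      r.lo ^= a << i;
      if (i != 0) r.hi ^= a >> (64 - i);
    }
  }
  return r;
}

#ifdef SKETCH_HAVE_X86_ASM
// Emits the VEX-encoded instruction directly. No -mavx or -mpclmul flag is
// needed for this translation unit, so the function stays inlinable into
// callers built with baseline flags. Callers must check CPU support first.
inline U128 ClmulVex(uint64_t a, uint64_t b) {
  __m128i va = _mm_cvtsi64_si128(static_cast<long long>(a));
  __m128i vb = _mm_cvtsi64_si128(static_cast<long long>(b));
  __m128i vr;
  // AT&T operand order: immediate, source 2, source 1, destination.
  // Immediate 0x00 selects the low 64 bits of both sources.
  asm("vpclmulqdq $0x00, %2, %1, %0" : "=x"(vr) : "x"(va), "x"(vb));
  uint64_t lo = static_cast<uint64_t>(_mm_cvtsi128_si64(vr));
  uint64_t hi =
      static_cast<uint64_t>(_mm_cvtsi128_si64(_mm_unpackhi_epi64(vr, vr)));
  return U128{lo, hi};
}
#endif

// Runtime check (assumption: GCC or Clang CPU-feature builtins are available).
inline bool HaveVexClmul() {
#ifdef SKETCH_HAVE_X86_ASM
  static const bool ok = [] {
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") && __builtin_cpu_supports("pclmul");
  }();
  return ok;
#else
  return false;
#endif
}

// Dispatches at run time; both paths are visible to the inliner.
inline U128 Clmul(uint64_t a, uint64_t b) {
#ifdef SKETCH_HAVE_X86_ASM
  if (HaveVexClmul()) return ClmulVex(a, b);
#endif
  return ClmulPortable(a, b);
}
```

With this shape, Clmul(3, 3) yields lo == 5 and hi == 0 on either path, since (x + 1)^2 = x^2 + 1 over GF(2). The alternative the message rejects, compiling the same source twice with different flags, would put the fast path behind a function boundary that the compiler cannot inline across.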