- **GMP-ECM**
(*https://www.mersenneforum.org/forumdisplay.php?f=55*)

- - **GMP 5.0.1 vs GMP 4.1.4 benchmarking**
(*https://www.mersenneforum.org/showthread.php?t=15471*)

I compared two GMP-ECM 6.3 builds under Linux, one compiled with GMP 5.0.1 and the other with GMP 4.1.4, and got several strange results. Overall GMP 5.0.1 is faster by 5-15%, but with B1=11e6 on some ranges (I tested 100-300 digits) 4.1.4 was faster. Some examples follow.
[CODE]
1. C121 from near-repdigits

GMP-ECM 6.3 [configured with GMP 4.1.4 and --enable-asm-redc] [ECM]
Input number is 1800485013924273616277080302416213714297702488568072032612888194660755338496630976045963259724803581322873645120627538429 (121 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=334640802
Step 1 took 36869ms
Step 2 took 19737ms

GMP-ECM 6.3 [configured with GMP 5.0.1 and --enable-asm-redc] [ECM]
Input number is 1800485013924273616277080302416213714297702488568072032612888194660755338496630976045963259724803581322873645120627538429 (121 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=2340904304
Step 1 took 35097ms
Step 2 took 33626ms

GMP 5.0.1 is significantly slower again on step 2.

2. C156 from aliquot seq 283752:i7004

GMP-ECM 6.3 [configured with GMP 4.1.4 and --enable-asm-redc] [ECM]
Input number is 150334450606011724019777200211010468220565590046299234402254345532711750018652367487259651931850319063498312781804011647293058067263942651704486104870980321 (156 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=4153245810
Step 1 took 55526ms
Step 2 took 26975ms

GMP-ECM 6.3 [configured with GMP 5.0.1 and --enable-asm-redc] [ECM]
Input number is 150334450606011724019777200211010468220565590046299234402254345532711750018652367487259651931850319063498312781804011647293058067263942651704486104870980321 (156 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=2955949299
Step 1 took 57614ms
Step 2 took 39257ms

Again step 2 with GMP 5.0.1 is much slower.

3. C209 from near-repdigits

GMP-ECM 6.3 [configured with GMP 4.1.4 and --enable-asm-redc] [ECM]
Input number is 99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999899999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999 (209 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=2560444052
Step 1 took 75055ms
Step 2 took 36402ms

GMP-ECM 6.3 [configured with GMP 5.0.1 and --enable-asm-redc] [ECM]
Input number is 99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999899999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999 (209 digits)
Using B1=11000000, B2=35133391030, polynomial Dickson(12), sigma=3908589128
Step 1 took 76103ms
Step 2 took 46634ms

Step 2 with GMP 5.0.1 is slower by 10sec.

With B1=3e6 all is OK - 5.0.1 is slightly better than 4.1.4

1. C121
Step 1 took 9562ms
Step 2 took 4803ms
vs.
Step 1 took 10009ms
Step 2 took 6219ms

2. C156
Step 1 took 15440ms
Step 2 took 6315ms
vs.
Step 1 took 15102ms
Step 2 took 8532ms

3. C209
Step 1 took 20846ms
Step 2 took 8188ms
vs.
Step 1 took 20306ms
Step 2 took 11598ms
[/CODE]
I repeated the tests 10 times and always got the same results. What's wrong?
Compile options: --enable-openmp --with-gmp=/usr/local/ --enable-shellcmd --enable-sse2 --enable-asm-redc
Test system: Xeon E5620 2.40GHz, CentOS 5.5 x86_64 on a 2.6.18 kernel
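To put a number on the step 2 gap in the B1=11e6 runs quoted above, the "Step N took Xms" lines can be parsed and compared. This is only a minimal sketch: the helper names (`step_times`, `slowdown_percent`, `compare`) are made up for illustration, and the timings are copied by hand from the post rather than read from real log files.

```python
import re

# Matches the "Step 1 took 36869ms" lines printed by GMP-ECM in the post above.
STEP_RE = re.compile(r"Step ([12]) took (\d+)ms")


def step_times(log):
    """Return {step_number: milliseconds} parsed from one run's output."""
    return {int(step): int(ms) for step, ms in STEP_RE.findall(log)}


def slowdown_percent(baseline_ms, candidate_ms):
    """Positive result: the candidate is slower than the baseline."""
    return 100.0 * (candidate_ms - baseline_ms) / baseline_ms


# (GMP 4.1.4 build, GMP 5.0.1 build) timings from the B1=11e6 examples.
RUNS = {
    "C121": ("Step 1 took 36869ms Step 2 took 19737ms",
             "Step 1 took 35097ms Step 2 took 33626ms"),
    "C156": ("Step 1 took 55526ms Step 2 took 26975ms",
             "Step 1 took 57614ms Step 2 took 39257ms"),
    "C209": ("Step 1 took 75055ms Step 2 took 36402ms",
             "Step 1 took 76103ms Step 2 took 46634ms"),
}


def compare(runs):
    """Return (name, step 1 change %, step 2 change %) for each number."""
    rows = []
    for name, (old_log, new_log) in runs.items():
        old, new = step_times(old_log), step_times(new_log)
        rows.append((name,
                     slowdown_percent(old[1], new[1]),
                     slowdown_percent(old[2], new[2])))
    return rows


if __name__ == "__main__":
    for name, s1, s2 in compare(RUNS):
        print(f"{name}: step 1 {s1:+.1f}%, step 2 {s2:+.1f}%")
```

Each run used a different sigma, so the percentages only describe these particular single runs.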

That's exactly what I figured out some time ago. Especially on step 2, GMP 4.x is a lot faster, and I have no idea why.
The fastest combination for my Phenom 2 1090T is GMP 4.3.2 combined with GMP-ECM 6.3, all compiled with -march=barcelona and, of course, linked statically. For large numbers (more than about 400 digits), linking against gwnum gave a huge speedup. Table attached: all times in ms, measured on the Phenom 2 at 3.6GHz, Linux kernel 2.6.35, 64-bit.

[QUOTE=Syd;257229]For large numbers > ~ 400 digits linking against gwnum gave a huge speedup.
[/QUOTE]
I think that applies only to numbers of the form 2^n-1 and 2^n+1. I decided to recompile the binaries from scratch, and some questions came up again. Why is ecm-params.h.athlon64 used instead of ecm-params.h.core2? Why were SSE2 instructions not used in the NTT code?
[CODE]
config.status: linking ecm-params.h.athlon64 to ecm-params.h
config.status: linking mul_fft-params.h.athlon64 to mul_fft-params.h
config.status: executing depfiles commands
config.status: executing libtool commands
configure: Configuration:
configure: Build for host type x86_64-unknown-linux-gnu
configure: CC=gcc -std=gnu99, CFLAGS=-W -Wall -Wundef -O2 -pedantic -m64 -mtune=core2 -march=core2
configure: Linking GMP with /usr/local//lib/libgmp.a
configure: Using asm redc code from directory x86_64
configure: Not using SSE2 instructions in NTT code
[/CODE]

[QUOTE=unconnected;257275]
Why SSE2 instructions were not used in NTT code? [/QUOTE]
The developers have had a little trouble detecting SSE2 across a wide enough range of platforms. Is this with the latest SVN? IIRC it has fixes for a problem somewhat like yours.

How are you compiling GMP 4.3.2 for 64-bit?
I get this error:
[QUOTE]configure: error: Oops, mp_limb_t is 32 bits, but the assembler code in this configuration expects 64 bits. You appear to have set $CFLAGS, perhaps you also need to tell GMP the intended ABI, see "ABI and ISA" in the manual.[/QUOTE]
I compile in Mingw64 with:
./configure CC=gcc CFLAGS="-O2 -pedantic -m64 -std=gnu99 -mtune=core2 -march=core2" ABI=64 --build=x86_64-w64-mingw32
I also tried just:
./configure ABI=64
and variations. I read on the GMP website: "Gcc 4.3.2 miscompiles GMP on 64-bit machines", but I'm using gcc 4.6.0.

Here is my 32-bit test of GMP 4.3.2 vs 5.0.1 and MPIR: [URL="http://www.hoegge.dk/mersenne/gmp4test.html"]gmp4test.html[/URL]
I can't see the effect you describe. On a Core 2 the GMP 4.3.2 binary is a lot slower than both GMP 5.0.1 and MPIR 2.3.0/2.2.1. On a Pentium 4 it's only slightly slower than GMP 5.0.1 and faster than MPIR. If you have a link to GMP 4.1.4, I'm willing to test it.
