Wraparound multiplicaton vs mullo
Niels Möller
nisse at lysator.liu.se
Sat Oct 17 23:03:52 CEST 2009
David Harvey <dmharvey at cims.nyu.edu> writes:
>> I've tried the x^2 - 1 algorithm now, and benchmarked on an AMD
>> x86_32. A first implementation is attached.
>
> Cool!
Now I've done a first version of x^4 -1 too, see attachment.
> How about recursing into the same routine for the multiplication
> modulo 2^(n/2) - 1?
The current code is not organized that way, it evaluates the
polynomial at +1, -1 (for x^2-1) and +1, -1, i (for x^4-1), performs
full balanced multiplication for each point, then interpolates a
polynomial of the same degree as the inputs, evaluates that polynomial
at B^{n/2} or B^{n/4}, and in that evaluation, the highest coefficient
wraps around.
The linear work can surely be done a lot better, in this first
implementation I did it in straight-forward but maybe not so clever
way.
> I suppose you would need n divisible by 4, or would need to
> add some bit shifts.
In the current code, the x^2-1 algorithm requires that n is even, and
the x^4-1 algorithm requires that n is divisible by 4. That could of
course be relaxed at the cost ofextra shifting, assuming that
the number of bits is divisible by 2 and 4, respectively.
Benchmarking mul_n, mullo_n, and the x^2-1 and x^4-1 algorithms over
the same range of 2-500 limbs as before gives the following (still on
AMD x86_32):
2-10: mpn_mul_n wins.
12-22: mpn_mullo_n wins, with at best a 14% saving (all savings are
compared to mpn_mul_n). This is basecase mullo.
24-50: x^2-1 wins, with at best a 23% saving.
52-500: x^4-1 wins, with at best a 35% saving.
I'm not sure if I should be surprised that x^4-1 gets into the game so
early. For the range of 64-500, it saves 30% or more.
With cleverer evaluation and interpolation, speeding up the linear
work of x^4-1, I'd expect the range where x^2-1 wins will get pretty
narrow.
Regards,
/Niels
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mul_2nm1.c
Type: text/x-c-code
Size: 13486 bytes
Desc: not available
URL: <http://gmplib.org/list-archives/gmp-devel/attachments/20091017/5f9dc7da/attachment-0001.bin>
More information about the gmp-devel
mailing list