Sunday, August 18, 2013

Prime Factors

Problem description

It is classical unsolved problem to quickly prime factor a really big number. There are quite few methods for. I am here discussing implementation of some known methods.

Prime factoring number smaller than C++ primitive data type max limit

This is very generic way to get all prime factors. This is very slow and can not be used for factoring big numbers.
typedef unsigned long long LONG;
const LONG sz = "count"
LONG primes[] = {fill it with primes below sqrt(numeric_limits<LONG>::max())};
void prime_factors(LONG N, vector<Factor>& R) {
        R.clear();
        if(N <= 1) {
                R.push_back(make_pair(N, 1));
                return;
        };
        const LONG *up = lower_bound(primes, primes+sz, N);
        const LONG *down = primes;
        if(*up == N) {
                // last factor A^1
                R.push_back(make_pair(N, 1));
                return;
        };
        up = lower_bound(down, up, ceil(sqrt(N)));
        LONG A = 0, B = 0;
        while (N > 1 && up >= down) {
                A = *down;
                if(N%A == 0) {
                        B = 0;
                        do {// Extract Powers
                                B++;
                                N /= A;
                        } while(N%A == 0 && N > 1); 
                        // factor A^B
                        R.push_back(make_pair(A, B));
                        up = lower_bound(down, up, N);
                        if(*up == N) {
                                // last factor A^1
                                R.push_back(make_pair(N, 1));
                                return;
                        };
                        up = lower_bound(down, up, ceil(sqrt(N)));
                } else ++down;
        }
        if(N > 1) {
                // last factor A^1
                R.push_back(make_pair(N, 1));
        };
        return;
}

Prime factoring with Fermat theorem

Here we are going to factor N = P*Q where P and Q are big primes numbers. Though you can apply it for any number but you got to loop this until you get all prime factors.
We know -
N = V2 - I2 -- E(1)
V2 = N + I2 -- E(2)
Now we got to brute force I such that E(2) holds true, there we will bet our factors -
P = V+I
Q = V-I
// build: g++ factor.cpp -lgmpxx -lgmp
#include <gmpxx.h>
typedef mpz_class Number;
typedef pair<Number, Number> Factor;
void fermat_factoring(const Number& N, Factor& f) {
        Number V, I, S;
        I = 0;
        S = 0;
        V = N;
        while(mpz_perfect_square_p(V.get_mpz_t()) == 0) {
                S += 1 + 2*I; 
                I += 1;
                V = N + S;
        }
        V = sqrt(V);
        f = make_pair(V-I, V+I);
        return;
}
With above implementation we can get factors of N but if N is more than 10 digits, We can not deduce factors in reasonable time limit.

Prime factoring with Euler's theorem

Here again we are going to factor N = P*Q where P and Q are big primes numbers. But this time we will use another.
We know -
V2 = (I2)%N --- E(1)
then one factor P = gcd(V+I, N)
other factor Q = N/P
Now we got to brute force I such that E(1) holds true. To brute force I you can very well replace my implementation with some kind of number field sieve.
// build: g++ factor.cpp -lgmpxx -lgmp
#include <gmpxx.h>
typedef mpz_class Number;
typedef pair<Number, Number> Factor;
void Eular_factoring(const Number& N, Factor& f) {
        Number V=N, I=0, P=0;
        if(mpz_perfect_square_p(V.get_mpz_t()) == true) { 
                V = sqrt(V);
                f = make_pair(V, V);
                return;
        }
        P = sqrt(V)+1;
        I = P*P;
        V = I%N;
        while(mpz_perfect_square_p(V.get_mpz_t()) == false) {
                // You can device number field seive here
                // NFS will make the search more faster
                // To keep it simple I am using next perfect square.
                //next perfect square I + 2*P + 1
                I = I + 2*P +1;
                P = sqrt(I);
                V = I % N;
        }
        V = sqrt(V)+P;
        mpz_gcd(I.get_mpz_t(), V.get_mpz_t(), N.get_mpz_t());
        f.first = I;
        f.second = N/I;
        return;
}
With above implementation we can get factors of N but if N is more than 10 digits, We can not deduce factors in reasonable time limit.

Prime factoring with Number field sieve

Yet to implement some thing good.

Test Data

Though this problem looks very common but you can learn, earn fame and help technology while solving this problem. There is RSA factoring challenge. It is good if we take our test data directly form here - RSA factoring challenge

Decision Table

  • Both factors are prime
  • N is not perfect square
  • (N+|-1) is divisible by 6
  • Both factors are of the form 6i+|-1
  • One factor is below sqrt(N) other is above it.
  • Both are not near sqrt(N)
  • Both will be having equal or greater digit count than half of the digit count of N
  • P and Q won't be prime mersenne i.e of the form 2p-1
  • Primes less than or equal to x pi(x) ~ x/(log x -a) a be any contant
  • The nth prime is about n log n
  • Kuttaka Algorithm (Bhaskara Procedure)
      N = ax + r1
      N = by +r2
      ax - by = r1 -r2 = c
      Algorithm says "Any factor common to a and b should be a factor of c"
     Now to solve this lets reduce it to the form ax + by = gcd(a, b). 

    This is particularly useful when a and b are coprime, since x is the modular multiplicative inverse of a     modulo b, and y is the modular multiplicative inverse of b modulo a. This has value in a key                   calculation of the RSA public-key encryption algorithm; finding an integer decryption exponent d         that is the modular multiplicative inverse of the chosen encryption exponent e modulo φ(n), where e       and φ(n) are coprime.
       a^φ(n) = N*X +1
       This infers that figure out multiple of N in which if we add 1, should become perfect power. This may result in figuring out φ(n).
           P*Q = N  ------------ (1)
          (P-1)(Q-1) = φ(n)  
           P+Q = N+1-φ(n) = 2M
          (P+Q)/2 =  M  --- (2)
          from 1 and 2
          P, Q = M ±√M2-N
          

Factored so far without any chache or sieve

Factored N = P*Q where P and Q both are prime. I am using only one simple and common M/C. The result shown here are computed in just no time.
  • N with fermat theorem, implementation posted above
    1. F-2 Took 670 clicks and 1.093s.
    2.         109043201203  == 999049 * 109147
  • N with fermat theorem, variant of implementation posted above, will post this version soon
    1. F-(?) Took 31 clicks and 0.031s.
    2.         109043201203  == 999049 * 109147
    3. F-(?) Took 6614 clicks and 6.614s.
    4.         3352081285229839  == 3367900313 * 995303
    5. F-(?) Took 1452509 clicks and 1452.51s.
    6.         199231082435617938841 == 33679003133 * 5915587277
  • N with new algorithm, will post this version soon. Running 7 process on single simple M/C and broke -
    1. F-(?) Took 2.35758minutes.
    2.         199231082435617938841 == 33679003133 * 5915587277
  • With clean implementation, required data boundaries are pre-computed and written into file so that we can enable distributed parallel computing.
    1. F-(?) Says, 150 computers required for 24 hrs to break below 31 digit number. This is making me crazy
    2.         635962022049618629767748013211 == P*Q

No comments:

Post a Comment