python - Why factorization of products of close primes is much slower than products of dissimilar primes - Stack Overflow

admin • 2025-04-23 14:11:33 • questions • 3 views

This is a purely academic question without any practical consideration. This is not homework, I dropped out of high school long ago. I am just curious, and I can't sleep well without knowing why.

I was messing around with Python. I decided to factorize big integers and measure the runtime of calls for each input.

I used a bunch of numbers and found that some numbers take much longer to factorize than others.

I then decided to investigate further, so I quickly wrote a prime sieve function to generate primes for testing. I found that a product of two moderately large primes (two four-digit primes) takes much longer to factorize than a product of one very large prime (six digits or more) and one small prime (three digits or fewer).

At first I thought my simple first function was just inefficient, which is indeed the case, so I wrote a second function that pulls primes directly from a pre-generated list of primes. The second function is indeed more efficient, but strangely it exhibits the same pattern.

Here are some numbers that I used:

13717421 == 3607 * 3803
13189903 == 3593 * 3671
56267023 == 7187 * 7829
65415743 == 8087 * 8089

12345679 == 37 * 333667
38760793 == 37 * 1047589
158202851 == 151 * 1047701
762312571 == 727 * 1048573

Code:

import numpy as np
from itertools import cycle

def factorize(n):
    factors = []
    while not n % 2:
        factors.append(2)
        n //= 2

    i = 3
    while i**2 <= n:
        while not n % i:
            factors.append(i)
            n //= i
        i += 2
    
    return factors if n == 1 else factors + [n]

TRIPLE = ((4, 2), (9, 6), (25, 10))
WHEEL = ( 4, 2, 4, 2, 4, 6, 2, 6 )
def prime_sieve(n):
    primes = np.ones(n + 1, dtype=bool)
    primes[:2] = False
    for square, double in TRIPLE:
        primes[square::double] = False
    
    wheel = cycle(WHEEL)
    k = 7
    while (square := k**2) <= n:
        if primes[k]:
            primes[square::2*k] = False
        
        k += next(wheel)
    
    return np.flatnonzero(primes)
    
PRIMES = list(map(int, prime_sieve(1048576)))
TEST_LIMIT = PRIMES[-1] ** 2

def factorize_sieve(n):
    if n > TEST_LIMIT:
        raise ValueError('Number too large')

    factors = []
    for p in PRIMES:
        if p**2 > n:
            break
        while not n % p:
            factors.append(p)
            n //= p
    
    return factors if n == 1 else factors + [n]

Test result:

In [2]: %timeit factorize(13717421)
279 μs ± 4.29 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [3]: %timeit factorize(12345679)
39.6 μs ± 749 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [4]: %timeit factorize_sieve(13717421)
64.1 μs ± 688 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [5]: %timeit factorize_sieve(12345679)
12.6 μs ± 146 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [6]: %timeit factorize_sieve(13189903)
64.6 μs ± 964 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [7]: %timeit factorize_sieve(56267023)
117 μs ± 3.88 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [8]: %timeit factorize_sieve(65415743)
130 μs ± 1.38 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [9]: %timeit factorize_sieve(38760793)
21.1 μs ± 232 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [10]: %timeit factorize_sieve(158202851)
21.4 μs ± 385 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [11]: %timeit factorize_sieve(762312571)
22.1 μs ± 409 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

As you can clearly see, factorizing a product of two medium-sized primes takes much longer on average than factorizing a product of one small prime and one very large prime. Why is this the case?

asked Jan 17 at 17:58, edited Jan 17 at 18:10, by Ξένη Γήινος

Taken to the extreme, would you not expect a number with 2 and some big prime to be discovered almost immediately? – JonSG Commented Jan 17 at 21:42
@JonSG I wouldn't. Would you? – no comment Commented Jan 17 at 22:41
@nocomment If I'm factoring a number that I know is a product of two primes, like 1223.........222, then yes: I know immediately, by inspection or by testing the first prime 2, how to discover the pair. If the number is odd with factors close to root n, then I have to go through comparatively many failures. It follows that the lower the smaller factor of the prime pair is, the faster we would expect to uncover the pair. Why else would we use huge relatively prime pairs in security rather than (3,5)? – JonSG Commented Jan 17 at 23:08
@JonSG But their factorize function doesn't know or assume that the input is a product of two primes. After immediately finding the factor 2, it continues trying to factorize the big prime. And that takes time. – no comment Commented Jan 17 at 23:41

2 Answers

I don't see how this question is related to programming.

Your algorithm performs trial division, starting with the smallest candidates, and terminates once the candidate exceeds the square root of the remaining number. The worst case is therefore the square of a prime, where the smallest factor is as large as it can possibly be. Encountering a small factor speeds up the process, because you immediately divide n by it, and the candidate divisor may then exceed the square root of the reduced n much sooner.
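To make this concrete, here is a minimal sketch that mirrors the loop of the question's `factorize` but counts how many odd candidates `i` get tried instead of collecting factors. The helper name `count_trial_divisors` is made up for this illustration and is not part of the original code.

```python
def count_trial_divisors(n):
    """Mirror the question's `factorize` loop, but return how many odd
    candidates `i` were tried instead of returning the factors."""
    # Strip factors of 2 first, exactly as the original does.
    while n % 2 == 0:
        n //= 2

    tried = 0
    i = 3
    while i * i <= n:
        tried += 1
        # Dividing out a found factor shrinks n, so the loop condition
        # i * i <= n can become false much earlier.
        while n % i == 0:
            n //= i
        i += 2
    return tried


if __name__ == "__main__":
    for n in (13717421, 12345679):
        print(n, count_trial_divisors(n))
```

For 13717421 the loop has to walk all the way up to 3607 before anything divides, while for 12345679 it finds 37 early and only needs to continue up to about the square root of 333667. The number of candidates tried should track the timings in the question reasonably well, since each candidate costs roughly one modulo operation.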

Your loop stops when i**2 exceeds n, with all known factors divided out of n. For a squarefree input, this happens once i is bigger than both

the second-largest factor of the original input, and
the square root of the largest factor of the original input.

This is the point when your code knows it's found all the factors. So for a product of two primes p*q with p<q, your code takes time proportional to max(p, sqrt(q)). This value is smaller for your "lopsided" tests.
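As a rough check of the max(p, sqrt(q)) claim, the sketch below computes that bound for the prime pairs listed in the question, along with how many primes lie at or below it (which is how many primes `factorize_sieve` would test before breaking). The names `stop_bound` and `primes_up_to` are hypothetical helpers written for this example, and the sieve is a plain pure-Python one rather than the question's NumPy wheel sieve.

```python
from math import isqrt


def stop_bound(p, q):
    """For n = p * q with distinct primes p and q, return the largest
    candidate that trial division still has to test:
    max(smaller prime, floor(sqrt(larger prime)))."""
    small, large = sorted((p, q))
    return max(small, isqrt(large))


def primes_up_to(limit):
    """Plain sieve of Eratosthenes; returns the list of primes <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[:2] = b"\x00\x00"  # 0 and 1 are not prime
    for k in range(2, isqrt(limit) + 1):
        if flags[k]:
            # Clear every multiple of k starting at k*k.
            flags[k * k :: k] = bytes(len(range(k * k, limit + 1, k)))
    return [i for i, is_prime in enumerate(flags) if is_prime]


# Prime pairs taken from the question.
PAIRS = [
    (3607, 3803), (3593, 3671), (7187, 7829), (8087, 8089),
    (37, 333667), (37, 1047589), (151, 1047701), (727, 1048573),
]

if __name__ == "__main__":
    for p, q in PAIRS:
        bound = stop_bound(p, q)
        print(f"{p * q:>10} = {p} * {q}: bound {bound}, "
              f"{len(primes_up_to(bound))} primes tested")
```

The "close" pairs have a bound equal to their smaller prime (thousands), whereas the "lopsided" pairs have a bound near the square root of the large prime (hundreds to about a thousand), which matches the ordering of the measured runtimes.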

Posted by: admin. When reposting, please credit the source: http://www.yc00.com/questions/1745350436a4623799.html
