Programmers Dojo

Posts

Showing posts with the label list_of_primes

How big are primes?

So, If we were to calculate the first billion primes how much ram would we need? At the end the primes won't fit in a 32bit integer. If we store them all in 64bit integers we'd need 8GB. Even if they all fit in 32bit integers we'd still need 4GB of storage just to store the list. Seeing at my little machine has 4GB ram that isn't going to work either. How about just storing the intervals between primes? For the first primes (2, 3, 5, 7, 11, 13, 17) we would store (2, 1, 2, 2, 4, 2, 4) Using a byte for every interval less than 128 and a short for the rest lets us store the first billion primes in just about 1GB. Now that we can do! Maybe we can do even better though...

Segmented Sieve

Our next version works just like the previous sieve but works on a segment of the number line at a time. That might be why it is called a segmented sieve . Now we are sieving multiples of the primes from 1M numbers at a time, adding those primes to the list and then moving on to the next 1M numbers. I hit 140M primes in 60 seconds on my box... but now I run into a new problem. This version of the sieve quickly consumes all the memory in my box and slows to a crawl. To progress we will need to find a more efficient way of storing the primes we have found.

Basic Sieve

I've implemented a basic sieve to see if it does any better than trial division. While trial division found the first million primes in 8.28 seconds the sieve does the same task in 0.84 seconds. That's a little better! Running the sieve on the range [2..1,000,000,002) yields 50,847,534 primes in 78.14 seconds. I'm currently storing a byte for each number so that means I'm using 1GB ram. We can't just keep increasing the range because we will eventually run out of memory. Even storing a single bit for each number (we only need to know if it has been crossed our or not) would only divide this by a factor of 8. Perhaps we can work on a section of the number line at a time?

Sieve of Eratosthenes

Another way to generate a list of primes is through the Sieve of Eratosthenes . Essentially you create a long list of numbers starting at 2. 1. Set p equal to 2. This is your first prime. 2. Cross out every pth number because they are all divisible by p. 3. The first number after p which hasn't been crossed out is the new prime. 4. Repeat from setup 2. As an example lets take the number line from 2 to 10. list = 2, 3, 4, 5, 6, 7, 8, 9, 10 1. p = 2 2. list = 2 , 3, 4 , 5, 6 , 7, 8 , 9, 10 3. p = 3 4. goto 2 2. list = 2 , 3 , 4 , 5, 6 , 7, 8 , 9 , 10 3. p = 5 4. goto 2 2. list = 2 , 3 , 4 , 5, 6 , 7, 8 , 9 , 10 3. p = 7

Simple list of primes using trial division.

Using the list from an earlier post here is a simple prime generator. trial_division.c This will generate one million primes in around 8 seconds on my machine. Not bad but I think we can probably do better.

Our first list.

For our list of primes we need a list (naturally). The best way to understand data structures is to implement them in a low level language. Since I don't feel like inflicting ASM on everyone (although I do like assembler) we will use the next best thing. C Out list needs a data element (the found prime) and a pointer to the next element. In C that looks like this. struct prime_elem_s { uint64_t prime; struct prime_elem_s *next; }; typedef struct prime_elem_s prime_elem_t; We define a structure containing the current prime and a pointer to the next element. Then we define a type prime_elem_t to refer to that structure. In this case we want to add elements to the end of the list so we will also need a tail pointer. We will create a list header which will keep a head and tail pointer as well as the number of elements in the list. typedef struct { uint64_t length; prime_elem_t *head; prime_elem_t *tail; } pri...

A Prime Vacation

I like to do small programming exercises to keep my skills sharp. Unfortunately I sometimes get a little carried away with them. Problem: generate a list of primes numbers. Sounds simple right? Well lets start to look at this. First lets define what prime numbers are. A prime number is a number which is divisible by only itself and one. A simple way to test if a number if prime is by trial division . To test to see if a number is prime you check whether it is division by any prime less then the square root of that number. The general algorithm for finding the list of primes by trial division is therefore the following. list of primes = {2, 3} for each number begining at 4 check to see if it is divisible by any prime less then the square root of number if number is prime then append it to list of primes Well go through a sample implementation then see what we can do to improve it.