From fe993373146938341e3417258a96abeef7d7b5ff Mon Sep 17 00:00:00 2001
From: "Michael I. Bushnell" <mib@gnu.org>
Date: Fri, 27 May 1994 20:02:45 +0000
Subject: Formerly primes.c.~9~

---
 proc/primes.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 66 insertions(+), 1 deletion(-)

diff --git a/proc/primes.c b/proc/primes.c
index 1561cf21..a0be5acb 100644
--- a/proc/primes.c
+++ b/proc/primes.c
@@ -29,7 +29,8 @@ nextprime (int n)
   int *m;
   int i, j;
   
-  /* You are not expected to understand this. */
+  /* See the comment at the end for an explanation of the algorithm 
+       used.   */
 
   if (!q)
     {
@@ -86,4 +87,68 @@ nextprime (int n)
   return q[p];
 }
 
+/* [This code originally contained the comment "You are not expected
+to understand this" (on the theory that every Unix-like system should
+have such a comment somewhere, and now I have to find somewhere else
+to put it).  I then offered this function as a challenge to Jim
+Blandy (jimb@totoro.bio.indiana.edu).  At that time only the six
+comments in the function and the description at the top were present.
+Jim produced the following brilliant explanation.
 
+ -mib]
+
+
+The static variable q points to a sorted array of the first l natural
+prime numbers.  k is the number of elements which have been allocated
+to q, l <= k; we occasionally double k and realloc q accordingly to
+maintain this invariant.
+
+The table is initialized to contain a few primes (lines 26, 27,
+34-40).  Subsequent code assumes the table isn't empty.
+
+When passed a number n, we grow q until it contains a prime >= n
+(lines 45--70), do a binary search in q to find the least prime >= n
+(lines 72--84), and return that.
+
+We grow q using a "sieve of Eratosthenes" procedure.  Let p be the
+largest prime we've yet found, q[l-1].  We allocate a boolean array m
+of p^2 elements, and initialize all its elements to false.  (Upon
+completion, m[j] will be false iff j is a prime.)  For each number
+q[i] in q, we set all m[j] to true, where j is a multiple of q[i], and
+j is a valid index in m.  Once this is done, since every number j (p <
+j < p^2) is either prime, or has a prime factor not greater than p,
+m[j] will be false iff j is prime.  We scan m for false elements, and
+add their indices to q.
+
+As an optimization, we take advantage of the fact that 2 is the first
+prime.  But essentially, the code works as described above.
+
+Why is m's size chosen as it is?  Note that the sieve only guarantees
+to mark multiples of the numbers in q.  Given that q contains all the
+prime numbers from 2 to p, we can safely sieve for prime numbers
+between 2 and p^2, because any composite number in that range must
+have a prime factor not greater than its square root, and thus not
+greater than p; q already contains all such primes.  I suppose we
+could trust the sieve a bit farther ahead, but I'm not sure we could
+go very far at all.
+
+
+Possible bug: if there is no prime j such that p < j <= p^2, then the
+growth loop will add no primes to q, and thus never exit.  Is there
+any guarantee that there will be such a j?  I thought that people had
+found proofs that the average density of primes was logarithmic in
+their size, or something like that, but no guarantees.  Perhaps there
+is no bug, and this is the part I am not expected to understand; oh,
+well.
+
+
+[I don't know if this is a bug or not.  The proofs of the density of
+primes only deal with average long-term density, and there might be
+stretches where prims thin out to the point that this algorithm fails.
+However, it can only compute primes up to maxint (because ints are
+used as indexes into the array M in the seive step).  It's certainly
+the case that no such thinning out happens before 2^32, so we're safe.
+And since this function is only used to size process tables, I don't
+think it will ever get that far even.  -mib]
+
+*/
-- 
cgit v1.2.3