Thanks @Ksharp for chiming in. Yes, given that it's one of the oldest algorithms (Eratosthenes, 3rd century B.C.), it's stunningly fast.
I think your current implementation is somewhat below its best, though: The idea of setting every x{i}-th array element to missing could be implemented using a DO ... BY statement rather than calling the MOD function to identify the elements to be canceled. This change reduces the run time for the primes up to 1 million from about 2 to 0.1 s (on my computer).
The version below takes only 6 seconds for the 5,761,455 primes up to 100 million.
%let max=100000000;
data want;
array x{&max} _temporary_;
do i=1 to &max;
x{i}=i;
end;
do i=2 to int(sqrt(&max));
if not missing(x{i}) then do;
do j=i*i to dim(x) by x{i};
x{j}=.;
end;
end;
end;
do k=2 to &max;
if not missing(x{k}) then do; prime_number=x{k}; output; end;
end;
keep prime_number;
run;
I was tempted to replace the first DO loop by specifying initial values for the array, (1:&max), but that turned out to be significantly slower.
Edit: By skipping the even numbers >2 the run time can be further reduced to <5.0 seconds.
data want;
array x{&max} _temporary_;
do i=2, 3 to &max by 2;
x{i}=i;
end;
do i=3 to int(sqrt(&max)) by 2;
if not missing(x{i}) then do;
do j=i*i to dim(x) by x{i};
x{j}=.;
end;
end;
end;
do k=2, 3 to &max by 2;
if not missing(x{k}) then do; prime_number=x{k}; output; end;
end;
keep prime_number;
run;
... View more