Did I understand that you were try to set the bottom/top (low and high) 84 columns independently?
If so, the SQL magic might require that you write 164 SQL sub-query, and thus 184 table scans.
Astounding's code, using the data step reads the table just twice.
But... we could super-charge the SQL code.
I might suggest adding some SAS magic dust, which could reduce the164 table scans into one scan for summary totals.
data classv / view=classv;
set sashelp.class;
array n(*) age--weight;
do i=1 to dim(n); drop i ;
if n{i}<=50 then n{i}=.;
else if n{i}>=100 then n{i}=.;
end;
run;
The SQL magic would look like this...
proc sql;
create view classv as
select case when (var1 not between 50 and 500) then . else var1 end as var1
...
case when (var84 not between 50 and 500) then . else var84 end as var84
from class;
quit;
(we won't split hair yet, but, recoding out of range value to missing... 7 lines of data step view code vs 88 lines of SQL view code)
What's my point...the magic here is to first recode the value outside of the 50-500 range to missing, without reading/writing the table, by using a view, SQL or DATA, your preference. With values recoded to missing, and unlike the subqueries which required independent WHERE expression, the SQL MIN/MAX summaries of this view would be the first values found. There is no need for independent WHERE expressions, nothing is outside the range for any of the columns except missing.
So, by re-coding extreme values to missing, we can summarize in one pass and put the results in a table (or inline view)...
proc sql;
create table lowhigh as
select min(age) as age_min, max(age) as max(age),
...
min(var84) as var84_max, max(var84) as var84_max
from classv;
quit;
The point again is to avoid 184 subqueries/table scans. However, the summary query would still need to have 168 summary expressions written for the 84 columns... so maybe another 88 lines of SQL code.
Now, Neeza's suggestion of PROC MEANS, and later recanted, when it was realized that one might be looking at writing 84 PROC MEANS step each with 84 independent WHERE expression with 84 table scans, just like the 84 subqueries. However, the magic of the recoded to missing view is one can easily consumed the view with a single pass of PROC MEANS much like SQL.
data classv / view=classv;
set sashelp.class;
array n(*) age--weight;
do i=1 to dim(n); drop i ;
if n{i}<=50 then n{i}=.;
else if n{i}>=100 then n{i}=.;
end;
run;
proc means data=classv noprint ;
var age--weight;
output out=lowhigh min()= max()= / autoname ;
run;
If you can live with some more SAS magic, the autoname option avoid writing those pesky 168 SQL summary expressions.
But, wait, we still have to do the bottom/top coding. We would next do a blind many-to-one join of the lowhigh bottom/top coding values back into the detail table...
proc sql;
create view classv as
select case when (var1<=50) then var1_min
when (var1>=500) then var1_max var1
else var1 end as var1,
....
case when (var84<=50) then var84_min
when (var84>=500) then var1_max var84
else var84 end as var84
from class, lowhigh;
quit;
We can call that, what, 84 case expressions or 250+ lines of code.
So in SQL, the view to recode to missing view, 84 lines, find the lowhigh summary table, 168 summary expressions, and finally bottom/top-coding the 84 columns, 84 case expressions... but just two table scans. (Albeit, you could write a macro program.)
The data step code for bottom/top coding...
data newclass;
set sashelp.class;
if _n_=1 then set lowhigh;
array actual{3} age--weight;
array stat{2,3} age_min--weight_max;
if stat{1,_i}=. then actual{_i}=75;
else if actual{_i}<=50 then actual{_i}=stat{1,_i};
else if actual{_i}>=100 then actual{_i}=stat{2,_i};
drop _: age_min--weight_max;
run;
Data step view, 8 lines of code; the PROC means summarization, 4 lines of code, the bottom/top coding, 10 lines of code (all of which could be tightened)
Astounding single data step code, 20 lines.
Where's the beef/magic?
... View more