Hello,
I am new to SAS programming and am wondering if there is a better way to subset my output in the print procedure. I am trying to print the oldest and youngest observations in my data set based on the variable "DOB". Below I have tried using proc sort and proc print seperately to create my output. My question is this: Is there a more efficient way to select specific observations in the print procedure, or print a subset data based on a condition?
proc sort data = sds.school out = work.sorted_youngest ;
by descending DOB ;
run;
proc print data = work.sorted_youngest (obs=1) noobs ;
var ssn gender type ethnic;
title 'Youngest Student in Data Set';
run ;
******************************************************
I do the exact same steps below, but without the 'descending' option.
******************************************************; 
proc sort data = Fitch.ProjectData out = work.sorted_oldest ;
by TOT_DOB ;
run;
proc print data = work.sorted_oldest (obs=1) noobs;
var SSN SEX TYPE ETHNIC ;
title 'Oldest Student in Data Set' ;
run; 
Thank you in advance for the help!
With small data sets, it doesn't matter. Learn as you go, and anything that gets the right answer is OK. However ...
In this case, you're not getting the right answer. When you sort your data (using DESCENDING), the first observation is actually the oldest, not the youngest. That's easy to fix ... just a matter of re-labeling the results.
With larger data sets, sorting can be expensive. This would be a way to get both the youngest and the oldest in one step, sorting only once.
proc sort data = sds.school;
by DOB ;
run;
data youngest oldest;
set sds.school end=done;
if _n_=1 then output youngest;
if done then output oldest;
run;
You could then print each data set separately.
Now if you are going to step it up a level beyond that, the DATA step could become:
data youngest oldest;
set sds.school nobs=_nobs_;
output youngest;
set sds.school point=_nobs_;
output oldest;
stop;
run;
That avoids reading all the observations in the middle of the data set, but it's getting into heavier programming techniques. It's better to start with the earlier steps, and fully understand how they work. The techniques there will be more widely applicable, while the last program would be needed infrequently.
And SQL without a CREATE TABLE somewhat functions as a PRINT procedure as well.
Note that in the first case there are ties - there are two people with the smallest value. The sort methodology would not account for ties.
proc sql;
*Youngest age - note that there are ties here! How would you want to handle that;
title 'Youngest person - note ties';
select *
from sashelp.class
having age=min(age);
quit;
proc sql;
title 'Oldest person - no ties';
select * from sashelp.class
having age=max(age);
quit;It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
