DATA Step, Macro, Functions and more

Using Subsets in Proc Print

Reply
Regular Learner
Posts: 1

Using Subsets in Proc Print

Hello, 

I am new to SAS programming and am wondering if there is a better way to subset my output in the print procedure. I am trying to print the oldest and youngest observations in my data set based on the variable "DOB". Below I have tried using proc sort and proc print seperately to create my output. My question is this: Is there a more efficient way to select specific observations in the print procedure, or print a subset data based on a condition? 

 

proc sort data = sds.school out = work.sorted_youngest ;
by descending DOB ;
run;


proc print data = work.sorted_youngest (obs=1) noobs ;
var ssn gender type ethnic;
title 'Youngest Student in Data Set';
run ;

******************************************************

I do the exact same steps below, but without the 'descending' option.

******************************************************; 
proc sort data = Fitch.ProjectData out = work.sorted_oldest ;
by TOT_DOB ;
run;


proc print data = work.sorted_oldest (obs=1) noobs;
var SSN SEX TYPE ETHNIC ;
title 'Oldest Student in Data Set' ;
run; 

 

Thank you in advance for the help! 

Super User
Posts: 5,504

Re: Using Subsets in Proc Print

Posted in reply to Proc_Help

With small data sets, it doesn't matter.  Learn as you go, and anything that gets the right answer is OK.  However ...

 

In this case, you're not getting the right answer.  When you sort your data (using DESCENDING), the first observation is actually the oldest, not the youngest.  That's easy to fix ... just a matter of re-labeling the results.

 

With larger data sets, sorting can be expensive.  This would be a way to get both the youngest and the oldest in one step, sorting only once.

 

proc sort data = sds.school;
by DOB ;
run;

 

data youngest oldest;

set sds.school end=done;

if _n_=1 then output youngest;

if done then output oldest;

run;

 

You could then print each data set separately.

 

Now if you are going to step it up a level beyond that, the DATA step could become:

 

data youngest oldest;

set sds.school nobs=_nobs_;

output youngest;

set sds.school point=_nobs_;

output oldest;

stop;

run;

 

That avoids reading all the observations in the middle of the data set, but it's getting into heavier programming techniques.  It's better to start with the earlier steps, and fully understand how they work.  The techniques there will be more widely applicable, while the last program would be needed infrequently.

Super User
Posts: 19,802

Re: Using Subsets in Proc Print

Posted in reply to Proc_Help

And SQL without a CREATE TABLE somewhat functions as a PRINT procedure as well.

Note that in the first case there are ties - there are two people with the smallest value. The sort methodology would not account for ties. 

 

proc sql;
*Youngest age - note that there are ties here! How would you want to handle that;
title 'Youngest person - note ties';
select *
from sashelp.class
having age=min(age);
quit;

proc sql;
title 'Oldest person - no ties';
select * from sashelp.class
having age=max(age);
quit;
Ask a Question
Discussion stats
  • 2 replies
  • 164 views
  • 0 likes
  • 3 in conversation