01-08-2016 04:21 PM
proc sort data=mydata; by age; run; data count; set mydata; by age; zero=0; hundred=0; if age <0 then zero+1; else if age >100 then hundred+1; run;
with above code, I can get zero and hundred as two new variables added in the data set. I actually only want to have a simple number of how many cases with age <0 and how many with age>100. How to do that?
01-09-2016 10:07 AM
Great that you found a good solution using PROC SQL.
Please note that your data step approach could be simplified considerably:
With these suggestions your initial code could have looked like this:
data count; set mydata; zero+(age<0); hundred+(age>100); run;
You should be aware that not only negative values, but also missing values of AGE would satisfy the condition AGE<0. If you want to restrict your count to negative values, the third line should read zero+(.<age<0) or, to exclude special missing values as well, zero+(.z<age<0).
Of course, you don't need to create a new (possibly large) dataset COUNT to count those observations. You could avoid this with a data step as the following:
data _null_; set mydata end=last; zero+(age<0); hundred+(age>100); if last then put zero= hundred=; run;
By default, the PUT statement would write the desired numbers to the log (not to the output window like PROC SQL).
Need further help from the community? Please ask a new question.