Hello.
I am currently trying to exclude all observations within an ID if the first observation of a variable is larger than 0.
So for example,
PID Gender Income Year Exercise
1 1 300 2000 0
1 1 400 2002 1
1 1 50001 2004 2
2 2 90000 2000 4
2 2 5999 2002 10
2 2 3443 2004 0
3 2 1333 2000 2
3 2 4000 2002 3
3 2 5000 2004 10
For PID 2 and 3, since exercise at year 2000 (the first year), I would like to delete everything under PID 2 and 3 (including the IDs) so that they won't show as having missing values.
Thank you.
I assume that your data is sorted by PID and Year
data have;
input PID Gender Income Year Exercise;
datalines;
1 1 300 2000 0
1 1 400 2002 1
1 1 50001 2004 2
2 2 90000 2000 4
2 2 5999 2002 10
2 2 3443 2004 0
3 2 1333 2000 2
3 2 4000 2002 3
3 2 5000 2004 10
;
data want(drop=dummy);
dummy=0;
do until (last.PID);
set have;
by PID;
if first.PID & Exercise gt 0 then dummy=1;
end;
do until (last.PID);
set have;
by PID;
if dummy=0 then output;
end;
run;
Since you condition is "first observation" for each ID, it's a single-pass job:
data have ;
input PID Gender Income Year Exercise ;
cards ;
1 1 300 2000 0
1 1 400 2002 1
1 1 50001 2004 2
2 2 90000 2000 4
2 2 5999 2002 10
2 2 3443 2004 0
3 2 1333 2000 2
3 2 4000 2002 3
3 2 5000 2004 10
run ;
data want ;
set have ;
by pid ;
if first.pid then _iorc_ = ifn (exercise, 0, 1) ;
if _iorc_ ;
run ;
Kind regards
Paul D.
Please note that this condition
if first.pid then _iorc_ = ifn (exercise, 0, 1) ;
will not work as specified in the initial post if the variable exercise takes negative values.
To work around that the condition should be
if first.pid then _iorc_ = ifn (exercise gt 0, 0, 1) ;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.