I wonder what is the general rule for extracting eligible subjects in a counting process format in SAS.
For example, take a look at the table below. Let's assume that I set a condition that a subject (ID) is not eligible if psyc_c is both 0 during 2002 and 2003. Then I wouldn't want to include the subject in the dataset. I think that I deal with a problem like this very often in programming in a counting process format. Another example can be that I want to eliminate subjects if there is a certain number in a variable in a certain year. For example, I want to eliminate a subject if their psyc_c is 3 in 2004, then ID 3 will be excluded from the data. Please help me to figure out.. Thanks..
Year | Id | psyc_c | dprs_c |
2002 | 1 | 0 | 0 |
2003 | 1 | 0 | 1 |
2004 | 1 | 0 | 0 |
2005 | 1 | 0 | 0 |
2006 | 1 | 0 | 0 |
2002 | 2 | 0 | 0 |
2003 | 2 | 0 | 0 |
2004 | 2 | 0 | 0 |
2005 | 2 | 0 | 0 |
2006 | 2 | 0 | 0 |
2002 | 3 | 0 | 2 |
2003 | 3 | 0 | 0 |
2004 | 3 | 3 | 0 |
2005 | 3 | 0 | 0 |
2006 | 3 | 0 | 0 |
2002 | 4 | 1 | 0 |
2003 | 4 | 0 | 0 |
2004 | 4 | 0 | 0 |
2005 | 4 | 0 | 3 |
2006 | 4 | 0 | 3 |
The best strategy depends a lot on the type of rules that you want to implement. For the example given, a flexible approach would be to transpose your data and remerge the selected ids:
data have;
input Year Id psyc_c dprs_c;
datalines;
2002 1 0 0
2003 1 0 1
2004 1 0 0
2005 1 0 0
2006 1 0 0
2002 2 0 0
2003 2 0 0
2004 2 0 0
2005 2 0 0
2006 2 0 0
2002 3 0 2
2003 3 0 0
2004 3 3 0
2005 3 0 0
2006 3 0 0
2002 4 1 0
2003 4 0 0
2004 4 0 0
2005 4 0 3
2006 4 0 3
;
proc transpose data=have out=full(drop=_name_) prefix=psyc_c_;
by id;
id year;
var psyc_c;
run;
proc print data=full; run;
data selected(keep=id);
set full;
if psyc_c_2002 = 0 and psyc_c_2003 = 0 then delete;
if psyc_c_2004 = 3 then delete;
/* .... */
run;
data want;
merge have selected(in=ok);
by id;
if ok;
run;
proc print data=want; run;
@asinusdk wrote:
I wonder what is the general rule for extracting eligible subjects in a counting process format in SAS.
For example, take a look at the table below. Let's assume that I set a condition that a subject (ID) is not eligible if psyc_c is both 0 during 2002 and 2003. Then I wouldn't want to include the subject in the dataset. I think that I deal with a problem like this very often in programming in a counting process format. Another example can be that I want to eliminate subjects if there is a certain number in a variable in a certain year. For example, I want to eliminate a subject if their psyc_c is 3 in 2004, then ID 3 will be excluded from the data. Please help me to figure out.. Thanks..
Year Id psyc_c dprs_c 2002 1 0 0 2003 1 0 1 2004 1 0 0 2005 1 0 0 2006 1 0 0 2002 2 0 0 2003 2 0 0 2004 2 0 0 2005 2 0 0 2006 2 0 0 2002 3 0 2 2003 3 0 0 2004 3 3 0 2005 3 0 0 2006 3 0 0 2002 4 1 0 2003 4 0 0 2004 4 0 0 2005 4 0 3 2006 4 0 3
The logical approach is pretty straight forward-->
You just need to structure your piecemeal like
1. check for condition
2. flag if true/false accordingly
3. update/delete/or just retain the flagged group/records
4. Outout to one or many datasets based on the flagged condition if any
5. Translate the above 4 into SAS language
And you are done
The best strategy depends a lot on the type of rules that you want to implement. For the example given, a flexible approach would be to transpose your data and remerge the selected ids:
data have;
input Year Id psyc_c dprs_c;
datalines;
2002 1 0 0
2003 1 0 1
2004 1 0 0
2005 1 0 0
2006 1 0 0
2002 2 0 0
2003 2 0 0
2004 2 0 0
2005 2 0 0
2006 2 0 0
2002 3 0 2
2003 3 0 0
2004 3 3 0
2005 3 0 0
2006 3 0 0
2002 4 1 0
2003 4 0 0
2004 4 0 0
2005 4 0 3
2006 4 0 3
;
proc transpose data=have out=full(drop=_name_) prefix=psyc_c_;
by id;
id year;
var psyc_c;
run;
proc print data=full; run;
data selected(keep=id);
set full;
if psyc_c_2002 = 0 and psyc_c_2003 = 0 then delete;
if psyc_c_2004 = 3 then delete;
/* .... */
run;
data want;
merge have selected(in=ok);
by id;
if ok;
run;
proc print data=want; run;
I thought about transposing it, but I couldn't handle a problem some subjects who are missing for some years....
Some subjects missing for some years wouldn't be a problem, unless they are missing for all years. Experiment!
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.