I wonder what is the general rule for extracting eligible subjects in a counting process format in SAS.
For example, take a look at the table below. Let's assume that I set a condition that a subject (ID) is not eligible if psyc_c is both 0 during 2002 and 2003. Then I wouldn't want to include the subject in the dataset. I think that I deal with a problem like this very often in programming in a counting process format. Another example can be that I want to eliminate subjects if there is a certain number in a variable in a certain year. For example, I want to eliminate a subject if their psyc_c is 3 in 2004, then ID 3 will be excluded from the data. Please help me to figure out.. Thanks..
Year | Id | psyc_c | dprs_c |
2002 | 1 | 0 | 0 |
2003 | 1 | 0 | 1 |
2004 | 1 | 0 | 0 |
2005 | 1 | 0 | 0 |
2006 | 1 | 0 | 0 |
2002 | 2 | 0 | 0 |
2003 | 2 | 0 | 0 |
2004 | 2 | 0 | 0 |
2005 | 2 | 0 | 0 |
2006 | 2 | 0 | 0 |
2002 | 3 | 0 | 2 |
2003 | 3 | 0 | 0 |
2004 | 3 | 3 | 0 |
2005 | 3 | 0 | 0 |
2006 | 3 | 0 | 0 |
2002 | 4 | 1 | 0 |
2003 | 4 | 0 | 0 |
2004 | 4 | 0 | 0 |
2005 | 4 | 0 | 3 |
2006 | 4 | 0 | 3 |
The best strategy depends a lot on the type of rules that you want to implement. For the example given, a flexible approach would be to transpose your data and remerge the selected ids:
data have;
input Year Id psyc_c dprs_c;
datalines;
2002 1 0 0
2003 1 0 1
2004 1 0 0
2005 1 0 0
2006 1 0 0
2002 2 0 0
2003 2 0 0
2004 2 0 0
2005 2 0 0
2006 2 0 0
2002 3 0 2
2003 3 0 0
2004 3 3 0
2005 3 0 0
2006 3 0 0
2002 4 1 0
2003 4 0 0
2004 4 0 0
2005 4 0 3
2006 4 0 3
;
proc transpose data=have out=full(drop=_name_) prefix=psyc_c_;
by id;
id year;
var psyc_c;
run;
proc print data=full; run;
data selected(keep=id);
set full;
if psyc_c_2002 = 0 and psyc_c_2003 = 0 then delete;
if psyc_c_2004 = 3 then delete;
/* .... */
run;
data want;
merge have selected(in=ok);
by id;
if ok;
run;
proc print data=want; run;
@asinusdk wrote:
I wonder what is the general rule for extracting eligible subjects in a counting process format in SAS.
For example, take a look at the table below. Let's assume that I set a condition that a subject (ID) is not eligible if psyc_c is both 0 during 2002 and 2003. Then I wouldn't want to include the subject in the dataset. I think that I deal with a problem like this very often in programming in a counting process format. Another example can be that I want to eliminate subjects if there is a certain number in a variable in a certain year. For example, I want to eliminate a subject if their psyc_c is 3 in 2004, then ID 3 will be excluded from the data. Please help me to figure out.. Thanks..
Year Id psyc_c dprs_c 2002 1 0 0 2003 1 0 1 2004 1 0 0 2005 1 0 0 2006 1 0 0 2002 2 0 0 2003 2 0 0 2004 2 0 0 2005 2 0 0 2006 2 0 0 2002 3 0 2 2003 3 0 0 2004 3 3 0 2005 3 0 0 2006 3 0 0 2002 4 1 0 2003 4 0 0 2004 4 0 0 2005 4 0 3 2006 4 0 3
The logical approach is pretty straight forward-->
You just need to structure your piecemeal like
1. check for condition
2. flag if true/false accordingly
3. update/delete/or just retain the flagged group/records
4. Outout to one or many datasets based on the flagged condition if any
5. Translate the above 4 into SAS language
And you are done
The best strategy depends a lot on the type of rules that you want to implement. For the example given, a flexible approach would be to transpose your data and remerge the selected ids:
data have;
input Year Id psyc_c dprs_c;
datalines;
2002 1 0 0
2003 1 0 1
2004 1 0 0
2005 1 0 0
2006 1 0 0
2002 2 0 0
2003 2 0 0
2004 2 0 0
2005 2 0 0
2006 2 0 0
2002 3 0 2
2003 3 0 0
2004 3 3 0
2005 3 0 0
2006 3 0 0
2002 4 1 0
2003 4 0 0
2004 4 0 0
2005 4 0 3
2006 4 0 3
;
proc transpose data=have out=full(drop=_name_) prefix=psyc_c_;
by id;
id year;
var psyc_c;
run;
proc print data=full; run;
data selected(keep=id);
set full;
if psyc_c_2002 = 0 and psyc_c_2003 = 0 then delete;
if psyc_c_2004 = 3 then delete;
/* .... */
run;
data want;
merge have selected(in=ok);
by id;
if ok;
run;
proc print data=want; run;
I thought about transposing it, but I couldn't handle a problem some subjects who are missing for some years....
Some subjects missing for some years wouldn't be a problem, unless they are missing for all years. Experiment!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.