BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
asinusdk
Calcite | Level 5

I wonder what is the general rule for extracting eligible subjects in a counting process format in SAS.

For example, take a look at the table below. Let's assume that I set a condition that a subject (ID) is not eligible if psyc_c is both 0 during 2002 and 2003. Then I wouldn't want to include the subject in the dataset. I think that I deal with a problem like this very often in programming in a counting process format. Another example can be that I want to eliminate subjects if there is a certain number in a variable in a certain year. For example, I want to eliminate a subject if their psyc_c is 3 in 2004, then ID 3 will be excluded from the data. Please help me to figure out.. Thanks.. 

 

YearIdpsyc_cdprs_c
2002100
2003101
2004100
2005100
2006100
2002200
2003200
2004200
2005200
2006200
2002302
2003300
2004330
2005300
2006300
2002410
2003400
2004400
2005403
2006403
1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

The best strategy depends a lot on the type of rules that you want to implement. For the example given, a flexible approach would be to transpose your data and remerge the selected ids:

 

data have;
input Year Id psyc_c dprs_c;
datalines;
2002 1 0 0 
2003 1 0 1 
2004 1 0 0 
2005 1 0 0 
2006 1 0 0 
2002 2 0 0 
2003 2 0 0 
2004 2 0 0 
2005 2 0 0 
2006 2 0 0 
2002 3 0 2 
2003 3 0 0 
2004 3 3 0 
2005 3 0 0 
2006 3 0 0 
2002 4 1 0 
2003 4 0 0 
2004 4 0 0 
2005 4 0 3 
2006 4 0 3 
;

proc transpose data=have out=full(drop=_name_) prefix=psyc_c_;
by id;
id year;
var psyc_c;
run;

proc print data=full; run;

data selected(keep=id);
set full;
if psyc_c_2002 = 0 and psyc_c_2003 = 0 then delete;
if psyc_c_2004 = 3 then delete;
/* .... */
run;

data want;
merge have selected(in=ok);
by id;
if ok;
run;

proc print data=want; run;
PG

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

@asinusdk wrote:

I wonder what is the general rule for extracting eligible subjects in a counting process format in SAS.

For example, take a look at the table below. Let's assume that I set a condition that a subject (ID) is not eligible if psyc_c is both 0 during 2002 and 2003. Then I wouldn't want to include the subject in the dataset. I think that I deal with a problem like this very often in programming in a counting process format. Another example can be that I want to eliminate subjects if there is a certain number in a variable in a certain year. For example, I want to eliminate a subject if their psyc_c is 3 in 2004, then ID 3 will be excluded from the data. Please help me to figure out.. Thanks.. 

 

Year Id psyc_c dprs_c
2002 1 0 0
2003 1 0 1
2004 1 0 0
2005 1 0 0
2006 1 0 0
2002 2 0 0
2003 2 0 0
2004 2 0 0
2005 2 0 0
2006 2 0 0
2002 3 0 2
2003 3 0 0
2004 3 3 0
2005 3 0 0
2006 3 0 0
2002 4 1 0
2003 4 0 0
2004 4 0 0
2005 4 0 3
2006 4 0 3

The logical approach is pretty straight forward-->

You just need to structure your piecemeal like

1. check for condition 

2. flag if true/false accordingly

3. update/delete/or just retain the flagged group/records

4. Outout to one or many datasets based on the flagged condition if any

5. Translate the above 4 into SAS language

 

And you are done

PGStats
Opal | Level 21

The best strategy depends a lot on the type of rules that you want to implement. For the example given, a flexible approach would be to transpose your data and remerge the selected ids:

 

data have;
input Year Id psyc_c dprs_c;
datalines;
2002 1 0 0 
2003 1 0 1 
2004 1 0 0 
2005 1 0 0 
2006 1 0 0 
2002 2 0 0 
2003 2 0 0 
2004 2 0 0 
2005 2 0 0 
2006 2 0 0 
2002 3 0 2 
2003 3 0 0 
2004 3 3 0 
2005 3 0 0 
2006 3 0 0 
2002 4 1 0 
2003 4 0 0 
2004 4 0 0 
2005 4 0 3 
2006 4 0 3 
;

proc transpose data=have out=full(drop=_name_) prefix=psyc_c_;
by id;
id year;
var psyc_c;
run;

proc print data=full; run;

data selected(keep=id);
set full;
if psyc_c_2002 = 0 and psyc_c_2003 = 0 then delete;
if psyc_c_2004 = 3 then delete;
/* .... */
run;

data want;
merge have selected(in=ok);
by id;
if ok;
run;

proc print data=want; run;
PG
asinusdk
Calcite | Level 5

I thought about transposing it, but I couldn't handle a problem some subjects who are missing for some years.... 

PGStats
Opal | Level 21

Some subjects missing for some years wouldn't be a problem, unless they are missing for all years. Experiment!

PG
Peter_C
Rhodochrosite | Level 12
When data are already ordered, rather than re-merge or transpose I prefer a parallel pass in (this case) ID by- group. The first pass collects the criteria that are applied on the second pass.

Data selected ;
Set have(in= a) have(in=b) ;
By ID ;
If first.id then do ;
Retain flag1-flag3 counter1-counter3 ;
Call missing( of flag1-flag3, of counter1-counter3).;
IF A then do ;
" update counters if appropriate ;
* check if criteria apply and set flags ;
End ;

If ( criteria are right ) then output ;
Run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1547 views
  • 0 likes
  • 4 in conversation