Hello
I want to create an exclusion criteria/variable for a large dataset that includes the variables PT_Number, date, status, and rank.
I want to create an exclusion criteria so that I can later use a "Where" statement. I basically want this exclusion "
If the dates where rank does not equal 1 and status does not equal P1 are less than the dates with the rank that equal 1 and status that equal P1 then the Exclusion_Variable will equal 1.
*Note: there are other entries for rank such as (2,3,4) and for status such as (C1 CI, EA, etc). below is what I thought my code should look like and a table that I have and that I want. I also did test my code below and did a freq check and did not get any '1' in my exclusion variable frequency.
Am I missing something?
data Want;
set Have;
if date and rank ne 1 and status in ('C1,CI,EA,EB,F1,F2,F3,M1,M2,M4,M5,M7,M8,M9,MB,MH,MP,MS,N9,P1,P3,S4,VL,c1') < date and rank='1' and status='P1' then exclusion_variable='1';
run;
Have
PT_Number | date | status | rank |
A1 | 22733 | M1 | 1 |
A1 | 22733 | P1 | 2 |
A1 | 22733 | P1 | 1 |
A1 | 22733 | P1 | 3 |
A1 | 22740 | P1 | 2 |
A1 | 22740 | P1 | 1 |
A1 | 22740 | P1 | 1 |
A1 | 22740 | P1 | 3 |
Want
PT_Number | date | status | rank | Exclusion_Variable |
A1 | 22733 | M1 | 1 | 1 |
A1 | 22733 | P1 | 2 | 1 |
A1 | 22733 | P1 | 1 | |
A1 | 22733 | P1 | 3 | 1 |
A1 | 22740 | P1 | 2 | |
A1 | 22740 | P1 | 1 | |
A1 | 22740 | P1 | 1 | |
A1 | 22740 | P1 | 3 |
Your IF conditions has many issues.
I'd suggest starting out with a simple single condition if test. Once that works as expected add other conditions step by step.
Please write your exclusion logic in plain English as this makes no sense at all:
If date and rank not = 1 and status not = P1 < date and rank=1 and status=P1 then Exclusion_Variable='1'.
"If date" , for example, is like saying "if date is not missing".
Where does PT_Number come into it? Are you applying this logic by PT_Number?
I am having so much trouble with this if then statement with multiple conditions.
I'm hoping someone can help me as I try to explain this in plain english and turn this into SAS code.
If the dates where Status is P1 and Rank is 1 is greater than the dates where any other combination of status and rank (not P1 and 1 respectively) Then that will equal 1 in the exclusion variable.
Have
PT_Number | date | status | rank |
A1 | 22733 | M1 | 1 |
A1 | 22733 | P1 | 2 |
A1 | 22733 | P1 | 1 |
A1 | 22733 | P1 | 3 |
A1 | 22740 | P1 | 2 |
A1 | 22740 | P1 | 1 |
A1 | 22740 | P1 | 1 |
A1 | 22740 | P1 | 3 |
Want
PT_Number | date | status | rank | Exclusion_Variable |
A1 | 22733 | M1 | 1 | 1 |
A1 | 22733 | P1 | 2 | 1 |
A1 | 22733 | P1 | 1 | |
A1 | 22733 | P1 | 3 | 1 |
A1 | 22740 | P1 | 2 | |
A1 | 22740 | P1 | 1 | |
A1 | 22740 | P1 | 1 | |
A1 | 22740 | P1 | 3 |
Please do not start a new thread for the same question, keep everything in one place. I merged this backin here.
Double DO loop:
data want;
_date = 99999;
do until (last.pt_number);
set have;
by pt_number;
if status ne "P1" or rank ne 1 then _date = min(_date,date);
end;
do until (last.pt_number);
set have;
by pt_number;
if status = "P1" and rank = 1 and date gt _date then exclusion_variable = 1;
output;
end;
drop _date;
run;
This applies your rule, but it won't create your expected output, as 22733 is not greater than any other date.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.