Hello,
How can I create a new column for preafib and prepfo BUT ignore the missing row when assigning a value to the new variable.
without assigning a value to the new variable when both First_Ischemic or First_Hemorrhagic is blank. For example, ID = 032462 my code was able to calculate when afib is less then First_Ischemic or First_Hemorrhagic but it is also assigning a value for when both columns are missing. I do not want it.
Also, ID=023434 is assigning a value of YES to preafib because First_Ischemic = . so it recognizes First_Ischemic to be less than First_AFIB=21JUN2003. How can I rewrite my code to ignore the missing and only calculate for the non-missing rows. Thanks
data a;
input ID $6. First_Ischemic First_Hemorrhagic first_AFIB first_PFO ;
format First_Ischemic First_Hemorrhagic first_AFIB first_PFO date9.;
informat First_Ischemic First_Hemorrhagic first_AFIB first_PFO date9.;
datalines;
011386 23SEP2004 10FEB2020 . .
034627 01DEC2009 30NOV2009 . 10FEB2020
011427 10JUL2003 . . .
012666 . 18SEP2006 20JUN2002 .
023434 . 18OCT2002 21JUN2003 .
020485 15JUL2009 . . .
032462 . . . 20JUN2002
011386 23SEP2004 10FEB2020 . .
;
run;
proc sort data=a; by id; run;
data Prepost;
set a;
by id;
if First_AFIB ne . then do;
if First_AFIB<First_Ischemic or First_Hemorrhagic then Preafib='Yes';
else if First_AFIB> First_Ischemic or First_Hemorrhagic then Preafib='No';
end;
if first_PFO ne . then do;
if first_PFO<First_Ischemic or First_Hemorrhagic then Prepfo='Yes';
if first_PFO> First_Ischemic or First_Hemorrhagic then Prepfo='No';
end;
run;
Output of interest
Obs | ID | First_Ischemic | First_Hemorrhagic | first_AFIB | first_PFO | Preafib | Prepfo |
1 | 011386 | 23SEP2004 | 10FEB2020 | . | . | ||
2 | 011386 | 23SEP2004 | 10FEB2020 | . | . | ||
3 | 011427 | 10JUL2003 | . | . | . | ||
4 | 012666 | . | 18SEP2006 | 20JUN2002 | . | Yes | |
5 | 020485 | 15JUL2009 | . | . | . | ||
6 | 023434 | . | 18OCT2002 | 21JUN2003 | . | No | |
7 | 032462 | . | . | . | 20JUN2002 | ||
8 | 034627 | 01DEC2009 | 30NOV2009 | . | 10FEB2020 | No |
if First_AFIB ne . then do;
if First_AFIB<First_Ischemic or First_AFIB<First_Hemorrhagic then Preafib='Yes';
else if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(First_AFIB> First_Ischemic or First_AFIB>First_Hemorrhagic) then Preafib='No';
end;
if first_PFO ne . then do;
if first_PFO<First_Ischemic or first_PFO<First_Hemorrhagic then Prepfo='Yes';
else if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(first_PFO> First_Ischemic or first_PFO<First_Hemorrhagic) then Prepfo='No';
end;
You will find it extremely helpful if you work out the logic in your head and/or on a piece of paper, so all possible situations are accounted for, and then program from that.
You have to test to see if first_ischemic and first_hemmorrhagic are both missing. You haven't done that, so in your code this isn't accounted for. I think this is what you want
if first_PFO ne . then do;
if first_PFO<First_Ischemic or first_PFO<First_Hemorrhagic then Prepfo='Yes';
if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(first_PFO> First_Ischemic or first_PFO>First_Hemorrhagic) then Prepfo='No';
end;
Also your constructed IF statement
if first_PFO<First_Ischemic or First_Hemorrhagic
seems wrong to me and I have adjusted this in my code.
What do you think is wrong here. I expected ID=012666 to be "Yes" if First_AFIB<First_Ischemic or First_AFIB<First_Hemorrhagic but am getting a "NO". I have modified the code few times but not the right output.
data Prepost1;
set a;
by id;
if First_AFIB ne . then do;
if First_AFIB<First_Ischemic or First_AFIB<First_Hemorrhagic then Preafib='Yes';
if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(First_AFIB> First_Ischemic or First_AFIB>First_Hemorrhagic) then Preafib='No';
end;
if first_PFO ne . then do;
if first_PFO<First_Ischemic or first_PFO<First_Hemorrhagic then Prepfo='Yes';
if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(first_PFO> First_Ischemic or first_PFO<First_Hemorrhagic) then Prepfo='No';
end;
run;
if First_AFIB ne . then do;
if First_AFIB<First_Ischemic or First_AFIB<First_Hemorrhagic then Preafib='Yes';
else if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(First_AFIB> First_Ischemic or First_AFIB>First_Hemorrhagic) then Preafib='No';
end;
if first_PFO ne . then do;
if first_PFO<First_Ischemic or first_PFO<First_Hemorrhagic then Prepfo='Yes';
else if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(first_PFO> First_Ischemic or first_PFO<First_Hemorrhagic) then Prepfo='No';
end;
You will find it extremely helpful if you work out the logic in your head and/or on a piece of paper, so all possible situations are accounted for, and then program from that.
SAS very nicely provides the ability to have multiple comparisons such as a < b < c as a simple construct.
So if you want to exclude the missing values for First_afib from returning true with code like First_AFIB<First_Ischemic then provide a lower bound for First_afib:
if 0< First_AFIB<First_Ischemic then ...
Since your values are mostly dates you could use a date literal that you expect never to appear in the data a valid such as
if '01JAN1800'd < First_AFIB<First_Ischemic then ...
What test do you think this code is performing?
First_AFIB<First_Ischemic or First_Hemorrhagic
SAS will treat that as this test:
(First_AFIB<First_Ischemic) or (First_Hemorrhagic)
So if FIRST_HEMORRHAGIC is not missing and not zero (zero is the date '01JAN1960'd) then that test will be TRUE.
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.