Hello,
How can I create a new column for preafib and prepfo BUT ignore the missing row when assigning a value to the new variable.
without assigning a value to the new variable when both First_Ischemic or First_Hemorrhagic is blank. For example, ID = 032462 my code was able to calculate when afib is less then First_Ischemic or First_Hemorrhagic but it is also assigning a value for when both columns are missing. I do not want it.
Also, ID=023434 is assigning a value of YES to preafib because First_Ischemic = . so it recognizes First_Ischemic to be less than First_AFIB=21JUN2003. How can I rewrite my code to ignore the missing and only calculate for the non-missing rows. Thanks
data a;
input ID $6. First_Ischemic First_Hemorrhagic first_AFIB first_PFO ;
format First_Ischemic First_Hemorrhagic first_AFIB first_PFO date9.;
informat First_Ischemic First_Hemorrhagic first_AFIB first_PFO date9.;
datalines;
011386 23SEP2004 10FEB2020 . .
034627 01DEC2009 30NOV2009 . 10FEB2020
011427 10JUL2003 . . .
012666 . 18SEP2006 20JUN2002 .
023434 . 18OCT2002 21JUN2003 .
020485 15JUL2009 . . .
032462 . . . 20JUN2002
011386 23SEP2004 10FEB2020 . .
;
run;
proc sort data=a; by id; run;
data Prepost;
set a;
by id;
if First_AFIB ne . then do;
if First_AFIB<First_Ischemic or First_Hemorrhagic then Preafib='Yes';
else if First_AFIB> First_Ischemic or First_Hemorrhagic then Preafib='No';
end;
if first_PFO ne . then do;
if first_PFO<First_Ischemic or First_Hemorrhagic then Prepfo='Yes';
if first_PFO> First_Ischemic or First_Hemorrhagic then Prepfo='No';
end;
run;
Output of interest
Obs | ID | First_Ischemic | First_Hemorrhagic | first_AFIB | first_PFO | Preafib | Prepfo |
1 | 011386 | 23SEP2004 | 10FEB2020 | . | . | ||
2 | 011386 | 23SEP2004 | 10FEB2020 | . | . | ||
3 | 011427 | 10JUL2003 | . | . | . | ||
4 | 012666 | . | 18SEP2006 | 20JUN2002 | . | Yes | |
5 | 020485 | 15JUL2009 | . | . | . | ||
6 | 023434 | . | 18OCT2002 | 21JUN2003 | . | No | |
7 | 032462 | . | . | . | 20JUN2002 | ||
8 | 034627 | 01DEC2009 | 30NOV2009 | . | 10FEB2020 | No |
if First_AFIB ne . then do;
if First_AFIB<First_Ischemic or First_AFIB<First_Hemorrhagic then Preafib='Yes';
else if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(First_AFIB> First_Ischemic or First_AFIB>First_Hemorrhagic) then Preafib='No';
end;
if first_PFO ne . then do;
if first_PFO<First_Ischemic or first_PFO<First_Hemorrhagic then Prepfo='Yes';
else if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(first_PFO> First_Ischemic or first_PFO<First_Hemorrhagic) then Prepfo='No';
end;
You will find it extremely helpful if you work out the logic in your head and/or on a piece of paper, so all possible situations are accounted for, and then program from that.
You have to test to see if first_ischemic and first_hemmorrhagic are both missing. You haven't done that, so in your code this isn't accounted for. I think this is what you want
if first_PFO ne . then do;
if first_PFO<First_Ischemic or first_PFO<First_Hemorrhagic then Prepfo='Yes';
if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(first_PFO> First_Ischemic or first_PFO>First_Hemorrhagic) then Prepfo='No';
end;
Also your constructed IF statement
if first_PFO<First_Ischemic or First_Hemorrhagic
seems wrong to me and I have adjusted this in my code.
What do you think is wrong here. I expected ID=012666 to be "Yes" if First_AFIB<First_Ischemic or First_AFIB<First_Hemorrhagic but am getting a "NO". I have modified the code few times but not the right output.
data Prepost1;
set a;
by id;
if First_AFIB ne . then do;
if First_AFIB<First_Ischemic or First_AFIB<First_Hemorrhagic then Preafib='Yes';
if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(First_AFIB> First_Ischemic or First_AFIB>First_Hemorrhagic) then Preafib='No';
end;
if first_PFO ne . then do;
if first_PFO<First_Ischemic or first_PFO<First_Hemorrhagic then Prepfo='Yes';
if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(first_PFO> First_Ischemic or first_PFO<First_Hemorrhagic) then Prepfo='No';
end;
run;
if First_AFIB ne . then do;
if First_AFIB<First_Ischemic or First_AFIB<First_Hemorrhagic then Preafib='Yes';
else if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(First_AFIB> First_Ischemic or First_AFIB>First_Hemorrhagic) then Preafib='No';
end;
if first_PFO ne . then do;
if first_PFO<First_Ischemic or first_PFO<First_Hemorrhagic then Prepfo='Yes';
else if (not missing(first_ischemic) or not missing(first_Hemorrhagic)) and
(first_PFO> First_Ischemic or first_PFO<First_Hemorrhagic) then Prepfo='No';
end;
You will find it extremely helpful if you work out the logic in your head and/or on a piece of paper, so all possible situations are accounted for, and then program from that.
SAS very nicely provides the ability to have multiple comparisons such as a < b < c as a simple construct.
So if you want to exclude the missing values for First_afib from returning true with code like First_AFIB<First_Ischemic then provide a lower bound for First_afib:
if 0< First_AFIB<First_Ischemic then ...
Since your values are mostly dates you could use a date literal that you expect never to appear in the data a valid such as
if '01JAN1800'd < First_AFIB<First_Ischemic then ...
What test do you think this code is performing?
First_AFIB<First_Ischemic or First_Hemorrhagic
SAS will treat that as this test:
(First_AFIB<First_Ischemic) or (First_Hemorrhagic)
So if FIRST_HEMORRHAGIC is not missing and not zero (zero is the date '01JAN1960'd) then that test will be TRUE.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.