CathyVI
Pyrite | Level 9

Hello,

I am unable to produce the output shown below.
I tried to use an array for my variables because in the real data I have R_Seizure_1 to R_Seizure_20.
Writing R_Seizure_1=. or R_Seizure_2=. or R_Seizure_3=. all the way to R_Seizure_20 is not efficient.
I want SAS to ignore the missing values.

data have;
input ID $6. First_Ischemic  First_Hemorrhagic R_Seizure_1 R_Seizure_2 R_Seizure_3 R_Seizure_4 ;
format First_Ischemic  First_Hemorrhagic  R_Seizure_1 R_Seizure_2 R_Seizure_3 R_Seizure_4 date9.;
informat First_Ischemic  First_Hemorrhagic  R_Seizure_1 R_Seizure_2 R_Seizure_3 R_Seizure_4 date9.;
datalines;
011396   23SEP2004  10FEB2004  .   .   .   11FEB2020 
034627   01DEC2009  30NOV2009  .   10FEB2010  .   .     
011427   11SEP2010   09AUG2010   10SEP2010   03FEB2012   .   . 
012666   .   18SEP2006   20JUN2002   .   .   .
023434   .   18OCT2002   21JUN2003   .   .   . 
020485   15JUL2019   .   .   .   15AUG2009   25JUL2010
032462   13AUG2014   .   12AUG2014    20JUN2002   .   .
011386   23SEP2004  10FEB2020  .   .   .   12AUG2015

;
run;
proc sort data=have; by id; run;

data want;
set have;
by id;
array vars R_Seizure_1--R_Seizure_4;
if R_Seizure_1--R_Seizure_4=. or First_Ischemic=. or First_Hemorrhagic=. then n='N/A';
if R_Seizure_1--R_Seizure_4 ne .  then do;
if R_Seizure_1--R_Seizure_4 > First_Ischemic or First_Hemorrhagic then n=1;
else n=0;
end;
run;

Expected output

ID     First_Ischemic  First_Hemorrhagic  R_Seizure_1  R_Seizure_2  R_Seizure_3  R_Seizure_4  n
11386  23SEP2004       10FEB2020          .            .            .            12AUG2015    no
11396  23SEP2004       10FEB2004          .            .            .            11FEB2020    yes
11427  11SEP2010       09AUG2010          10SEP2010    03FEB2012    .            .            yes
12666  .               18SEP2006          20JUN2002    .            .            .            no
20485  15JUL2019       .                  .            .            15AUG2009    25JUL2010    no
23434  .               18OCT2002          21JUN2003    .            .            .            yes
32462  13AUG2014       .                  12AUG2014    20JUN2002    .            .            no
34627  01DEC2009       30NOV2009          .            10FEB2010    .            .            yes
ACCEPTED SOLUTION
PaigeMiller
Diamond | Level 26

Depending on whether you, @CathyVI, or @Kurt_Bremser is correct, you want some variation of this statement (note: no arrays needed):

if max(of R_Seizure_1-R_Seizure_4) < min(First_Ischemic, First_Hemorrhagic) then n=1;

If that's not exactly what you want, then your homework assignment is to figure out what variation of the above code is correct for you.
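
One possible variation is sketched below. It assumes that N should be a character flag set to 'yes' when the latest seizure date falls after the latest stroke date (this rule happens to reproduce the expected output in the question), and to 'N/A' when a row has no seizure dates or no stroke dates at all. The data set names HAVE and WANT come from the question; the rule itself is an assumption, so adjust the comparison if your definition of post-stroke differs.

```sas
data want;
  set have;
  length n $3;
  /* Assumption: N/A only when ALL seizure dates or ALL stroke dates are missing */
  if n(of R_Seizure_1-R_Seizure_4) = 0
     or n(First_Ischemic, First_Hemorrhagic) = 0 then n = 'N/A';
  /* MAX ignores missing arguments, so partially missing rows still compare */
  else if max(of R_Seizure_1-R_Seizure_4)
          > max(First_Ischemic, First_Hemorrhagic) then n = 'yes';
  else n = 'no';
run;
```

For the real data, the same statements should work with the list widened to R_Seizure_1-R_Seizure_20.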

--
Paige Miller

12 REPLIES
PaigeMiller
Diamond | Level 26

Replace this:

array vars R_Seizure_1--R_Seizure_4;
if R_Seizure_1--R_Seizure_4=. or First_Ischemic=. or First_Hemorrhagic=. then n='N/A';
if R_Seizure_1--R_Seizure_4 ne .  then do;
if R_Seizure_1--R_Seizure_4 > First_Ischemic or First_Hemorrhagic then n=1;
else n=0;

with this:

array vars R_Seizure_1--R_Seizure_4;
if n(of vars{*})=0 and First_Ischemic=. or First_Hemorrhagic=. then n='N/A';
if n(of vars{*})=4 then do;
/* if R_Seizure_1--R_Seizure_4 > First_Ischemic or First_Hemorrhagic then n=1; else n=0; */

Above, one line is commented out because I cannot figure out what you are trying to do (and in fact, I am guessing about the other lines as well). I'm pretty sure this has been requested from you in the past: please DESCRIBE what you are doing in words. DESCRIBE what each line is supposed to be doing. Do NOT make us guess what your incorrect code is supposed to be doing. Do this for every post in the future, as well as for this one.

--
Paige Miller
CathyVI
Pyrite | Level 9

@PaigeMiller Sorry. I wrote an extensive message, but I made a small cut-and-paste mistake and could not undo it in this community post.

What I want is to find when

R_Seizure_1 or R_Seizure_2 or R_Seizure_3 or R_Seizure_4 is greater than the First_Ischemic or First_Hemorrhagic date, and set a variable called N to 'Yes'.
In the line you commented out, I want N to indicate whether any seizure date comes before the first stroke dates (ischemic or hemorrhagic). If this is true, then the seizure is a post_stroke (n=yes), but if the seizure date comes after either stroke date, then post_stroke is no (n=no).

@Tom I want the default nmiss(of R_Seizure_1--R_Seizure_4), because if I had SAS treat ANY missing value as 0, I would lose a lot of records to missing. Only when ALL of the seizure dates are missing should SAS treat the output as missing.
@ballardw I want N to be a character variable.

Kurt_Bremser
Super User

@CathyVI wrote:

@PaigeMiller Sorry. I wrote an extensive message, but I made a small cut-and-paste mistake and could not undo it in this community post.

What I want is to find when

R_Seizure_1 or R_Seizure_2 or R_Seizure_3 or R_Seizure_4 is greater than the First_Ischemic or First_Hemorrhagic date, and set a variable called N to 'Yes'.
In the line you commented out, I want N to indicate whether any seizure date comes before the first stroke dates (ischemic or hemorrhagic). If this is true, then the seizure is a post_stroke (n=yes), but if the seizure date comes after either stroke date, then post_stroke is no (n=no).


You contradict yourself.

In the first sentence, you say you want "Yes" when any seizure comes after a stroke, but then you say at the end of the second sentence that it should be "No". Please make up your mind.

CathyVI
Pyrite | Level 9

@Kurt_Bremser This is a typo...

In the line you commented out, I want N to indicate whether any seizure date comes before the first stroke dates (ischemic or hemorrhagic). If this is true, then the seizure is a post_stroke (n=yes), but if the seizure date DOES NOT come after either stroke date, then post_stroke is no (n=no).

Kurt_Bremser
Super User

@CathyVI wrote:

@Kurt_Bremser This is a typo...

In the line you commented out, I want N to indicate whether any seizure date comes before the first stroke dates (ischemic or hemorrhagic). If this is true, then the seizure is a post_stroke (n=yes), but if the seizure date DOES NOT come after either stroke date, then post_stroke is no (n=no).


If it is before, you want yes, but if it is not after, you want no.

Note: "before" and "not after" indicate the same time frame in English.

Tom
Super User

This is not valid syntax:

if R_Seizure_1--R_Seizure_4=.

The = comparison operator can only compare 2 values.

Please explain in words what test you are trying to perform.

Do you want the result to be TRUE when ALL of the values are missing?

n(of R_Seizure_1--R_Seizure_4) = 0

Or when ANY of the values are missing?

nmiss(of R_Seizure_1--R_Seizure_4) > 0

Or even simpler just:

nmiss(of R_Seizure_1--R_Seizure_4)

since in boolean expressions SAS will treat 0 (or missing) as FALSE and any other number as TRUE.
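
As a minimal sketch of the difference between the two tests, the step below counts nonmissing and missing values for one row. The date values and the names n_present, n_missing, all_missing and any_missing are made up for illustration.

```sas
data _null_;
  /* Hypothetical row: two of the four seizure dates are missing */
  R_Seizure_1 = .;
  R_Seizure_2 = '10FEB2010'd;
  R_Seizure_3 = .;
  R_Seizure_4 = '11FEB2020'd;

  n_present = n(of R_Seizure_1-R_Seizure_4);      /* number of nonmissing values */
  n_missing = nmiss(of R_Seizure_1-R_Seizure_4);  /* number of missing values    */

  all_missing = (n_present = 0);  /* 1 only when every date is missing     */
  any_missing = (n_missing > 0);  /* 1 when at least one date is missing   */

  put n_present= n_missing= all_missing= any_missing=;
run;
```

With these values the log should show n_present=2 n_missing=2 all_missing=0 any_missing=1, so the "ALL missing" test and the "ANY missing" test give different answers for the same row.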

Astounding
PROC Star
Not invalid, just incorrect.
R_Seizure_1--R_Seizure_4 actually is equivalent to
R_Seizure_1+R_Seizure_4
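
A small sketch of that point, using made-up numeric values rather than dates: outside of a variable list, the double dash is read as a minus sign followed by a unary minus.

```sas
data _null_;
  /* Hypothetical values chosen so the arithmetic is easy to follow */
  R_Seizure_1 = 5;
  R_Seizure_4 = 3;

  /* In an expression this is R_Seizure_1 - (-R_Seizure_4), not a variable list */
  result = R_Seizure_1--R_Seizure_4;
  same   = R_Seizure_1 + R_Seizure_4;

  put result= same=;
run;
```

If the expression is parsed as described above, both values come out equal. The double dash only means "all variables from here to there" where SAS expects a variable list, such as after OF or in an ARRAY statement.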
ballardw
Super User

Is the variable N supposed to be numeric or character? The code you have written creates it as character, because its first use is n='N/A'. You then use statements that try to assign numeric values, which triggers automatic numeric-to-character conversion.

If you want to display 'N/A' for a numeric variable, assign a special missing value and create a custom format to display it.
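
A minimal sketch of that approach follows. The format name YNNA, the data set DEMO and the variable FLAG are hypothetical, and .N is just one of the available special missing values.

```sas
proc format;
  /* .N is a special missing value; this format displays it as N/A */
  value ynna 1  = 'Yes'
             0  = 'No'
             .N = 'N/A';
run;

data demo;
  /* Hypothetical flag values, one observation each */
  do flag = 1, 0, .N;
    output;
  end;
  format flag ynna.;
run;

proc print data=demo;
run;
```

The variable stays numeric, so it can still be used in comparisons and summaries, while printed output shows the labels.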

Patrick
Opal | Level 21

@CathyVI 

The code below populates variable FLG with values that match your variable N in the sample data.

Please add additional rows of data for cases where you believe the code below won't return what you want. Given the ongoing discussion, I believe sample data with the desired outcome covering all your cases will get us to your desired solution faster.

data have;
    input ID $ First_Ischemic :date9. First_Hemorrhagic :date9. R_Seizure_1 :date9. R_Seizure_2 :date9. R_Seizure_3 :date9. R_Seizure_4 :date9. n $;
    format First_Ischemic First_Hemorrhagic R_Seizure_1 R_Seizure_2 R_Seizure_3 R_Seizure_4 date9.;
    datalines;
11386 23SEP2004 10FEB2020 . . . 12AUG2015 no
11396 23SEP2004 10FEB2004 . . . 11FEB2020 yes
11427 11SEP2010 09AUG2010 10SEP2010 03FEB2012 . . yes
12666 . 18SEP2006 20JUN2002 . . . no
20485 15JUL2019 . . . 15AUG2009 25JUL2010 no
23434 . 18OCT2002 21JUN2003 . . . yes
32462 13AUG2014 . 12AUG2014 20JUN2002 . . no
34627 01DEC2009 30NOV2009 . 10FEB2010 . . yes
;
run;

proc format;
  value yesno (default=3)
    1='yes'
    0='no'
    other='n/a'
    ;
run;

data want;
  set have;
  length flg 3;
  format flg yesno.;
  flg = max(of R_Seizure_:) > max(First_Ischemic,First_Hemorrhagic); 
run;

proc print data=want;
run;

[Screenshot: PROC PRINT output]

Tom
Super User

An array does not really help with this problem since there is no need to loop.

Use the OF keyword to pass a variable list when calling a function that takes a flexible number of arguments.

First let's add a few more examples to handle the missing-value issues.

data have;
  input ID $6. First_Ischemic  First_Hemorrhagic R_Seizure_1-R_Seizure_3 ;
  format First_Ischemic  First_Hemorrhagic  R_Seizure_1-R_Seizure_3 date9.;
  informat First_Ischemic  First_Hemorrhagic  R_Seizure_1-R_Seizure_3 date.;
datalines;
011386 23SEP2004 10FEB2020         .         . 12AUG2015
011396 23SEP2004 10FEB2004         .         . 11FEB2020 
011427 11SEP2010 09AUG2010 10SEP2010 03FEB2012         . 
012666         . 18SEP2006 20JUN2002         .         .
020485 15JUL2019         .         . 15AUG2009 25JUL2010
023434         . 18OCT2002 21JUN2003         .         .
032462 13AUG2014         . 12AUG2014 20JUN2002         .
034627 01DEC2009 30NOV2009         . 10FEB2010         .
555555 01JAN2024         .         .         .         .
666666         .         . 01JAN2024         .         .
777777         .         .         .         .         .
888888 01JAN2024         . 01JAN2023 01MAY2024         .
;

Now let's create some flags to indicate whether there are any seizure dates or stroke dates by using the N() function. We can then use MIN and/or MAX to check whether ANY of the seizure dates were before the first stroke, or whether any seizure dates were after the first stroke. You could also test whether ALL of the seizure dates are after the first stroke.

proc format;
 value ynu 0='No' 1='Yes' .='N/A';
run;

data want;
  set have;
  Any_Seizure = 0<N(of R_Seizure_1-R_Seizure_3);
  Any_Stroke = 0<N(of First_Ischemic  First_Hemorrhagic);
  if Any_seizure and Any_Stroke then do;
     Any_pre = min(of R_Seizure_1-R_Seizure_3) < min(of First_Ischemic  First_Hemorrhagic);
     All_pre = max(of R_Seizure_1-R_Seizure_3) < min(of First_Ischemic  First_Hemorrhagic);
     Any_post = max(of R_Seizure_1-R_Seizure_3) > min(of First_Ischemic  First_Hemorrhagic);
     All_post = min(of R_Seizure_1-R_Seizure_3) > min(of First_Ischemic  First_Hemorrhagic);
  end;
  format any_: all_: ynu.;
run;

Results:

[Screenshot of the resulting data set]

CathyVI
Pyrite | Level 9

@Tom @PaigeMiller @Patrick @Kurt_Bremser @Astounding 

THANK YOU ALL!!! I appreciate all your comments and guidance. All of you are right. If I could pick multiple solutions I would pick them all, but the SAS community only allows one, so I picked the one I found simplest to understand, knowing that I am still learning SAS and am not an expert yet. Thank you all again.
