Hello, I have an array that searches seven variables for numbers. The seven variables contain the number of days between two dates (this could range from -300 to +300). Based on the numbers it finds, each case is classified into 1 of 8 categories. I want certain categories to take precedence over others, so I have created IF/ELSE statements so that the best match is chosen. My code is the following:
Treat_cat = "No treatment listed";
array diff{7} Days_bt_PHAC_Tx1-Days_bt_PHAC_Tx7;
do i= 1 to 7;
if diff(i) GE -14 and diff(i) LE 3 then Treat_cat = "1. Within 3 days";
else if diff(i) GE 4 and diff(i) LE 7 then Treat_cat = "2. Between 4 to 7 days";
else if diff(i) GE 8 and diff(i) LE 14 then Treat_cat = "3. Between 8 to 14 days";
else if diff(i) GE 15 and diff(i) LE 30 then Treat_cat = "4. Between 15 to 30 days";
else if diff(i) GE 31 and diff(i) LE 365 then Treat_cat = "5. Between 31 to 365 days";
else if diff(i) GT 365 then Treat_cat = "6. More than 365 days";
else if diff(i) NE '' and diff(i) LE -15 then Treat_cat = "7. Below categories";
end;
drop i;
However, I'm finding that although the code assigns every case to a category, it assigns the last matching category rather than the best one. For example, if the days between dates for a case are 0 and 16, the case is assigned to "Between 15 to 30 days" instead of "Within 3 days". If the same case had 0, 16, and 76, it would be assigned to "Between 31 to 365 days". It seems to always take the last category in my else/if statements instead of the first.
Any help would be appreciated.
So move the IF/THEN tree outside of the DO loop. Use the DO loop to find the min instead.
array diff Days_bt_PHAC_Tx1-Days_bt_PHAC_Tx7;
do i= 1 to dim(diff);
if diff[i]>=-14 then min_diff = min(min_diff , diff[i] );
end;
if -14 le min_diff le 3 then ...
else ...
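A filled-in version of the outline above might look like the following. This is only a sketch: the data set names have and want and the four test rows are made up for illustration, the category labels and cut-offs are copied from the original question, and the fallback to "7. Below categories" when every non-missing value is below -14 is an assumption about what is wanted. The LENGTH statement is included because otherwise the first assignment would fix the length of Treat_cat and longer labels could be truncated.

```sas
/* Hypothetical test data: four cases, seven day-differences each. */
data have;
  input id Days_bt_PHAC_Tx1-Days_bt_PHAC_Tx7;
  datalines;
1 0 16 76 . . . .
2 -100 2 . . . . .
3 -100 . . . . . .
4 . . . . . . .
;
run;

data want;
  set have;
  length Treat_cat $30;   /* long enough for the longest label */
  array diff{7} Days_bt_PHAC_Tx1-Days_bt_PHAC_Tx7;

  /* Smallest value that is >= -14; stays missing if there is none. */
  min_diff = .;
  do i = 1 to dim(diff);
    if diff[i] >= -14 then min_diff = min(min_diff, diff[i]);
  end;

  /* Classify once, after the loop, so later elements cannot overwrite. */
  if not missing(min_diff) then do;
    if      min_diff <= 3   then Treat_cat = "1. Within 3 days";
    else if min_diff <= 7   then Treat_cat = "2. Between 4 to 7 days";
    else if min_diff <= 14  then Treat_cat = "3. Between 8 to 14 days";
    else if min_diff <= 30  then Treat_cat = "4. Between 15 to 30 days";
    else if min_diff <= 365 then Treat_cat = "5. Between 31 to 365 days";
    else                         Treat_cat = "6. More than 365 days";
  end;
  /* Assumed fallback: any non-missing value left must be <= -15. */
  else if n(of diff[*]) > 0 then Treat_cat = "7. Below categories";
  else Treat_cat = "No treatment listed";

  drop i min_diff;
run;
```

With these test rows, cases 1 and 2 should both land in "1. Within 3 days" (the -100 in case 2 is skipped), case 3 in "7. Below categories", and case 4 in "No treatment listed".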
I may collapse the number of categories later, but for right now I'm using them to look up some of the far outliers (e.g. the below category and above 365).
I think you still need to have an array of TREAT_CAT variables.
array TREAT_CAT[7] $96;
Then you could see how the categories were assigned for all 7 items.
I think your life is going to be a lot easier with a format (and also an array of treat cat variables, as already suggested):
proc format;
value fdiff
-14-3='between -14 and 3'
4-7='between 4 and 7'
/* (list your others here) */
;
run;
data test;
set test;
array d {*} days_btw_1-days_btw_7;
array cats {*} $50 cat1-cat7; /* character, so it can hold the format labels */
do i=1 to dim(d);
cats[i]=put(d[i],fdiff.);
end;
run;
Because you are making only one Treat_cat variable, its value is overwritten on each iteration of the DO loop, so only the last assignment survives.
When the last variable has a value, you are effectively running this DO loop:
do i= 7;
If you want 7 variables then define another array and modify the code to use it.
If you want only one TREAT_CAT variable then what value do you want to base it on? The first? last? min? max? mean?
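To see the overwriting happen, a small trace like the one below can help. It is only an illustration: the variables d1-d3, their values 0, 16 and 76 (taken from the example in the question), and the helper variable value are made up, and only three of the original branches are kept for brevity.

```sas
data _null_;
  /* Hypothetical values matching the 0, 16, 76 example in the question. */
  array diff{3} d1-d3 (0 16 76);
  length Treat_cat $30;
  do i = 1 to dim(diff);
    if      -14 <= diff[i] <= 3   then Treat_cat = "1. Within 3 days";
    else if  15 <= diff[i] <= 30  then Treat_cat = "4. Between 15 to 30 days";
    else if  31 <= diff[i] <= 365 then Treat_cat = "5. Between 31 to 365 days";
    value = diff[i];
    /* Write the state after each iteration to the log. */
    put i= value= Treat_cat=;
  end;
run;
```

The log should show Treat_cat changing on every pass: the ELSE chain only orders the tests within a single iteration, so whatever the third element assigns is what ends up in the output row.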
Thanks everyone for the feedback. My original thought was to create 1 variable with different values as opposed to 7 different variables. In terms of min/max etc., for the most part it's the minimum (i.e. if it had options of 0 and 25, choose 0 and assign it to "within 3 days"). However, there is an exception for values below -14: I didn't want these to take precedence (e.g. if it was -100 and 2, I wanted it to take the 2). This was why I tried to assign the categories using the else/if statements, but it sounds like this is not possible due to the way the loop over the array works.
So I might use an array to create the extra variables and then write if/then statements based on those variables to create the 1 variable that I want.
For that, use a LEAVE statement to exit the loop when it finds something:
proc format;
value fdiff
-14-3='between -14 and 3'
4-7='between 4 and 7'
/* (list your others here) */
other='not valid' /* add this at the end -- exactly as typed here */
;
run;
data test;
set test;
array d {*} days_btw_1-days_btw_7;
length c $50;
do i=1 to dim(d);
c=put(d[i],fdiff.);
if c^='not valid' then LEAVE;
end;
run;
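Note that LEAVE stops at the first variable in array order that has a valid value, which is not necessarily the smallest one. If the smallest value at or above -14 is what should win, one option is to combine a format with the min search suggested earlier in the thread. The sketch below assumes whole-day values, uses the labels and cut-offs from the original question, and invents the names treatcat, have and want plus the test rows purely for illustration.

```sas
proc format;
  value treatcat
    low - -15   = '7. Below categories'
    -14 - 3     = '1. Within 3 days'
    4 - 7       = '2. Between 4 to 7 days'
    8 - 14      = '3. Between 8 to 14 days'
    15 - 30     = '4. Between 15 to 30 days'
    31 - 365    = '5. Between 31 to 365 days'
    365 <- high = '6. More than 365 days'
    .           = 'No treatment listed';
run;

/* Hypothetical test data. */
data have;
  input id Days_bt_PHAC_Tx1-Days_bt_PHAC_Tx7;
  datalines;
1 25 0 . . . . .
2 -100 2 . . . . .
3 . . . . . . .
;
run;

data want;
  set have;
  array diff{*} Days_bt_PHAC_Tx1-Days_bt_PHAC_Tx7;
  length Treat_cat $30;

  /* Smallest value that is >= -14, ignoring anything lower. */
  min_diff = .;
  do i = 1 to dim(diff);
    if diff[i] >= -14 then min_diff = min(min_diff, diff[i]);
  end;

  /* Nothing at or above -14: fall back to the overall minimum,
     which is either <= -15 or missing. */
  if missing(min_diff) then min_diff = min(of diff[*]);

  /* The format replaces the IF/ELSE tree. */
  Treat_cat = put(min_diff, treatcat.);
  drop i min_diff;
run;
```

Keeping the cut-offs in one format means they can be changed, or the categories collapsed later, without touching the data step.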