Hello, I would like to create a flag (binary, 1 or 0) using a do loop, with a 2 month look back period to see if a test that an observation has taken a test (also a month-based binary flag) that had occurred in the current month, the previous month, and the month prior to the previous month (each test occurrence is based on a monthly flag of 1 or 0; 1 being a positive test) . For example, if the subject had a positive test in January and was also flagged positive for the test in December and November, then a new flag indicating that the months of January, December, and November will also be 1. This will also be performed for the following months for 12 more months. For more clarity, please the code here without the DO loop:
data test_wide;
set test_merge;
*January;
if (test_in_mo_3 = 1) AND (test_in_mo_2 = 1) AND (test_in_mo_1 = 1) THEN test_3mo_1 = 1;
else test_3mo_1 = 0;
*Feb;
if (test_in_mo_4 = 1) AND (test_in_mo_3 = 1) AND (test_in_mo_2 = 1) THEN test_3mo_2 = 1;
else test_3mo_2 = 0;
*March;
if (test_in_mo_5 = 1) AND (test_in_mo_4 = 1) AND (test_in_mo_3 = 1) THEN test_3mo_3 = 1;
else test_3mo_3 = 0;
*April;
if (test_in_mo_6 = 1) AND (test_in_mo_5 = 1) AND (test_in_mo_4 = 1) THEN test_3mo_4 = 1;
else test_3mo_4 = 0;
*May;
if (test_in_mo_7 = 1) AND (test_in_mo_6 = 1) AND (test_in_mo_5 = 1) THEN test_3mo_5 = 1;
else test_3mo_5 = 0;
*June;
if (test_in_mo_8 = 1) AND (test_in_mo_7 = 1) AND (test_in_mo_6 = 1) THEN test_3mo_6 = 1;
else test_3mo_6 = 0;
*July;
if (test_in_mo_9 = 1) AND (test_in_mo_8 = 1) AND (test_in_mo_7 = 1) THEN test_3mo_7 = 1;
else test_3mo_7 = 0;
*August;
if (test_in_mo_10 = 1) AND (test_in_mo_9 = 1) AND (test_in_mo_8 = 1) THEN test_3mo_8 = 1;
else test_3mo_8 = 0;
*Sept;
if (test_in_mo_11 = 1) AND (test_in_mo_10 = 1) AND (test_in_mo_9 = 1) THEN test_3mo_9 = 1;
else test_3mo_9 = 0;
*Oct;
if (test_in_mo_12 = 1) AND (test_in_mo_11 = 1) AND (test_in_mo_10 = 1) THEN test_3mo_10 = 1;
else test_3mo_10 = 0;
*Nov;
if (test_in_mo_13 = 1) AND (test_in_mo_12 = 1) AND (test_in_mo_11 = 1) THEN test_3mo_11 = 1;
else test_3mo_11 = 0;
*Dec;
if (test_in_mo_14 = 1) AND (test_in_mo_13 = 1) AND (test_in_mo_12 = 1) THEN test_3mo_12 = 1;
else test_3mo_12 = 0;
run;
Please note that test_in_mo_2 and test_in_mo_1 flags refer to the months of December and November (of the previous year), respectively. The flag variable test_in_mo_3 refers to January, test_in_mo_4 refers to February, and so on until test_in_mo_14 refers to December of the current year.
I've been struggling to make this code into a DO loop, any advice or code would be greatly appreciated. I am on SAS 9.3.
Thank you in advance.
Post some sample input data and expected output data please.
Here's one approach. You may need to expand it if you want to cover additional months.
data want;
set test_merge;
array tests {14} test_in_mo_1 - test_in_mo_14;
array flags {12} test_3mo_1 - test_3mo_12;
do _n_=1 to 12;
flags{_n_} = (tests{_n_} = tests{_n_+1} = tests{_n_+2} = 1);
end;
run;
You can use any set of brackets you want:
tests{i}
tests(i)
tests[i]
They're just alternate ways of referring to an element of the array.
Macro language could be used, but not with this approach. Instead, macro language would generate 12 assignment statements, such as:
test_3mo_1 = (test_in_mo_1 = test_in_mo_2 = test_in_mo_3 = 1);
test_3mo_2 = (test_in_mo_2 = test_in_mo_3 = test_in_mo_4 = 1);
...
But remember, you can't just insert macro language into the middle of a program. To use a %DO statement, you have to define the macro, then call it in your program.
I'm not sure if you are only dealing with a 3 month interval or a varying interval.
This may work (untested as no data provided).
data want; set test_merge; array TestIn test_in_mo_1-test_in_mo_14 ; array TestOut test_3mo_1-test_3mo_12; Do i = 1 to dim(testout); testout = ( sum(testin[i],TestIn[i+1},TestIn[i+2])=3 ); end; run;
This uses an assumption that your test_in variables are value 0/1. A sum of 3 that = 3 means that they are all 1. The sum function is used in case some are missing though likely wouldn't matter much for this bit.
The ( sum() = 3); returns a boolean result for the comparison, 1 if true and 0 if false.
If you are thinking about doing this with 2, 3, 4, 5 etc intervals you may be better pointed to Proc IML and matrices as calculating enough offsets to handle variations is obnoxsious.
Not that having this in a wide format to begin with is likely more complicated than doing with long data (one record per month(?))
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.