BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SASDatesKillMe
Obsidian | Level 7

I have a survey date variable (Q1Date) and a list of chemo infusion dates (I1-I3). I'd like to determine whether any of I1-I3 are within 0-28 days before Q1Date. In reality I have five version of survey dates and 15 infusion dates I'll need to do this for, but for simplicity I have tried to do this first with just one Q1Date and a infusion dates.

 

The array below seems to work, but it seems overly cumbersome. Could any suggest a way to simplify this, perhaps using a simpler array or perhaps PROC SQL?

 

Thanks!

 

DATA Dates;
INPUT ID $2 Q1Date mmddyy11. I1 mmddyy11. I2 mmddyy11. I3 mmddyy11.;
CARDS;
01 07/01/2000 01/01/1990 06/10/2000           
02 07/01/2000 01/01/2000 02/01/2000 06/15/2000
03 07/01/2000 07/10/2000 02/01/2000 01/01/2000
04            07/10/2000 02/01/2000 01/01/2000
;
RUN;

DATA Dates;
SET Dates;
FORMAT Q1Date I1-I3 mmddyy10.;
RUN;

PROC PRINT DATA=Dates;
RUN;


DATA Test;
SET Dates;

Q1OnChemo=0; *Set Q1OnChemo to zero at first;

ARRAY InfusionDate[3] I1 I2 I3; *List of infusion dates;
DO i=1 TO 3;
	IF NOT MISSING(Q1Date) & /*Do only if Q1Date isn't missing*/
	NOT MISSING(InfusionDate(i)) & /*Do only if InfusionDate isn't missing*/
	Q1Date - InfusionDate(i) GE 0 & /*Do only if Q1Date is on or after InfusionDate*/
	Q1Date - InfusionDate(i) LE 28	/*Do only if InfusionDate is within 28 days before Q1Date*/
	THEN Q1OnChemo=1;
END;
DROP i;

IF (I1=. & I2=. & I3=.) | Q1Date=. THEN Q1OnChemo=.; *Set Q1OnChemo to missing if Q1Date or ALL the infusion dates are missing;
RUN;

PROC PRINT DATA=TEST;
RUN;
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Please see for one way:

DATA Dates;
   INPUT ID $2 Q1Date mmddyy11. I1 mmddyy11. I2 mmddyy11. I3 mmddyy11.;
   FORMAT Q1Date I1-I3 mmddyy10.;

CARDS;
01 07/01/2000 01/01/1990 06/10/2000           
02 07/01/2000 01/01/2000 02/01/2000 06/15/2000
03 07/01/2000 07/10/2000 02/01/2000 01/01/2000
04            07/10/2000 02/01/2000 01/01/2000
;
RUN;

DATA Test;
SET Dates;

Q1OnChemo=0; *Set Q1OnChemo to zero at first;

ARRAY InfusionDate[3] I1 I2 I3; *List of infusion dates;
IF NOT MISSING(Q1Date) then DO i=1 TO 3;
   Q1OnChemo = (0 le (Q1Date - InfusionDate(i)) le 28);
   if Q1OnChemo=1 then leave;
END;
DROP i;

IF (I1=. & I2=. & I3=.) | Q1Date=. THEN Q1OnChemo=.; *Set Q1OnChemo to missing if Q1Date or ALL the infusion dates are missing;
RUN;

Likely one issue you were having is that your DO loop evaluated all of the infusion dates and you were getting the result for the last one. The LEAVE instruction says to exit the loop if the condition is met.

 

Note that SAS allows comparing in a value < something < other value as a single test, which makes the code easier tor read.

 

= (0 le (Q1Date - InfusionDate(i)) le 28) returns the LOGICAL value of the comparison true(1) or false(0).

note that if the infusion date is missing the subtraction returns a missing value and the test result is false.

 

I skip evaluating the do loop entirely if the Q1Date is missing as otherwise you test for that condition for each infusion.

 

With multiple survey dates the have an array of those and loop through them out side the loop over the infusion dates.

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

First off, and it seems to be becoming the norm here all of a sudden, avoid coding all in UPPERCASE!  You see, its sounds like shouting and it makes code so much harder to read.  

I can tell you that proc sql will not make you code less cumbersome in this scenario.  SQL deals with normalised data very well, but transposed data which is what you have it is not designed for.

Perhaps some changes to the code:

data want;
  set dates;
  if q1date ne . then 
    q1onchemo=ifn(max(of i:) - q1date le 28,1,0);
run;

This uses functions rather than loops, so we take the max of the array first, and if that is within 28 days set the flag to 1.

 

And a similar thing using and array - note this will find any value <= 28 days, not just the max - wasn't sure from your description which you needed:

data want;
  set dates;
  array i{3};
  if q1date ne . then do; 
    do j=1 to 3;
      if i{j} - q1date le 28 then q1onchemo=1;
    end;
  end;
run;
SASDatesKillMe
Obsidian | Level 7

Much appreciated. Good to know that SQL is better suited for long data.

ballardw
Super User

Please see for one way:

DATA Dates;
   INPUT ID $2 Q1Date mmddyy11. I1 mmddyy11. I2 mmddyy11. I3 mmddyy11.;
   FORMAT Q1Date I1-I3 mmddyy10.;

CARDS;
01 07/01/2000 01/01/1990 06/10/2000           
02 07/01/2000 01/01/2000 02/01/2000 06/15/2000
03 07/01/2000 07/10/2000 02/01/2000 01/01/2000
04            07/10/2000 02/01/2000 01/01/2000
;
RUN;

DATA Test;
SET Dates;

Q1OnChemo=0; *Set Q1OnChemo to zero at first;

ARRAY InfusionDate[3] I1 I2 I3; *List of infusion dates;
IF NOT MISSING(Q1Date) then DO i=1 TO 3;
   Q1OnChemo = (0 le (Q1Date - InfusionDate(i)) le 28);
   if Q1OnChemo=1 then leave;
END;
DROP i;

IF (I1=. & I2=. & I3=.) | Q1Date=. THEN Q1OnChemo=.; *Set Q1OnChemo to missing if Q1Date or ALL the infusion dates are missing;
RUN;

Likely one issue you were having is that your DO loop evaluated all of the infusion dates and you were getting the result for the last one. The LEAVE instruction says to exit the loop if the condition is met.

 

Note that SAS allows comparing in a value < something < other value as a single test, which makes the code easier tor read.

 

= (0 le (Q1Date - InfusionDate(i)) le 28) returns the LOGICAL value of the comparison true(1) or false(0).

note that if the infusion date is missing the subtraction returns a missing value and the test result is false.

 

I skip evaluating the do loop entirely if the Q1Date is missing as otherwise you test for that condition for each infusion.

 

With multiple survey dates the have an array of those and loop through them out side the loop over the infusion dates.

SASDatesKillMe
Obsidian | Level 7

This is great. Thanks. This much simpler array is very helpful.

 

The loop was indeed returning the result of the last infusion date in some instances, so the LEAVE instruction is a big improvement too.

 

Could you help with how to add the additional array of survey dates? I think I get that I'd just write an array of survey dates, but I'm stuck on how to specify that only the Q1Date is relevant for the new variable "Q1OnChemo" and that only the Q2Date is relevant for the new variable "Q2OnChemo," etc... I can't seem to figure out how to avoid having each iteration write onto the Q1OnChemo variable.

 

Would this longer version of your single array be a reasonable alternative?

 

DATA Dates;
INPUT ID $2 Q1Date mmddyy11. Q2Date mmddyy11. Q3Date mmddyy11. I1 mmddyy11. I2 mmddyy11. I3 mmddyy11.;
CARDS;
01 07/01/2000 08/01/2000 09/01/2000 01/01/1990 06/10/2000 07/15/2000
02 07/01/2000            09/01/2000 01/01/1990 06/10/2000 07/15/2000
03 07/01/2000 08/01/2000 09/01/2000 01/01/1990            07/15/2000
;
RUN;

DATA Dates;
SET Dates;
FORMAT Q1Date Q2Date Q3Date I1-I3 mmddyy10.;
RUN;

PROC PRINT DATA=Dates;
RUN;


DATA Test;
SET Dates;
Q1OnChemo=0;
Q2OnChemo=0;
Q3OnChemo=0;

ARRAY InfusionDate[3] I1 I2 I3; *List of infusion dates;
IF NOT MISSING(Q1Date) THEN DO i=1 TO 3;
   Q1OnChemo = (0 LE (Q1Date - InfusionDate(i)) LE 28); IF Q1OnChemo=1 THEN LEAVE;
   Q2OnChemo = (0 LE (Q2Date - InfusionDate(i)) LE 28); IF Q2OnChemo=1 THEN LEAVE;
   Q3OnChemo = (0 LE (Q3Date - InfusionDate(i)) LE 28); IF Q3OnChemo=1 THEN LEAVE;
END;
DROP i;

IF (I1=. & I2=. & I3=.) | Q1Date=. THEN Q1OnChemo=.; *Set Q1OnChemo to missing if Q1Date or ALL the infusion dates are missing;
IF (I1=. & I2=. & I3=.) | Q2Date=. THEN Q2OnChemo=.; *Set Q2OnChemo to missing if Q1Date or ALL the infusion dates are missing;
IF (I1=. & I2=. & I3=.) | Q3Date=. THEN Q3OnChemo=.; *Set Q3OnChemo to missing if Q1Date or ALL the infusion dates are missing;
RUN;

PROC PRINT DATA=TEST;
RUN;

  

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1546 views
  • 2 likes
  • 3 in conversation