
Processing multiple variables in array/ do loop


Have:

- 4 variables for CD4 count (CD4_1, CD4_2, CD4_3, CD4_4)

- 4 variables for the date when the CD4 count was measured (CD4_date1, CD4_date2, CD4_date3, CD4_date4)

- 1 variable specifying the start date of the study period (start_sp)

- 1 variable specifying the end date of the study period (end_sp)

 

Want to create 3 new dichotomous variables:

- CD4_1000, indicates if the individual had a CD4 count less than 1000, with the date of the test falling between the start and end of the study period (start_sp and end_sp)

- CD4_500, indicates if the individual had a CD4 count less than 500, with the date of the test falling between the start and end of the study period (start_sp and end_sp)

- CD4_350, indicates if the individual had a CD4 count less than 350, with the date of the test falling between the start and end of the study period (start_sp and end_sp)

 

My question is how to create these 3 variables without a paragraph of if then's. This creates the 3 variables, but only incorporates the counts of the tests, not the dates:

 

array CD4 [*] CD4Count;
   CD4_350 = 0;
   CD4_500 = 0;
   CD4_1000 = 0;
   do i = 1 to dim(CD4);
      if CD4[i] < 350 then CD4_350 = 1;
      if CD4[i] < 500 then CD4_500 = 1;
      if CD4[i] < 1000 then CD4_1000 = 1;

   end;

 

How can I modify this to include the requirement that the date of the test had to occur within the study period?


Re: Processing multiple variables in array/ do loop

Some example input data and the desired output for that data would be helpful.

 

Your code as shown is only going to process 1 variable comparison, that of CD4Count.

You may have meant

array cd cd4_1 - cd4_4;

so that the array holds those four count variables. A second array is needed for the matching dates:

array d cd4_date1 - cd4_date4;

 

Your IF statements would look something like

 

if cd[i] < (value) and (start_sp le d[i] le end_sp) then do ...

If a test on the start or end date itself is not supposed to count as "within the period", use lt (or <) instead of le.

 

HOWEVER, you have a logic problem: if any count, say cd4_4, is < 350 (or missing, since a missing value is less than any number in SAS), then cd4_350, cd4_500 and cd4_1000 will all be set to 1.
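
The missing-value behaviour can be checked with a small throwaway step. This is only an illustrative sketch; the variable names `x`, `lt350` and `guarded` are made up for the demonstration.

```sas
data _null_;
   x = .;                       /* a missing CD4 count */
   lt350   = (x < 350);         /* 1: missing compares lower than any number */
   guarded = (. < x < 350);     /* 0: the extra ". <" bound excludes missing */
   put lt350= guarded=;         /* log shows: lt350=1 guarded=0 */
run;
```

Testing `not missing(x)` is an alternative guard, and it also covers special missing values such as `.A`.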

 

I am not sure what you want cd4_350 to represent, since the answer may differ for each CD4 count variable.

Suppose cd4_1=200, cd4_2=600, cd4_3=900 and cd4_4=1200. The first value sets cd4_350, cd4_500 and cd4_1000 all to 1, and the later values do not change them. This may be what you want if the interpretation is "at some time within the study period at least one of the CD4 counts was less than XXXX".
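
Under that "at least one qualifying test" interpretation, the pieces above might be combined as in the sketch below. The dataset names `have` and `want` are placeholders, the inclusive bounds assume a test on the start or end date counts as within the period, and `start_sp` and `end_sp` are assumed to be non-missing.

```sas
data want;
   set have;                              /* placeholder dataset names */
   array cd[*] cd4_1 - cd4_4;             /* CD4 counts */
   array d[*]  cd4_date1 - cd4_date4;     /* matching test dates */

   cd4_350  = 0;
   cd4_500  = 0;
   cd4_1000 = 0;

   do i = 1 to dim(cd);
      /* skip missing counts and tests dated outside the study period */
      if not missing(cd[i]) and (start_sp <= d[i] <= end_sp) then do;
         if cd[i] < 350  then cd4_350  = 1;
         if cd[i] < 500  then cd4_500  = 1;
         if cd[i] < 1000 then cd4_1000 = 1;
      end;
   end;

   drop i;
run;
```

Because the flags are only ever switched from 0 to 1, the result does not depend on the order of the four tests.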


Re: Processing multiple variables in array/ do loop

Like this?

 

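array CD4[*]      CD4_1 - CD4_4;             /* counts */
array CD4_DATE[*] CD4_date1 - CD4_date4;     /* matching test dates */
do I = 1 to dim(CD4);
  /* max() keeps a date stored on an earlier pass instead of overwriting it */
  CD4_350 =max(CD4_350 , (  . < CD4[I] <=  350) * (START_SP<CD4_DATE[I]<END_SP) * CD4_DATE[I]);
  CD4_500 =max(CD4_500 , (350 < CD4[I] <=  500) * (START_SP<CD4_DATE[I]<END_SP) * CD4_DATE[I]);
  CD4_1000=max(CD4_1000, (500 < CD4[I] <= 1000) * (START_SP<CD4_DATE[I]<END_SP) * CD4_DATE[I]);
end;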

This stores the test date when the count is within the range and the date is within the study period; otherwise it stores zero.

Each expression in parentheses is a logical test that resolves to either 0 or 1.
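
To see the 0/1 arithmetic on its own, here is a small sketch with made-up values; the dates and the names `count`, `test_date`, `in_range`, `in_period` and `result` are purely hypothetical.

```sas
data _null_;
   count     = 300;
   test_date = '15MAR2015'd;                       /* hypothetical test date */
   start_sp  = '01JAN2015'd;                       /* hypothetical study period */
   end_sp    = '31DEC2015'd;

   in_range  = (. < count <= 350);                 /* logical test: 0 or 1 */
   in_period = (start_sp < test_date < end_sp);    /* logical test: 0 or 1 */
   result    = in_range * in_period * test_date;   /* the date if both are 1, else 0 */

   put in_range= in_period= result= date9.;
run;
```

Note that a missing test date makes the product missing rather than zero, so the "zero otherwise" behaviour assumes the dates are populated.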

 

 
