arrays

Contributor
Posts: 63

Hi,

 

data patient_medication;
input patient_ID Medication $ Dose (Date_begin Date_end) (: mmddyy10.);
format Date_begin Date_end mmddyy10.;
datalines;
1 A 3 05/08/2009 06/09/2010
1 B 1 04/04/2009 12/12/2009
2 X 5 06/08/2009 09/09/2010
2 Y 2 08/04/2010 10/10/2010 ;
run;

proc sort data=patient_medication;
by patient_ID Date_begin Date_end;
run;

proc sql noprint;
select min(Date_begin), max(Date_end) into :first_date, :last_date from patient_medication;
quit;

data max_drug;
set patient_medication;
by patient_ID Date_begin Date_end;
array drug_day[&first_date : &last_date] _temporary_;
if first.patient_ID then call missing(of drug_day[*]);

do dt=Date_begin to Date_end;
drug_day[dt]+dose;

if last.patient_ID then do; max_dose_day=max(of drug_day[*]);
output;
end;
keep patient_ID max_dose_day;
run;

 

In the above program a temporary array drug_day is created. Which elements are stored in that array? Can you please explain this step:

do dt=Date_begin to Date_end;
drug_day[dt]+dose;

Occasional Contributor
Posts: 12

Re: arrays

Hello Molla

In this block of code, dt is the loop variable used as the index into the temporary array drug_day, which is defined by the array statement on the fourth line of the data step that creates max_drug. The array's lower bound is the earliest date value in the input records read after the datalines statement, and its upper bound is the latest date value. The proc sql step that follows the sort selects the minimum and maximum date values and stores them in the macro variables first_date and last_date. Hence the array is indexed by date, and its range runs from the minimum to the maximum date read in.

It is a classic loop structure in SAS.

So for drug_day[dt] + Dose: in the first instance dt is 04/04/2009, the minimum date value, and Dose is 1.

do dt=Date_begin to Date_end;
drug_day[dt]+dose;


The doses for each patient are being accumulated by the date the doses were given.
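To make the accumulation concrete, here is a minimal sketch on a made-up one-week range. The bounds 17991 and 17997 (04/04/2009 thru 04/10/2009), the two doses, and the variable total are assumptions chosen for illustration only; they are not the data from the question.

```sas
data _null_;
  /* seven elements, indexed 17991 (04/04/2009) thru 17997 (04/10/2009) */
  array drug_day[17991:17997] _temporary_;

  /* hypothetical first medication: dose 1 on all seven days */
  do dt = 17991 to 17997;
    drug_day[dt] + 1;
  end;

  /* hypothetical second medication: dose 3 on the last three days only */
  do dt = 17995 to 17997;
    drug_day[dt] + 3;
  end;

  /* write the running total per day to the log */
  do dt = 17991 to 17997;
    total = drug_day[dt];
    put dt= mmddyy10. total=;
  end;
run;
```

The first four days should show a total of 1 and the last three a total of 4, because both loops touched those elements.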

Contributor
Posts: 63

Re: arrays

drug_day[dt]+dose;

What will be the value of drug_day[04/04/2009]?

Super User
Posts: 5,093

Re: arrays

Unless you have dates going back to 1960, the array subscript will be out of range.  The value of 04/04/2009 is 4 divided by 4 divided by 2009 ... in other words, a very small fraction a little greater than zero.
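A small sketch of the difference, assuming you want to compare the two forms in the log. The variable names as_arithmetic and as_date are made up for the example.

```sas
data _null_;
  as_arithmetic = 04/04/2009;    /* plain division: 4 / 4 / 2009 */
  as_date       = '04APR2009'd;  /* SAS date literal */
  put as_arithmetic= best12.;
  put as_date= as_date= mmddyy10.;  /* as_date=17991 as_date=04/04/2009 */
run;
```

Only the date literal gives the SAS date value 17991, i.e. the number of days since 01/01/1960.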

Super User
Posts: 5,093

Re: arrays

Also note that the program is open to question.

 

If the date ranges do not overlap for a given patient, this code merely finds the maximum DOSE value for each patient.  There are very easy ways to accomplish that without complicating things with arrays.
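For the non-overlapping case, one of those easy ways might look like the sketch below. It assumes the patient_medication dataset from the question; the output names max_dose_simple and max_dose are made up for illustration.

```sas
proc sql;
  create table max_dose_simple as
  select patient_ID,
         max(Dose) as max_dose   /* largest single dose per patient */
  from patient_medication
  group by patient_ID;
quit;
```

This only matches the array program when no two date ranges for the same patient overlap.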

 

If the date ranges do overlap for a given patient, this code adds together the DOSE values for different medications before finding the maximum value.  That could conceivably be a good result, but it's difficult to imagine.

PROC Star
Posts: 554

Re: arrays

Also, I believe an end statement is missing somewhere :)

Occasional Contributor
Posts: 12

Re: arrays

Hello Molla

correction:

 

The array indexes are the SAS date values derived from the date strings (Date_begin and Date_end) that are read in via the datalines. The mmddyy10. informat on the input statement tells SAS to read these values as SAS dates, so a Date_begin of 04/04/2009 is stored as SAS date 17991. For each patient entry, the loop goes from the SAS date value of Date_begin to the SAS date value of Date_end and adds the dose amount to EACH array element in that date range. That is what the code is doing, whether correct or not.

So for the entry '1 B 1 04/04/2009 12/12/2009', with a Date_begin of 04/04/2009 (SAS date 17991) and a Date_end of 12/12/2009 (SAS date 18243), the Dose amount of 1 is added to each element from drug_day[17991] thru drug_day[18243]. The two dates are 252 days apart, so that is 253 elements counting both end points, or the 1st thru the 253rd position in the array.

The temporary array is allocated once, with a lower bound of the minimum date value in the dataset and an upper bound of the maximum date value, and its values are retained from one observation to the next until call missing resets them at the first record of each patient. For each observation, only the elements from the actual Date_begin index thru the actual Date_end index have the dose added. So for the entry '2 X 5 06/08/2009 09/09/2010', only drug_day[18056] thru drug_day[18514] (the 66th thru the 524th position) are updated; the dates are 458 days apart, which makes 459 elements. Check how SAS treats dates and SAS array references.
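If you want to check the date values yourself, a quick sketch like the one below writes them to the log. The variable names begin_b, end_b, n_elements and position are made up for the example; the dates come from the '1 B' and '2 X' entries.

```sas
data _null_;
  begin_b = '04APR2009'd;               /* Date_begin of entry 1 B */
  end_b   = '12DEC2009'd;               /* Date_end of entry 1 B   */
  n_elements = end_b - begin_b + 1;     /* both end points counted */
  put begin_b= end_b= n_elements=;      /* begin_b=17991 end_b=18243 n_elements=253 */

  /* position of 06/08/2009 in an array whose lower bound is 17991 */
  position = '08JUN2009'd - begin_b + 1;
  put position=;                        /* position=66 */
run;
```

A span of 252 days covers 253 array elements when both the first and the last day are counted.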

 

Paste the code into the SAS University Edition editor and step thru it. You will have to add an end statement after drug_day[dt]+dose and move the ';' after the last datalines entry onto its own line.
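With those two fixes applied, the program might look like the sketch below. Nothing else is changed from the original post apart from indentation and comments.

```sas
data patient_medication;
  input patient_ID Medication $ Dose (Date_begin Date_end) (: mmddyy10.);
  format Date_begin Date_end mmddyy10.;
  datalines;
1 A 3 05/08/2009 06/09/2010
1 B 1 04/04/2009 12/12/2009
2 X 5 06/08/2009 09/09/2010
2 Y 2 08/04/2010 10/10/2010
;
run;

proc sort data=patient_medication;
  by patient_ID Date_begin Date_end;
run;

/* earliest and latest SAS date values become the array bounds */
proc sql noprint;
  select min(Date_begin), max(Date_end)
    into :first_date, :last_date
    from patient_medication;
quit;

data max_drug;
  set patient_medication;
  by patient_ID Date_begin Date_end;
  array drug_day[&first_date : &last_date] _temporary_;

  /* start each patient with an empty calendar */
  if first.patient_ID then call missing(of drug_day[*]);

  /* add this medication's dose to every day it was taken */
  do dt = Date_begin to Date_end;
    drug_day[dt] + Dose;
  end;   /* the end statement that was missing */

  /* after the patient's last record, keep the highest daily total */
  if last.patient_ID then do;
    max_dose_day = max(of drug_day[*]);
    output;
  end;
  keep patient_ID max_dose_day;
run;
```

With the sample data the date ranges overlap for both patients, so max_dose_day should come out as 4 for patient 1 (3 + 1) and 7 for patient 2 (5 + 2).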
