Hi all, I have a dataset that looks like this:
I need to get the information for each BUID into a single row. I know how to do this with arrays and do loops, but only when there's a fixed number of iterations, and I know what that number is. In this case, some BUIDs have more purchases than others for a given time period, and so the number of observations (and times the loop has the iterate) is variable. Currently my code looks like this:
data rfm162.fy13_p4_array2; retain booked_date1-booked_date100 dmd1-dmd100 mbtwn1-mbtwn100 ;
rename count=order_count_fy13_p4 RNCY=rncy_today; label RNCY='rncy_today';
set rfm162.fy13_p4_array; by curr_wg_buid booked_date;
format booked_date1-booked_date100 datetime20.;
array bd(100) booked_date1-booked_date100; array money(100) dmd1-dmd100; array mbtwn(100) mbtwn1-mbtwn100;
do i=1 to 100;
if i>count then do; bd(i)=.; money(i)=.; mbtwn(i)=.; continue; end; if i=count then do; bd(i)=booked_date; money(i)=dmd; end; if i>=2 then do; m=i-1; mbtwn(i)=((bd(i)-bd(m))/2592000); end; end;
dmd_average_fy13_p4=mean(of dmd1-dmd100); dmd_total_fy13_p4=sum(of dmd1-dmd100);
mbtwn_average_fy13_p4=mean(of mbtwn1-mbtwn100);
drop dmd1-dmd100 mbtwn1-mbtwn100 dmd booked_date booked_date1-booked_date100 i m; if last.curr_wg_buid;
run;
This code works, but it's messy in the sense that I might be cutting off a couple of transactions for some customers who made more than 100 transactions. Additionally, when I'm looking at a shorter time period, I've got to either do freq to figure out what the new max number of observations is, or make a relatively arbitrary guess. How can I get the do loop to figure out how many iterations it should loop for me in order to match the maximum number of observations for each BUID? I've got a feeling it has something to do with the *, based on my research, but I haven't been able to get that to work. I get weird "variable outside of array" type messages. Thanks,
... View more