Array Do Loop Returning Incorrect Data

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

Array Do Loop Returning Incorrect Data

data rfm162.fy16_array2;
retain
booked_date1-booked_date32
dmd1-dmd32
mbtwn1-mbtwn32
;

 

set rfm162.fy16_array;
by curr_wg_buid booked_date;

 

format booked_date1-booked_date32 datetime20.;

 

array bd(32) booked_date1-booked_date32;
array money(32) dmd1-dmd32;
array mbtwn(32) mbtwn1-mbtwn32;

 

do i=1 to 32;
    if i>count then continue; else
    if i=count then do;
        bd(i)=booked_date;
        money(i)=dmd;
    end;
    if i>=2 then do;
        m=i-1;
        mbtwn(i)=((bd(i)-bd(m))/2592000);
    end;
end;

 

dmd_total=sum(of dmd1-dmd32);

 

drop dmd1-dmd32;
if last.curr_wg_buid;

 

run;


Hey all,

I'm still relatively new to SAS. I have been generally successful at troubleshooting my code, but this has me stumped. I've got "long" dataset sorted by business unit identifier (curr_wg_buid), and booked date (date of purchase). What I want is a "wide" dataset, that takes all of the booked dates and puts them on one row for each business unit identifier. I also want to calculate the number of months between the dates (mbtwn) and the total value of the revenue generated in the orders (dmd_total). 

This code is getting awful close, but is simply returning the wrong info. I've got a count variable (count) on the original data file (fy16_array) that counts by 1 the number of times a BUID appears along with booked_date and dmd (revenue) variables for that transaction. When looking at the output file, there are many BUIDs with count values of 1, with several reported purchase dates, so I know that something is wrong.

Any hints? Thank you,


Accepted Solutions
Solution
‎11-10-2015 09:30 AM
Super User
Super User
Posts: 6,502

Re: Array Do Loop Returning Incorrect Data

[ Edited ]

So the basic problem with your posted code is that you are retaining the values in the array, but not resetting them when starting a new group.  Placing the DO loop around the SET statement eliminates the need for the RETAIN statement and will eliminate this problem.  You data already has a COUNT variable that could be used as the array index.  Since you don't want to keep the variables used in the MONEY() array then perhaps you can make it a temporary array?  However TEMPORARY arrays are retained so you will need to reset those variables.

 

data rfm162.fy16_array2;
  do until (last.curr_wg_buid) ;
    set rfm162.fy16_array;
    by curr_wg_buid booked_date;
    array bd(32) booked_date1-booked_date32;
    array money(32) _temporary_ ;
    array mbtwn(32) mbtwn1-mbtwn32;
    format booked_date1-booked_date32 datetime20.;
 
    bd(count)=booked_date;
    money(count)=dmd;
    if count > 1 then 
        mbtwn(count)=((bd(count)-bd(count-1))/2592000)
    ;
  end;
  dmd_total=sum(of money(*));
  call missing(of money(*));
run;

   

View solution in original post


All Replies
Super User
Posts: 17,963

Re: Array Do Loop Returning Incorrect Data

1. you need to reset the arrays to empty at the end of each curr_wg_buid

There's a macro written here that may help you as well:

http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset
Super User
Posts: 10,552

Re: Array Do Loop Returning Incorrect Data

With something like this it would really help to have some example data. It need not have as many rows to transpose but should show the same behavior. Also some result should be provided.


Matthew wrote:

This code is getting awful close, but is simply returning the wrong info. I've got a count variable (count) on the original data file (fy16_array) that counts by 1 the number of times a BUID appears along with booked_date and dmd (revenue) variables for that transaction. When looking at the output file, there are many BUIDs with count values of 1, with several reported purchase dates, so I know that something is wrong.

Any hints? Thank you,


I can think of a couple of different ways to interpret "by 1 the number of times a BUID appears along with booked_date and dmd (revenue) variables for that transaction", so you might want to post how that count was done. How you used either Booked_date or BUID in that counting are likley places to look.

Super User
Posts: 17,963

Re: Array Do Loop Returning Incorrect Data

Some other thoughts:

Why go to 32 if you have a variable count that you could use as the loop terminator? You can also write the loop as: do i=1 to 32 while (i<count);

Check your nested if conditions, they may not be executing the way you expect them to.
Super User
Posts: 5,099

Re: Array Do Loop Returning Incorrect Data

 

Guessing that this is what you are hoping to achieve:

 

data want;

i=0;

dmd_total=0;

array bd (32) booked_date1 - booked_date32;

array mbtwn (32);

do until (last.curr_wg_buid);

   set rfm162.fy16_array;

   by curr_wg_buid;

   i + 1;

   dmd_total + dmd;

   bd(i) = booked_date;

   if i >= 2 then mbtwn(i) = (bd(i)-bd(i-1))/2592000;

end;

run;

 

This will run into trouble (fixable trouble) if you ever have more than 32 observations for the same CURR_WG_BUID.

Occasional Contributor
Posts: 6

Re: Array Do Loop Returning Incorrect Data

Wow, quick responses!

My rationale behind not terminating at "while i<=count" is because I'd done a proc freq on count values before this step just to see what the max number of transactions by a single customer was. I will try some of these solutions when I'm back in the office in the morning.

Respected Advisor
Posts: 3,908

Re: Array Do Loop Returning Incorrect Data

You need to give us some sample data. It's not that hard to solve your problem but the solution will depend on how your source data looks like - and it's not conclusive based on the code you've posted.

Occasional Contributor
Posts: 6

Re: Array Do Loop Returning Incorrect Data

I've attached my source dataset here. As you can see, every time a BUID makes a purchase, I have a row for that transaction, sorted by BUID and booked_date.


Loop Sample.JPG
Solution
‎11-10-2015 09:30 AM
Super User
Super User
Posts: 6,502

Re: Array Do Loop Returning Incorrect Data

[ Edited ]

So the basic problem with your posted code is that you are retaining the values in the array, but not resetting them when starting a new group.  Placing the DO loop around the SET statement eliminates the need for the RETAIN statement and will eliminate this problem.  You data already has a COUNT variable that could be used as the array index.  Since you don't want to keep the variables used in the MONEY() array then perhaps you can make it a temporary array?  However TEMPORARY arrays are retained so you will need to reset those variables.

 

data rfm162.fy16_array2;
  do until (last.curr_wg_buid) ;
    set rfm162.fy16_array;
    by curr_wg_buid booked_date;
    array bd(32) booked_date1-booked_date32;
    array money(32) _temporary_ ;
    array mbtwn(32) mbtwn1-mbtwn32;
    format booked_date1-booked_date32 datetime20.;
 
    bd(count)=booked_date;
    money(count)=dmd;
    if count > 1 then 
        mbtwn(count)=((bd(count)-bd(count-1))/2592000)
    ;
  end;
  dmd_total=sum(of money(*));
  call missing(of money(*));
run;

   

Occasional Contributor
Posts: 6

Re: Array Do Loop Returning Incorrect Data

Tom,

With your advice I've been able to make my code work correctly by clearing the arrays, but running the code you provided does not yield the same result. At this point I'm just curious as to why it doesn't, because it seems that it should. My code, essentially the same as before, but with commands to clear the arrays if i>count:

data rfm162.fy16_array2;


retain
booked_date1-booked_date32
dmd1-dmd32
mbtwn1-mbtwn32
;

 

set rfm162.fy16_array;
by curr_wg_buid booked_date;

 

format booked_date1-booked_date32 datetime20.;

 

array bd(32) booked_date1-booked_date32;
array money(32) dmd1-dmd32;
array mbtwn(32) mbtwn1-mbtwn32;

 

do i=1 to 32;
    if i>count then do;
        bd(i)=.;
        money(i)=.;
        mbtwn(i)=.;
        continue;
    end;
    if i=count then do;
        bd(i)=booked_date;
        money(i)=dmd;
    end;
    if i>=2 then do;
        m=i-1;
        mbtwn(i)=((bd(i)-bd(m))/2592000);
    end;
end;

 

dmd_total=sum(of dmd1-dmd32);

drop dmd1-dmd32;


if last.curr_wg_buid;

 

run;

--------------------

Comparing this to your code, it seems to me that the array variables must be retained from one iteration to the next. Your code returns for me only values for the first booked_date of each BUID, even though the count values for these rows suggest that it has dropped all but the last.curr_wg_buid, as it should. This only half makes sense to me. If it was going to only return one of the booked_dates, I would expect it to be that of the last booked_date for that BUID on record, which I would also expect to be placed in the booked_date(n) column associated with that count value. It is currently only populating booked_date1.

Super User
Posts: 17,963

Re: Array Do Loop Returning Incorrect Data

Like I mentioned above at the first occurence of curr_wg_buid you need to reset the array to . for all values.

if first.curr_wg_build then do i=1 to 32;
call missing(bd(i), money(i), mbtwn(i));
end;
Super User
Super User
Posts: 6,502

Re: Array Do Loop Returning Incorrect Data

[ Edited ]

Why do you have the DO loop at all if you are using FIRST./LAST. processing?  If you just want to find the total of DMD you do not need an array.

 

data rfm162.fy16_array2;
  retain 
    booked_date1-booked_date32
    mbtwn1-mbtwn32
  ;
 
  set rfm162.fy16_array;
  by curr_wg_buid booked_date;
 
  format booked_date1-booked_date32 datetime20.;
 
  array bd booked_date1-booked_date32;
  array mbtwn mbtwn1-mbtwn32;
 
  if first.curr_wg_buid then call missing(of bd(*)  mbtwn(*));
  if first.curr_wg_buid then i=0;
  i+1;
  if i ne count then put 'NOTE: I and COUNT are not equal.' (_n_ i count) (=);
  bd(i)=booked_date;
  if i>=2 then do;
        mbtwn(i)=((bd(i)-bd(i-1))/2592000);
  end;
  retain dmd_total ;
  if first.curr_wg_buid then dmd_total=0;
  dmd_total=sum(dmd_total,dmd);
  if last.curr_wg_buid then output ;
run;

   

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 583 views
  • 3 likes
  • 6 in conversation