Desktop productivity for business analysts and programmers

Array looping through dates in YYMMN6. format

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 14
Accepted Solution

Array looping through dates in YYMMN6. format

Hi,

 

Would anyone know how to loop through dates in an array with formats such as 201501 for January 2015.

I am trying to loop through 6 years and get a total balance for all accounts for each year, please see the following code.

%let from=200701; %let to=201508;

data newdata (keep= _Smiley Happy;

set olddata end=Lastobs;

array month{*} balance_&from.-balance_&to.;

array out{*} _&from.-_&to.;

do i=1 to 104;

out{i}+sum(of month{i});

end;

if Lastobs;

run;

 


Accepted Solutions
Solution
‎11-26-2015 05:16 AM
Trusted Advisor
Posts: 1,114

Re: Array looping through dates in YYMMN6. format

[ Edited ]

Hi @PriyaL,

 

The statement var &names; in my solution restricts the summations to variables with the prefix 'account_balance' and a year/month suffix within the specified date range. If there was no need to have the restriction to the date range (because all 'account_balance' variables in the input dataset are within that range), you could write var account_balance:; instead.

View solution in original post


All Replies
Esteemed Advisor
Esteemed Advisor
Posts: 7,211

Re: Array looping through dates in YYMMN6. format

[ Edited ]

Simple answer here is don't (i.e. don't put data in column headers).  The YYMM is data, you don;t want to be using that as variable names.  You will just set yourself up for a lot of coding to handle this.  Show an example (in the form of a datastep) some test data, and what you want out.  I would personally suggest you use a normalised data structure, i.e. long rather than wide, and if you need to transpose later on, e.g.

... MONTH  BALANCE

... 2015-01  2100

... 2015-02  2300

... 2015-03  4300

...

 

Using the above you can then simple use aggregates on the data to sum things up.  Obviously without seeing the data its hard to say.

 

 

Here is an example of working with normalised data:

data have;
 informat timepoint yymmn6.;
 format timepoint yymmn6.;
 input timepoint balance;
datalines;
201501 2300
201504 4600
201505 5100
201506 100
201601 9300
201602 800
;
run;

proc sql;
  create table WANT as
  select  distinct
          year(TIMEPOINT) as YEAR,
          sum(BALANCE) as SUM_BALANCE
  from    HAVE
  group by year(TIMEPOINT);
quit;
Occasional Contributor
Posts: 14

Re: Array looping through dates in YYMMN6. format

The colomn headers simply depict the data that is in the month, so for coloumn "balance_201305" that will be a coloumn of all customer balances in May 2013.

So I guess you are suggesting I change the layout of the inital data and sum through the row and create a Sum Total field at the end.

Can this be achieved with the Proc Transpose function in SAS?

Thanks

Esteemed Advisor
Esteemed Advisor
Posts: 7,211

Re: Array looping through dates in YYMMN6. format

The key here is the step before this.  I.e. you don't want to end up with a dataste with data in column headings.  What does your data look like before summing up, post a test example as a datastep with some made up bits if needed.

The reason being is that Balance just references a column of balance data, the 201305 is data itself that you need to use.  If you look at transposed dataset, you will see they generally have the form of <prefix><incrementor>, so COL1, where in this example COL is the prefix and 1 is in the incrementor and could be 2,3 etc.  Then when dealing with columns you can select the whole range by using COL:, or COL1-COLx.  You cannot do this directly with data in thee top as it may not be sequential or logical.

Occasional Contributor
Posts: 14

Re: Array looping through dates in YYMMN6. format

Here is an idea of the data

Account_Numberaccount_balance201404account_balance201405account_balance201406account_balance201407account_balance201408account_balance201409
13489757870077.7169721.0769363.7769005.8268647.2168287.94
198994889200682.6200682.5200682.4200682.3200682.19200682.09
388499933126449.9125801.3125151.6124500.6123848.46123195.11
384902999106735.5106735.5106735.5106735.5106735.49106735.49
23849939935464.2735465.4535466.6435467.8335469.0235470.21
.......
.......
.......

 

There are several rows, and i am just looking to make 1 table with 2 rows so the first row is the Overall sums per month and the second row is the final sum.

So for example:

Jan-07Feb-07Mar-07Apr-07May-07Jun-07Jul-07
3894885773894970723.9E+083.9E+083.9E+08389531052389539547
Respected Advisor
Posts: 3,832

Re: Array looping through dates in YYMMN6. format

You're already summing all accounts per month so I'm not sure how your 2nd row should look like (overall sum). Should this be the sum per year (=additional columns not additional row).

 

If you transpose your data from a wide to a long format then there are procedures like Proc Tabulate and Proc Report which provide a wide variety of options to give you the report you want.

data olddata;
input Account_Number:$16. account_balance200612 account_balance200701 account_balance201406 account_balance201407 
  account_balance201508 account_balance201509;
cards;
134897578 70077.71 69721.07 69363.77 69005.82 68647.21 68287.94
198994889 200682.6 200682.5 200682.4 200682.3 200682.19 200682.09
388499933 126449.9 125801.3 125151.6 124500.6 123848.46 123195.11
384902999 106735.5 106735.5 106735.5 106735.5 106735.49 106735.49
238499399 35464.27 35465.45 35466.64 35467.83 35469.02 35470.21
;
run;

data long(drop= _: account_balance:);
  set olddata;
  format balance_dt date9. acc_amt best16.;
  array acc_bal {*} account_balance:;
  do _i=1 to dim(acc_bal);
    acc_amt=acc_bal[_i];
    balance_dt=input(compress(vname(acc_bal[_i]),,'kd'),yymmn6.);
    format balance_dt date9.;
    output;
  end;
run;
Grand Advisor
Posts: 17,338

Re: Array looping through dates in YYMMN6. format

Yes, Proc Transpose can be used to flip your data. Then you can use BY processing to group your data. You may also want to look into the SCAN function to parse the variable name. This will allow you to create a variable to represent the Year/Month that you can then group by. If you create a SAS date you can then apply formats to the variable that will allow you to summarize at different levels.
Trusted Advisor
Posts: 1,114

Re: Array looping through dates in YYMMN6. format

[ Edited ]

Hi, @PriyaL,

 

I fully agree with @RW9 and @Reeza that you should really transpose your "OLDDATA" into a vertical structure in the future.

 

That said, if you want to summarize the existing OLDDATA, why don't you just apply PROC SUMMARY? If the only problem is the selection of columns within a date range, you could try this:

 

/* Create test data (please note that the first and last account balance are outside your date range) */

data olddata;
input Account_Number account_balance200612 account_balance200701 account_balance201406 account_balance201407 account_balance201508 account_balance201509;
cards;
134897578 70077.71 69721.07 69363.77 69005.82 68647.21 68287.94
198994889 200682.6 200682.5 200682.4 200682.3 200682.19 200682.09
388499933 126449.9 125801.3 125151.6 124500.6 123848.46 123195.11
384902999 106735.5 106735.5 106735.5 106735.5 106735.49 106735.49
238499399 35464.27 35465.45 35466.64 35467.83 35469.02 35470.21
;


/* Specify the date range (in YYYYMM format) */

%let from=200701; %let to=201508;


/* Create the list of variable names "within" the date range (macro var. NAMES) */

proc sql noprint;
select name
into :names separated by ' '
from dictionary.columns
where libname='WORK' & memname='OLDDATA' & upcase(name) eqt 'ACCOUNT_BALANCE'
      & input("&from",yymmn6.) <= input(substr(name,16),yymmn6.) <= input("&to",yymmn6.);
quit;


/* Create a dataset with monthly totals */

proc summary data=olddata;
var &names;
output out=totals(drop=_:) sum=;
quit;

proc print data=totals;
run;

Your original data step code suggests that you would like to have simplified column names in the TOTALS dataset. But I think you will be better off if you don't rename but just label your variables accordingly. This can be achieved as follows:

/* Create label assignments (macro var. LABELS) */

proc sql noprint;
select catx('=',name,quote(put(input(substr(name,16),yymmn6.),monyy5.)))
into :labels separated by ' '
from dictionary.columns
where libname='WORK' & memname='TOTALS' & upcase(name) eqt 'ACCOUNT_BALANCE';
quit;


/* Apply label assignments to dataset TOTALS */

proc datasets lib=work nolist;
modify totals;
label &labels;
quit;

proc print data=totals label;
run;

It's not quite clear to me if this is all you need.

  • In your original post you wrote you were "trying to loop through 6 years." But your "from" and "to" dates span a period of 8 years and 8 months.
  • You wanted to "get a total balance for all accounts for each year." But later on it seemed you were happy with totals for each month. After all, how would you interpret a sum of 12 monthly account balances of the same person?
  • In your third post you wrote you were "just looking to make 1 table with 2 rows so the first row is the Overall sums per month and the second row is the final sum." But the table you presented as an example has only one row (plus column headers) -- like the TOTALS dataset created above. So, what kind of "final sum" did you envisage for the second row?

 

 (Edit: Replaced tabs by spaces in the first data step to avoid issues depending on Enhanced Editor options.)

 

 

Occasional Contributor
Posts: 14

Re: Array looping through dates in YYMMN6. format

HI @FreelanceReinhard

 

I think you are right as a proc summary is the easiest way as my data is already in the columns by month. So i would just need to sum through the columns I want to make a final dataset with 2 rows which contains the sum total of all account balances for all acounts in that month.

 

Is there a way to filter on proc summary just to keep the variables with the prefix 'account_balance' as my datasets contains several other variables because it is a master dataset. 

 

Thanks.

Solution
‎11-26-2015 05:16 AM
Trusted Advisor
Posts: 1,114

Re: Array looping through dates in YYMMN6. format

[ Edited ]

Hi @PriyaL,

 

The statement var &names; in my solution restricts the summations to variables with the prefix 'account_balance' and a year/month suffix within the specified date range. If there was no need to have the restriction to the date range (because all 'account_balance' variables in the input dataset are within that range), you could write var account_balance:; instead.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 739 views
  • 0 likes
  • 5 in conversation