BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PriyaL
Fluorite | Level 6

Hi,

 

Would anyone know how to loop through dates in an array with formats such as 201501 for January 2015.

I am trying to loop through 6 years and get a total balance for all accounts for each year, please see the following code.

%let from=200701; %let to=201508;

data newdata (keep= _:);

set olddata end=Lastobs;

array month{*} balance_&from.-balance_&to.;

array out{*} _&from.-_&to.;

do i=1 to 104;

out{i}+sum(of month{i});

end;

if Lastobs;

run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @PriyaL,

 

The statement var &names; in my solution restricts the summations to variables with the prefix 'account_balance' and a year/month suffix within the specified date range. If there was no need to have the restriction to the date range (because all 'account_balance' variables in the input dataset are within that range), you could write var account_balance:; instead.

View solution in original post

9 REPLIES 9
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Simple answer here is don't (i.e. don't put data in column headers).  The YYMM is data, you don;t want to be using that as variable names.  You will just set yourself up for a lot of coding to handle this.  Show an example (in the form of a datastep) some test data, and what you want out.  I would personally suggest you use a normalised data structure, i.e. long rather than wide, and if you need to transpose later on, e.g.

... MONTH  BALANCE

... 2015-01  2100

... 2015-02  2300

... 2015-03  4300

...

 

Using the above you can then simple use aggregates on the data to sum things up.  Obviously without seeing the data its hard to say.

 

 

Here is an example of working with normalised data:

data have;
 informat timepoint yymmn6.;
 format timepoint yymmn6.;
 input timepoint balance;
datalines;
201501 2300
201504 4600
201505 5100
201506 100
201601 9300
201602 800
;
run;

proc sql;
  create table WANT as
  select  distinct
          year(TIMEPOINT) as YEAR,
          sum(BALANCE) as SUM_BALANCE
  from    HAVE
  group by year(TIMEPOINT);
quit;
PriyaL
Fluorite | Level 6

The colomn headers simply depict the data that is in the month, so for coloumn "balance_201305" that will be a coloumn of all customer balances in May 2013.

So I guess you are suggesting I change the layout of the inital data and sum through the row and create a Sum Total field at the end.

Can this be achieved with the Proc Transpose function in SAS?

Thanks

RW9
Diamond | Level 26 RW9
Diamond | Level 26

The key here is the step before this.  I.e. you don't want to end up with a dataste with data in column headings.  What does your data look like before summing up, post a test example as a datastep with some made up bits if needed.

The reason being is that Balance just references a column of balance data, the 201305 is data itself that you need to use.  If you look at transposed dataset, you will see they generally have the form of <prefix><incrementor>, so COL1, where in this example COL is the prefix and 1 is in the incrementor and could be 2,3 etc.  Then when dealing with columns you can select the whole range by using COL:, or COL1-COLx.  You cannot do this directly with data in thee top as it may not be sequential or logical.

PriyaL
Fluorite | Level 6

Here is an idea of the data

Account_Numberaccount_balance201404account_balance201405account_balance201406account_balance201407account_balance201408account_balance201409
13489757870077.7169721.0769363.7769005.8268647.2168287.94
198994889200682.6200682.5200682.4200682.3200682.19200682.09
388499933126449.9125801.3125151.6124500.6123848.46123195.11
384902999106735.5106735.5106735.5106735.5106735.49106735.49
23849939935464.2735465.4535466.6435467.8335469.0235470.21
.......
.......
.......

 

There are several rows, and i am just looking to make 1 table with 2 rows so the first row is the Overall sums per month and the second row is the final sum.

So for example:

Jan-07Feb-07Mar-07Apr-07May-07Jun-07Jul-07
3894885773894970723.9E+083.9E+083.9E+08389531052389539547
Patrick
Opal | Level 21

You're already summing all accounts per month so I'm not sure how your 2nd row should look like (overall sum). Should this be the sum per year (=additional columns not additional row).

 

If you transpose your data from a wide to a long format then there are procedures like Proc Tabulate and Proc Report which provide a wide variety of options to give you the report you want.

data olddata;
input Account_Number:$16. account_balance200612 account_balance200701 account_balance201406 account_balance201407 
  account_balance201508 account_balance201509;
cards;
134897578 70077.71 69721.07 69363.77 69005.82 68647.21 68287.94
198994889 200682.6 200682.5 200682.4 200682.3 200682.19 200682.09
388499933 126449.9 125801.3 125151.6 124500.6 123848.46 123195.11
384902999 106735.5 106735.5 106735.5 106735.5 106735.49 106735.49
238499399 35464.27 35465.45 35466.64 35467.83 35469.02 35470.21
;
run;

data long(drop= _: account_balance:);
  set olddata;
  format balance_dt date9. acc_amt best16.;
  array acc_bal {*} account_balance:;
  do _i=1 to dim(acc_bal);
    acc_amt=acc_bal[_i];
    balance_dt=input(compress(vname(acc_bal[_i]),,'kd'),yymmn6.);
    format balance_dt date9.;
    output;
  end;
run;
Reeza
Super User
Yes, Proc Transpose can be used to flip your data. Then you can use BY processing to group your data. You may also want to look into the SCAN function to parse the variable name. This will allow you to create a variable to represent the Year/Month that you can then group by. If you create a SAS date you can then apply formats to the variable that will allow you to summarize at different levels.
FreelanceReinh
Jade | Level 19

Hi, @PriyaL,

 

I fully agree with @RW9 and @Reeza that you should really transpose your "OLDDATA" into a vertical structure in the future.

 

That said, if you want to summarize the existing OLDDATA, why don't you just apply PROC SUMMARY? If the only problem is the selection of columns within a date range, you could try this:

 

/* Create test data (please note that the first and last account balance are outside your date range) */

data olddata;
input Account_Number account_balance200612 account_balance200701 account_balance201406 account_balance201407 account_balance201508 account_balance201509;
cards;
134897578 70077.71 69721.07 69363.77 69005.82 68647.21 68287.94
198994889 200682.6 200682.5 200682.4 200682.3 200682.19 200682.09
388499933 126449.9 125801.3 125151.6 124500.6 123848.46 123195.11
384902999 106735.5 106735.5 106735.5 106735.5 106735.49 106735.49
238499399 35464.27 35465.45 35466.64 35467.83 35469.02 35470.21
;


/* Specify the date range (in YYYYMM format) */

%let from=200701; %let to=201508;


/* Create the list of variable names "within" the date range (macro var. NAMES) */

proc sql noprint;
select name
into :names separated by ' '
from dictionary.columns
where libname='WORK' & memname='OLDDATA' & upcase(name) eqt 'ACCOUNT_BALANCE'
      & input("&from",yymmn6.) <= input(substr(name,16),yymmn6.) <= input("&to",yymmn6.);
quit;


/* Create a dataset with monthly totals */

proc summary data=olddata;
var &names;
output out=totals(drop=_:) sum=;
quit;

proc print data=totals;
run;

Your original data step code suggests that you would like to have simplified column names in the TOTALS dataset. But I think you will be better off if you don't rename but just label your variables accordingly. This can be achieved as follows:

/* Create label assignments (macro var. LABELS) */

proc sql noprint;
select catx('=',name,quote(put(input(substr(name,16),yymmn6.),monyy5.)))
into :labels separated by ' '
from dictionary.columns
where libname='WORK' & memname='TOTALS' & upcase(name) eqt 'ACCOUNT_BALANCE';
quit;


/* Apply label assignments to dataset TOTALS */

proc datasets lib=work nolist;
modify totals;
label &labels;
quit;

proc print data=totals label;
run;

It's not quite clear to me if this is all you need.

  • In your original post you wrote you were "trying to loop through 6 years." But your "from" and "to" dates span a period of 8 years and 8 months.
  • You wanted to "get a total balance for all accounts for each year." But later on it seemed you were happy with totals for each month. After all, how would you interpret a sum of 12 monthly account balances of the same person?
  • In your third post you wrote you were "just looking to make 1 table with 2 rows so the first row is the Overall sums per month and the second row is the final sum." But the table you presented as an example has only one row (plus column headers) -- like the TOTALS dataset created above. So, what kind of "final sum" did you envisage for the second row?

 

 (Edit: Replaced tabs by spaces in the first data step to avoid issues depending on Enhanced Editor options.)

 

 

PriyaL
Fluorite | Level 6

HI @FreelanceReinh

 

I think you are right as a proc summary is the easiest way as my data is already in the columns by month. So i would just need to sum through the columns I want to make a final dataset with 2 rows which contains the sum total of all account balances for all acounts in that month.

 

Is there a way to filter on proc summary just to keep the variables with the prefix 'account_balance' as my datasets contains several other variables because it is a master dataset. 

 

Thanks.

FreelanceReinh
Jade | Level 19

Hi @PriyaL,

 

The statement var &names; in my solution restricts the summations to variables with the prefix 'account_balance' and a year/month suffix within the specified date range. If there was no need to have the restriction to the date range (because all 'account_balance' variables in the input dataset are within that range), you could write var account_balance:; instead.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 4343 views
  • 0 likes
  • 5 in conversation