Hi All --
This may be an unusual task, and so I will try to explain the best I can please, please let me know if I don't explain the question well enough: I have a set of aggregated data, that i'd like to merge in such a way that it is duplicated for appropriate by-groups. I think the best way to illustrate what I want is with an example:
This is data set of aggregate data (using dummy data here), essentially it is simple average balances across a number of months. There is a field indicating the number of months, the account, the average balance across that number of months:
account | average_bal | num_months |
1 | 50$ | 1 |
1 | 55$ | 2 |
1 | 54$ | 3 |
2 | 130$ | 1 |
2 | 135$ | 2 |
2 | 133$ | 3 |
So account is the account number, when num_months = 1, the average is really just the balance of the last month, when num_months = 2 its the average balance across the past two months, when num_months=3 then it's the average across last 3 months, etc. etc..
What I want is to have monthly data added to this, so that thee average across all months designated in "num_months" stays the same for each "num_months" group, but the monthly data is present, so this is what I'd like to have:
account | average_bal | num_months | month | monthly_bal |
1 | 50$ | 1 | august | 50 |
1 | 55$ | 2 | august | 50 |
1 | 55$ | 2 | july | 48 |
1 | 54$ | 3 | august | 50 |
1 | 54$ | 3 | july | 48 |
1 | 54$ | 3 | june | 55 |
2 | 130$ | 1 | august | 130 |
2 | 135$ | 2 | august | 130 |
2 | 135$ | 2 | july | 140 |
2 | 133$ | 3 | august | 130 |
2 | 133$ | 3 | july | 140 |
2 | 133$ | 3 | june | 137 |
Now I realize that the data doesn't actually match mathematically (the averages don't match the actual calculated average of the 'monthly_bal' values as they should in real life, as I just made up these numbers to illustrate the data structure I'm looking for). So assuming I have a dataset of monthly balances for each month and each account, i'd like to be able to merge and get the above table. A simple cartesian product doesn't get me exactly what I want, but close. I've tried many joining and merging methods with Proc SQL and Datasteps, but can't quite get it. Has anyone else had this problem, or know what I'm missing here? Thanks a lot!
Are you just trying to do something like the following?
data have;
input account average_bal num_months;
cards;
1 50 1
1 55 2
1 54 3
2 130 1
2 135 2
2 133 3
;
data want (drop=_:);
array _month(3);
retain _month:;
set have;
by account;
if first.account then do;
call missing(of _month(*));
_month(1)=average_bal;
monthly_bal=_month(1);
month=1;
output;
end;
else do;
_month(Num_months)=average_bal*num_months-sum(of _month(*));
do month=1 to num_months;
monthly_bal=_month(month);
output;
end;
end;
run;
Hmm, thank you arthur, however the code does not seem to execute:
29 retain _month();
_
22
ERROR 22-322: Syntax error, expecting one of the following: a quoted string, a numeric constant, a datetime constant,
a missing value.
Try it with the following:
data have;
input account average_bal num_months;
cards;
1 50 1
1 55 2
1 54 3
2 130 1
2 135 2
2 133 3
;
data want (drop=_:);
array _month(3);
retain _month:;
set have;
by account;
if first.account then do;
call missing(of _month(*));
_month(1)=average_bal;
monthly_bal=_month(1);
month=1;
output;
end;
else do;
_month(Num_months)=average_bal*num_months-sum(of _month(*));
do month=1 to num_months;
monthly_bal=_month(month);
output;
end;
end;
run;
Thank you for all of your replies and help Arthur. The code executed, however I'm not exactly sure how I could use that....I'm not trying to calculate averages or totals or anything at all, I was just using average balances as example data. It was really just a question of how to merge two datasets structured in that way, not a question of how to calculate balances. I appreciate you taking the time to try and help however, I no longer need any help on this, it proved to be too difficult to do within SAS and so I manually assembled the data in Excel as I needed..not a very elegant solution but it worked.
Thank you all.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.