Hi,
I have the following dataset. Please note that Paymment is Commulative to the previouse payment. I need to compute the ANNUAL and Monthly payments for the entire dataset. I am using Enterprise Guide, but can use a code to save in a program:
Name Month Total Payment
Al Dec 10 (the data is backdated to the previous year to facilitate the calculations)
Al Jan 50 (which means that the payment for this month is 40 only, and so on)
Al March 75
Al June 110
Tara Dec 0
Tara Jan 20
Tara Feb 35
Tara Dec 100
For the above example, I need to create 2 new columns that shows the payments were as follows:
Annual:
Name Amount
Al 100
Tara 100
Monthly:
it shows the monthly increment from the previous month.
Thank you for looking into this.
Altijani
Like this?
data HAVE;
input NAME $ MONTH $ TOTAL_PAYMENT;
cards;
Al Dec 10 (the data is backdated to the previous year to facilitate the calculations)
Al Jan 50 (which means that the payment for this month is 40 only, and so on)
Al March 75
Al June 110
Tara Dec 0
Tara Jan 20
Tara Feb 35
Tara Dec 100
run;
data WANT;
set HAVE;
by NAME;
MONTHLY=TOTAL_PAYMENT-lag(TOTAL_PAYMENT) * ^first.NAME;
YEARLY +MONTHLY;
if first.NAME then YEARLY=0;
else output;
run;
NAME | MONTH | TOTAL_PAYMENT | MONTHLY | YEARLY |
---|---|---|---|---|
Al | Jan | 50 | 40 | 40 |
Al | March | 75 | 25 | 65 |
Al | June | 110 | 35 | 100 |
Tara | Jan | 20 | 20 | 20 |
Tara | Feb | 35 | 15 | 35 |
Tara | Dec | 100 | 65 | 100 |
The concept of "backdated to the previous year" is a little unclear, so here's a reasonable guess.
For the monthly data:
data want;
set have;
by name;
increment = dif(total_payment);
if first.name then increment=0;
run;
Then to get the annual data:
proc summary data=want;
by name;
var total_payment;
output out=annual (keep=name amount) sum=amount;
run;
proc print data=annual;
run;
If this is all of the data in the dataset, you can find the difference for the increments by using a BY statement. If there is more data than this, you will need to use a Proc Sort to sort alphabetically by the Name variable before using a BY statement.
Alternatively you can do something like this and get the same results*:
Data Diff;
Set Have;
Difference = Dif(Total_Payment);
If Difference < 0 Then
Difference = 0;
Run;
*This requires the names to be grouped together with the Total Amounts increasing but does not require the Names to be sorted alphabetically.
For the annual data you can use a Proc SQL or a Proc Summary. I'm partial to Proc SQL so here is what I have and this should work assuming all payments are positive.
Proc SQL;
Create table Annual as
Select H.Name, H.Total_Payment as Total_Amount
From Have H
Where H.Total_Payment = (Select Max(Total_Payment) From Have J Where H.Name = J.Name);
Quit;
Hi,
Thanks for the responses.
The data has thousands of records, with several in each Name group, but I just wanted to give you the needed details.
Hi,
I have your SQL code running for a day now. Obviously something is not right. Any suggestions? My data has 1.4M obs.
Thanks again,
Altijani
running a subquery for each group value is very expensive
Hello.
Without seeing the rest of your data and how it is organized, it is difficult to know what exactly is going on.
The SQL works for calculating the Total_Payment for each of the people you have listed based on the way they are sorted.
Have you tried using the alternative method that Chris posted?
That method also works on the data you have given and as his output shows, lists both the monthly increments and total payment received to date in columns in the same table.
Like this?
data HAVE;
input NAME $ MONTH $ TOTAL_PAYMENT;
cards;
Al Dec 10 (the data is backdated to the previous year to facilitate the calculations)
Al Jan 50 (which means that the payment for this month is 40 only, and so on)
Al March 75
Al June 110
Tara Dec 0
Tara Jan 20
Tara Feb 35
Tara Dec 100
run;
data WANT;
set HAVE;
by NAME;
MONTHLY=TOTAL_PAYMENT-lag(TOTAL_PAYMENT) * ^first.NAME;
YEARLY +MONTHLY;
if first.NAME then YEARLY=0;
else output;
run;
NAME | MONTH | TOTAL_PAYMENT | MONTHLY | YEARLY |
---|---|---|---|---|
Al | Jan | 50 | 40 | 40 |
Al | March | 75 | 25 | 65 |
Al | June | 110 | 35 | 100 |
Tara | Jan | 20 | 20 | 20 |
Tara | Feb | 35 | 15 | 35 |
Tara | Dec | 100 | 65 | 100 |
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.