Hi,
I am trying to calculate cumulative total using proc sql. I know how to do in a data step but i am trying to understand sas better and wanted to try using proc sql
data is this. I want a cumulative column for the monthly_pmt variable by ID. I am just testing this code for one ID right now
| ID | ENT_SEQ_NO | ACTV_DATE | Monthly_pmt | month | count |
| 759966427 | 988522250 | 07JUL2019:00:00:00 | 150 | 7 | 1 |
| 759966427 | 988522251 | 02AUG2019:00:00:00 | 150 | 8 | 2 |
| 759966427 | 988522252 | 06SEP2019:00:00:00 | 150 | 9 | 3 |
| 759966427 | 988522253 | 05OCT2019:00:00:00 | 100 | 10 | 4 |
| 759966427 | 988522254 | 08NOV2019:00:00:00 | 400 | 11 | 5 |
| 759966427 | 988522256 | 21DEC2019:00:00:00 | 200 | 12 | 6 |
I wrote this but i am getting duplication
PROC SQL;
CREATE TABLE CUMULATIVE AS
SELECT a.*, SUM(b.MONTHLY_PMT) AS CUML
FROM TEST3 A INNER JOIN TEST3 B
ON A.ID= B.ID and b.count <= a.count
GROUP BY A.ID,a.ent_seq_no,a.actv_date
ORDER BY A.ID,a.ent_seq_no,a.actv_date
;
QUIT;
result is:
| ID | ENT_SEQ_NO | ACTV_DATE | Monthly_pmt | month | count | CUML |
| 759966427 | 988522250 | 07JUL2019:00:00:00 | 150 | 7 | 1 | 150 |
| 759966427 | 988522251 | 02AUG2019:00:00:00 | 150 | 8 | 2 | 300 |
| 759966427 | 988522251 | 02AUG2019:00:00:00 | 150 | 8 | 2 | 300 |
| 759966427 | 988522252 | 06SEP2019:00:00:00 | 150 | 9 | 3 | 450 |
| 759966427 | 988522252 | 06SEP2019:00:00:00 | 150 | 9 | 3 | 450 |
| 759966427 | 988522252 | 06SEP2019:00:00:00 | 150 | 9 | 3 | 450 |
| 759966427 | 988522253 | 05OCT2019:00:00:00 | 100 | 10 | 4 | 550 |
| 759966427 | 988522253 | 05OCT2019:00:00:00 | 100 | 10 | 4 | 550 |
| 759966427 | 988522253 | 05OCT2019:00:00:00 | 100 | 10 | 4 | 550 |
| 759966427 | 988522253 | 05OCT2019:00:00:00 | 100 | 10 | 4 | 550 |
| 759966427 | 988522254 | 08NOV2019:00:00:00 | 400 | 11 | 5 | 950 |
| 759966427 | 988522254 | 08NOV2019:00:00:00 | 400 | 11 | 5 | 950 |
| 759966427 | 988522254 | 08NOV2019:00:00:00 | 400 | 11 | 5 | 950 |
| 759966427 | 988522254 | 08NOV2019:00:00:00 | 400 | 11 | 5 | 950 |
| 759966427 | 988522254 | 08NOV2019:00:00:00 | 400 | 11 | 5 | 950 |
| 759966427 | 988522256 | 21DEC2019:00:00:00 | 200 | 12 | 6 | 1150 |
| 759966427 | 988522256 | 21DEC2019:00:00:00 | 200 | 12 | 6 | 1150 |
| 759966427 | 988522256 | 21DEC2019:00:00:00 | 200 | 12 | 6 | 1150 |
| 759966427 | 988522256 | 21DEC2019:00:00:00 | 200 | 12 | 6 | 1150 |
| 759966427 | 988522256 | 21DEC2019:00:00:00 | 200 | 12 | 6 | 1150 |
| 759966427 | 988522256 | 21DEC2019:00:00:00 | 200 | 12 | 6 | 1150 |
I got around it by modifying to ask for distinct a.* in the above query but is this the only way? why is it doing this?
HI @TheNovice It's simple. Your GROUP BY is not complete.
You could include A.* in group by to not allow PROC SQL to REMERGE
So,
GROUP BY A.ID,a.ent_seq_no,a.actv_date,a.monthly_pmt,a.month,a.count
should solve the problem
Full Version:
PROC SQL;
CREATE TABLE CUMULATIVE AS
SELECT a.*, SUM(b.MONTHLY_PMT) AS CUML
FROM have A INNER JOIN have B
ON A.ID= B.ID and b.count <= a.count
GROUP BY A.ID,a.ent_seq_no,a.actv_date,a.monthly_pmt,a.month,a.count
ORDER BY A.ID,a.ent_seq_no,a.actv_date
;
QUIT;
HI @TheNovice It's simple. Your GROUP BY is not complete.
You could include A.* in group by to not allow PROC SQL to REMERGE
So,
GROUP BY A.ID,a.ent_seq_no,a.actv_date,a.monthly_pmt,a.month,a.count
should solve the problem
Full Version:
PROC SQL;
CREATE TABLE CUMULATIVE AS
SELECT a.*, SUM(b.MONTHLY_PMT) AS CUML
FROM have A INNER JOIN have B
ON A.ID= B.ID and b.count <= a.count
GROUP BY A.ID,a.ent_seq_no,a.actv_date,a.monthly_pmt,a.month,a.count
ORDER BY A.ID,a.ent_seq_no,a.actv_date
;
QUIT;
yes, that makes so much sense.... thank you so much. it worked!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.