I would like some suggestion for my SAS SQL Code. Suppose I have long format table with 4 Columns. There are multiple account, multiple date as follow:
ACCOUNT DATE STAGE AMOUNT
A001 01DEC2020 1 100000
A001 01JAN2021 1 100000
A001 01FEB2021 1 100000
... ... ... ...
What I am trying to do is I created sequential order column, named RANK
by starting from 1 by ACCOUNT
and DATE
. Then, I will create YEAR
column by CEIL(RANK / 12)
. After that I will create OS
Column by condition WHEN CASE
.
The final table that I needed to aggregate columns GROUP BY ACCOUNT, YEAR
. However, the result is not followed as expecting. I got multiple duplicated rows after running the code.
Below is my code from SAS SQL. SAS is not allow to use RANK() OVER(PARTITION)
.
proc sql;
CREATE TABLE workingData (DROP = RANK OS) AS SELECT
ACCOUNT,
STAGE,
(SELECT COUNT(DISTINCT b.DATE)
FROM loanData b
WHERE b.DATE <= a.DATE
AND a.ACCOUNT eq b.ACCOUNT) as RANK,
CEIL(CALCULATED RANK / 12) as YEAR,
CASE
WHEN CALCULATED RANK = 1 THEN a.AMOUNT
ELSE 0
END AS OS,
CALCULATED YEAR,
MAX(OS) AS OUTSTANDING,
MEAN(AMOUNT) AS EAD
FROM loanData a
GROUP BY ACCOUNT, CALCULATED YEAR
HAVING STAGE = 2
OR (STAGE = 1 AND CALCULATED RANK <= 12)
OR (STAGE = 3 AND CALCULATED RANK = 1)
ORDER BY ACCOUNT, YEAR
;
quit;
The result is shown below:
ACCOUNT STAGE YEAR OUTSTANDING EAD
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A001 1 1 100000 100000
A002 2 1 50000 40669.25
A002 2 1 50000 40669.25
A002 2 1 50000 40669.25
... ... ... ... ...
But what I expected is:
ACCOUNT STAGE YEAR OUTSTANDING EAD
A001 1 1 100000 100000
A002 2 1 50000 40669.25
A002 2 2 0 32024
A002 2 3 0 23888
A003 3 1 40000 40000
... ... ... ... ...
Any suggestion on the process that I have just only one chuck of SQL to do it. Thanks!
The "Rank" and "over partition" are non-standard SQL Extension. Practically every dialect of SQL has some.
SAS Proc SQL is basically ANSI SQL with a number of functions that are non-standard.
Sequential numbering would be done in a data step generally.
Since you don't actually show an example with your "Rank" I am not sure exactly how you are actually intending to calculate that. I have to say any variable called Year calculated as you propose is extremely confusing. Especially since the three lines of example data you show includes monthly dates crossing two years.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.