BookmarkSubscribeRSS Feed
LengYi
Calcite | Level 5

I would like some suggestion for my SAS SQL Code. Suppose I have long format table with 4 Columns. There are multiple account, multiple date as follow:

 

ACCOUNT DATE      STAGE AMOUNT
A001    01DEC2020   1   100000
A001    01JAN2021   1   100000
A001    01FEB2021   1   100000
...     ...        ...  ...

What I am trying to do is I created sequential order column, named RANK by starting from 1 by ACCOUNT and DATE. Then, I will create YEAR column by CEIL(RANK / 12). After that I will create OS Column by condition WHEN CASE.

The final table that I needed to aggregate columns GROUP BY ACCOUNT, YEAR. However, the result is not followed as expecting. I got multiple duplicated rows after running the code.

Below is my code from SAS SQL. SAS is not allow to use RANK() OVER(PARTITION).

 

proc sql;
CREATE TABLE workingData (DROP = RANK OS) AS SELECT
    ACCOUNT,
    STAGE,
    (SELECT COUNT(DISTINCT b.DATE)
        FROM loanData b
        WHERE b.DATE <= a.DATE
            AND a.ACCOUNT eq b.ACCOUNT) as RANK,
    CEIL(CALCULATED RANK / 12) as YEAR,
    CASE
        WHEN CALCULATED RANK = 1 THEN a.AMOUNT
        ELSE 0
        END AS OS,
    CALCULATED YEAR,
    MAX(OS) AS OUTSTANDING,
    MEAN(AMOUNT) AS EAD 
    FROM loanData a
    GROUP BY ACCOUNT, CALCULATED YEAR
    HAVING STAGE = 2
        OR (STAGE = 1 AND CALCULATED RANK <= 12)
            OR (STAGE = 3 AND CALCULATED RANK = 1)
    ORDER BY ACCOUNT, YEAR
    ;
quit;

 The result is shown below:

ACCOUNT STAGE YEAR OUTSTANDING EAD
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A002    2   1   50000   40669.25
A002    2   1   50000   40669.25
A002    2   1   50000   40669.25
...     ... ... ...     ...

But what I expected is:

ACCOUNT STAGE YEAR OUTSTANDING EAD
A001    1   1   100000  100000
A002    2   1   50000   40669.25
A002    2   2   0       32024
A002    2   3   0       23888
A003    3   1   40000   40000
...     ... ... ...     ...

Any suggestion on the process that I have just only one chuck of SQL to do it. Thanks!

1 REPLY 1
ballardw
Super User

The "Rank" and "over partition" are non-standard SQL Extension. Practically every dialect of SQL has some.

SAS Proc SQL is basically ANSI SQL with a number of functions that are non-standard.

Sequential numbering would be done in a data step generally.

 

Since you don't actually show an example with your "Rank" I am not sure exactly how you are actually intending to calculate that. I have to say any variable called Year calculated as you propose is extremely confusing. Especially since the three lines of example data you show includes monthly dates crossing two years.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 754 views
  • 0 likes
  • 2 in conversation