BookmarkSubscribeRSS Feed
LengYi
Calcite | Level 5

I would like some suggestion for my SAS SQL Code. Suppose I have long format table with 4 Columns. There are multiple account, multiple date as follow:

 

ACCOUNT DATE      STAGE AMOUNT
A001    01DEC2020   1   100000
A001    01JAN2021   1   100000
A001    01FEB2021   1   100000
...     ...        ...  ...

What I am trying to do is I created sequential order column, named RANK by starting from 1 by ACCOUNT and DATE. Then, I will create YEAR column by CEIL(RANK / 12). After that I will create OS Column by condition WHEN CASE.

The final table that I needed to aggregate columns GROUP BY ACCOUNT, YEAR. However, the result is not followed as expecting. I got multiple duplicated rows after running the code.

Below is my code from SAS SQL. SAS is not allow to use RANK() OVER(PARTITION).

 

proc sql;
CREATE TABLE workingData (DROP = RANK OS) AS SELECT
    ACCOUNT,
    STAGE,
    (SELECT COUNT(DISTINCT b.DATE)
        FROM loanData b
        WHERE b.DATE <= a.DATE
            AND a.ACCOUNT eq b.ACCOUNT) as RANK,
    CEIL(CALCULATED RANK / 12) as YEAR,
    CASE
        WHEN CALCULATED RANK = 1 THEN a.AMOUNT
        ELSE 0
        END AS OS,
    CALCULATED YEAR,
    MAX(OS) AS OUTSTANDING,
    MEAN(AMOUNT) AS EAD 
    FROM loanData a
    GROUP BY ACCOUNT, CALCULATED YEAR
    HAVING STAGE = 2
        OR (STAGE = 1 AND CALCULATED RANK <= 12)
            OR (STAGE = 3 AND CALCULATED RANK = 1)
    ORDER BY ACCOUNT, YEAR
    ;
quit;

 The result is shown below:

ACCOUNT STAGE YEAR OUTSTANDING EAD
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A001    1   1   100000  100000
A002    2   1   50000   40669.25
A002    2   1   50000   40669.25
A002    2   1   50000   40669.25
...     ... ... ...     ...

But what I expected is:

ACCOUNT STAGE YEAR OUTSTANDING EAD
A001    1   1   100000  100000
A002    2   1   50000   40669.25
A002    2   2   0       32024
A002    2   3   0       23888
A003    3   1   40000   40000
...     ... ... ...     ...

Any suggestion on the process that I have just only one chuck of SQL to do it. Thanks!

1 REPLY 1
ballardw
Super User

The "Rank" and "over partition" are non-standard SQL Extension. Practically every dialect of SQL has some.

SAS Proc SQL is basically ANSI SQL with a number of functions that are non-standard.

Sequential numbering would be done in a data step generally.

 

Since you don't actually show an example with your "Rank" I am not sure exactly how you are actually intending to calculate that. I have to say any variable called Year calculated as you propose is extremely confusing. Especially since the three lines of example data you show includes monthly dates crossing two years.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 472 views
  • 0 likes
  • 2 in conversation