Create a Credit Risk Dashboard in SAS Viya: Part 1

This is part of a series on technical use cases, designed to help SAS users solve a particular business or technical challenge.

The challenge: Create a dashboard that monitors credit risk of credit card customers, such as the average credit score over time, key risk indicators, and expected loss. This scenario uses both SAS Studio and SAS Visual Analytics.

Step 1: Import Credit Card Data into CAS

Goal: Load raw credit card customer data into SAS Viya’s in-memory CAS environment for fast processing. We will use SAS Data Explorer for this import.

Open SAS Data Explorer: In SAS Viya, go to Manage Data. This interface lets you import data from various sources.
Select Data Files: On the Import tab, choose the source Local Files. Select the raw data files for credit risk analysis. For example, you might have:

accounts.csv – containing account-level info (Customer_ID, credit_limit, current_balance, current_credit_score, region, etc.).
performance.csv – containing monthly performance records (Customer_ID, Month, credit_score, balance, payment_amount, due_amount, delinquency_status, etc.).

Import to CAS: Choose a target CAS library (e.g. CASUSER for personal session) and table name for each file, or accept the defaults (which usually use the file name as table name in CASUSER). Click Import for each file, or Import All to load both at once. SAS Data Explorer will upload each file, create an in-memory CAS table, promote it, and save it to disk for persistence. You should see a success message and the new tables listed in the selected CASLIB. For example, after importing, you might see tables ACCOUNTS and PERFORMANCE. The lightning bolt icon indicates that the table is in memory in CAS.
Handling Existing Tables: If a table with the same name already exists in the CASLIB (from a previous import), specify a different name or check the Replace file option before importing to overwrite it.
Result: The raw customer account data and performance data are now loaded into CAS. They reside in-memory (e.g. CASUSER.ACCOUNTS, CASUSER.PERFORMANCE), ready for processing.

Step 2: Verify Data in SAS Studio and Start CAS Session

Goal: Confirm that the data imported correctly and prepare SAS Studio for data manipulation by ensuring an active CAS session.

Launch SAS Studio: Open SAS Studio. From SAS Drive or the Applications menu, choose Develop Code and Flows. This opens the web-based coding interface for SAS.
Start CAS Session: In SAS Studio, open a new Program editor. Submit the following code to start a CAS session and assign all CASLIBs:

cas; CASLIB _ALL_ assign;

Check the SAS log for a confirmation message:

NOTE: The session name identified with the SESSREF= SAS option is connected to Cloud Analytic Services.

Locate CAS Tables: In SAS Studio’s left pane, expand the Libraries section. Navigate to the CAS library where data was imported. You should see the newly imported tables, e.g. ACCOUNTS and PERFORMANCE.
View Data Sample: Double-click one of the table names or right-click and select “View Data” to open it. Verify that all expected columns are present and properly named (for example, Customer_ID, Credit_Limit, Current_Balance, Credit_Score, etc.).
Check Values: Scroll through a few rows. Ensure numeric fields (like balances, limits, scores) are numeric and not missing or garbled, and character fields (like Region or Status) look correct. For instance, credit scores should be reasonable three-digit numbers (300–850 range) and balances should be non-negative values.
Row Count: Confirm the number of records matches expectations. SAS Studio’s data viewer shows the total row count at the top; verify it aligns with the source file (e.g. if performance.csv had 12,000 monthly records, the CAS table shows ~12,000 rows). You can also programmatically check this. For example, run:
```
proc contents data=casuser.accounts;
run;
```

This will display metadata (including the number of observations) for ACCOUNTS. Seeing the correct counts confirms a successful import.

At this point, the raw data is loaded and verified in CAS, and SAS Studio is ready for data preparation steps.

Step 3: (Optional) Load Data via Code With PROC IMPORT

If you prefer code or need to automate data loading, SAS provides code-based methods.

To load the performance CSV via code from a network fileshare, you can use a PROC IMPORT step:

proc import
    file='/location/to/file/performance.csv'
    out=casuser.performance
    dbms=csv
    replace;
run;

If you uploaded your data to SAS Content rather than a network fileshare, you must use the FILESRVC access method in a FILENAME statement to load the file:

filename perfcsv filesrvc
    path='/location/to/file'
    filename='performance.csv';

proc import
    file=perfcsv
    out=casuser.performance
    dbms=csv
    replace;
run;

This code loads the CSV from the specified path into the CASUSER CASLIB as a table named PERFORMANCE. You would do a similar step for accounts.csv. Check the SAS log for notes confirming the load. After running, verify the new tables appear in CASUSER and have the expected content (as in Step 2).

This step is not required if you already imported the data via the Data Explorer UI, but it’s useful for automation or documentation purposes.

Step 4: Clean and Merge Data

Goal: Prepare the data for analysis by cleaning any issues and combining the data sources into a unified dataset. This involves standardizing columns, handling missing or duplicate entries, and merging the account info with monthly performance records.

Standardize Column Names & Types: Review column names in both tables. As noted, your import process likely already cleaned them depending on your default system settings (e.g., “Credit Score” -> Credit_Score). If any names are still not intuitive or contain special characters, rename them using PROC DATASETS or a DATA step. Ensure numeric fields are numeric since SAS might read some fields as character if they contained non-numeric characters. For example, if Credit_Score came in as character, convert it using an INPUT function in a DATA step. Likewise, make sure date fields are properly recognized as SAS date or datetime.
Handle Missing Values & Anomalies: Scan for obvious data issues. If some fields have missing values (e.g., missing credit score for a customer), decide on a strategy – you might impute or flag them. Check for any outliers or invalid values: e.g., negative balances or credit limits. Also ensure logical consistency. For example, if Current_Balance exceeds Credit_Limit for an account, that implies an over-limit scenario (which might be allowed to some degree, but if it’s extreme it could be a data error or a risk indicator in itself). Correct any obvious data entry errors if possible, or at least note them.
Filter Irrelevant Records: If the data includes closed accounts or inactive customers that should not be in this risk monitoring, consider filtering them out. For instance, ensure Account_Status = 'Active' before analysis. You can apply a filter in a DATA step or FedSQL WHERE clause to exclude accounts with statuses like "Closed" or "Charged-Off" if your focus is on current portfolio risk.
Merge Datasets: Now join the account profile data with the monthly performance data to create a combined dataset. The common key is likely Customer_ID. Use PROC FEDSQL (SAS’s ANSI SQL) or a DATA step merge to do this join in CAS. For example, using a DATA Step:

data casuser.accountsaccount_monthly;
    merge casuser.accounts(in=a)
          casuser.performance(in=p);
    by customer_id;

    if(a);
run;

This performs a left join that is functionally equivalent to FEDSQL so that each monthly performance record (p) is augmented with the corresponding account info (a). The result ACCOUNT_MONTHLY will have multiple rows per Customer_ID – one for each month (or period) of performance, carrying along static attributes from the account table. All accounts will be retained, even if they do not have a performance field.

Verify the Merge: After the join, verify the new table. The number of rows in ACCOUNT_MONTHLY should match the number of records in ACCOUNTS, assuming every performance record found a match in accounts. For example, if ACCOUNT had 12,000 rows, the joined table should have 12,000 rows with no duplicates or losses. You can confirm row count with another SELECT COUNT(*) query. Also spot-check a few records: each record in ACCOUNT_MONTHLY should have all expected fields. For a given Customer_ID, check that the values coming from the accounts table (like Credit_Limit, Region) repeat correctly for each monthly record of that customer, and that fields from the performance table (like Balance, Delinquency_Status) vary by month.
Duplicate Customer Check: Ensure the accounts table did not contain duplicate Customer_IDs, which could cause duplicated rows after join. If needed, run a de-duplication check on the accounts data prior to joining:
```
proc sort data=account_monthly nodupkey 
          out=casuser.account_monthly_unique
          dupout=casuser.dup_accounts;
run;
```

This will remove any duplicate Customer_ID entries and output them to dup_accounts for review. Ideally, dup_accounts will be empty. This means each customer appears only once in accounts. If not, you may need to reconcile which record to keep using a strategy that makes the most sense for your application. For example, you could keep only the latest info for that customer.

Result: You now have an integrated dataset ACCOUNT_MONTHLY with each customer’s account details and their monthly performance metrics combined. This is the basis for calculating risk features and metrics.

Step 5: Feature Engineering – Derive Risk Indicators

Goal: Create new variables that capture credit risk indicators, to be used in the dashboard. These features quantify aspects like credit utilization, payment behavior, credit score changes, and expected loss components. Many can be derived from the combined dataset on a per-record or per-customer basis.

Key features to engineer:

Credit Utilization Rate: The ratio of balance to credit limit, indicating how much of the available credit is used.

Utilization Rate = Balance / Credit Limit

This will be a number between 0 and (possibly just above 1 if over-limit). High utilization (close to 1 or >1) is a risk flag as it may indicate the customer is maxing out their credit.
Payment Ratio: The ratio of payment to the amount due.

Payment Ratio = Payment Amount / Due Amount

A value of 1 means the customer paid the full amount due; less than 1 means they paid only part of what was due. Persistent low payment ratios, such as a customer paying only the minumum, can indicate rising risk.
Late Payment / Delinquency Flag: A binary indicator of whether the account is delinquent in that period. If the dataset has Delinquency_Status or can derive from dates, create a flag:

Delinquent Flag = (Payment Date > Due Date or No Payment by Due Date)

This flag marks if the account is past due on payments for that month. We could also derive days past due or months delinquent if data permits, but a simple flag or count of late payments suffices as a feature.
Credit Score Trend Metrics: Since we have credit scores over time, we can create features to capture their change. For instance, for each customer you might compute the change in credit score from a previous point:

Score Change Last 3 Months = Current Credit Score - Credit Score 3 Months Ago

This requires looking at previous records. You can achieve this with data step by sorting by Customer and Date and using lag(). A negative change indicates a drop in creditworthiness. Even without explicitly calculating, the dashboard will show average score trend over time; but you may want a flag like Score_Drop_Flag = 1 if Credit_Score dropped by > X in last Y months to highlight deteriorating customers.
Expected Loss (Expected Credit Loss): This is a composite metric. Expected loss is typically:

Expected Loss = Probability of Default * Exposure * Loss Given Default

We might not have a the probability of default directly, but we could create a model to predict the probability of default. That is beyond the scope of this article, so if you do not have this metric or do not have the data to predict this metric, feel free to skip it.
Other Features: If data allows, consider additional indicators:

Months_on_Books: How long the account has been open. Shorter tenure with good history might be less risky than long-time users with emerging issues, or vice versa; it’s a contextual metric.
Max_Delinquency_Level: If Delinquency_Status has values like “30 days”, “60 days”, “90 days late,” you could encode this to a numeric scale showing the maximum amount deliquency.
Utilization_Category: Bucket the utilization into categories (e.g., low <30%, medium 30–70%, high >70%) for easier filtering in the dashboard.

You can derive these features with SAS code or derive these columns in SAS Visual Analytics. It is generally preferable to use derived columns in SAS Visual Analytics whenevr possible to reduce the overall size of the underlying dataset. For example, if you need to do a simple aggregation, create a flag, create a group such as utilization category, or do some basic math on a few values, you should do this in Visual Analytics as these on-the-fly calculations are done very efficiently. The best thing to do is to try to create the calculation in Visual Analytics first. If you are unable to do the calculation in Visual Analytics, such as with a lag or a lead, or if you need to join tables within Visual Analytics, then you should go back to your data and create those columns prior to loading into Visual Analytics. It is advisable to avoid doing joins in Visual Analytics for dashboards that will be accessed by many users since this creates a new unique table for every user who views the dashboard.

Step 6: Promote the Final Table for Visual Analytics

Goal: Make the prepared data available to SAS Visual Analytics by promoting it to global scope in CAS. By default, tables created in CAS are tied to your session and will disappear when your session ends unless you upload it through the interface which automatically promotes and saves tables to a backing store. In CAS, the default state of a table you create with code is temporary no matter which CASLIB you put it in. Think of it kind of like the WORK directory: all your datasets disappear when your session ends, except it's like this for any CASLIB. Promotion ensures the table persists and is accessible for reporting.

Choose Table to Promote: In our case, the table ACCOUNT_MONTHLY_FEATURES contains the data we want to visualize. We will promote this table. If you also want the static account info or other tables accessible in VA, you can promote those too, but typically one well-prepared table is sufficient.

PROC CASUTIL Promote: Run a CASUTIL procedure to promote the table:
```
proc casutil incaslib='casuser' outcaslib='public';
      promote casdata='account_monthly_features';
quit;
```

This takes the ACCOUNT_MONTHLY_FEATURES table from the CASUSER CASLIB and promotes it within CASUSER. You should change outcaslib to a shared CASLIB like Public if you want other users to see it. Using CASUSER for promotion means it’s global only to you. Think of CASUSER like your SASUSER library: it's a place for your semi-permanent or permanent files, but other people cannot access it.

Verify Promotion: Check the SAS log for a note confirming success. You will see the table when selecting a data source within Visual Analytics. If you do not see it in the data source selector within Visual Analytics, wait about 30-60 seconds, then try again as it may take a moment for the data crawler to pick up the promoted table from SAS code.
Result: The curated data table is now persistent in memory and ready to be consumed by SAS Visual Analytics. Without this step, Visual Analytics will not find the table, and it will be dropped when SAS Studio session ends.

Note that the data is loaded to memory only. If CAS restarts, the data will be unloaded and you will need to run your pipeline again. To ensure that your dashboard will function correctly if this happens, you need to create a persistent data store. Visual Analytics knows where your data is, and if it is unloaded from memory, it will automatically load it. To do this, run the following code:

proc casutil incaslib='public' outcaslib='public';
      save casdata='account_monthly_features' replace;
quit;

Your data is now permanently saved to a backing store (in this case, a physical disk) and can be reloaded by Visual Analytics if it is unloaded.

In Part 2, we'll show you how to build the credit risk dashboard in SAS Visual Analytics.