
Simplifying Data Ingestion in ALM on Viya: Best Practices


The purpose of this post is to guide you through the process of loading and ingesting data in Asset Liability Management (ALM) on SAS Viya. Whether you’re new to SAS Viya or looking to improve your data ingestion workflow, this guide will provide you with practical tips and best practices to simplify your data management process. By the end of this post, you’ll have a clear understanding of how to efficiently handle your data, making it readily available for your financial analysis and decision-making.

 

1. Understanding the Data Requirements

Before ingesting data, it's important to identify the types of data required for ALM processes. These can come in the following forms (a simplified example table is sketched after the list):

 

  • Financial Instruments Data: Information about fixed and floating rate instruments, derivatives, loans, deposits, etc.
  • Market Data: Yield curves, interest rates, exchange rates, economic indicators.
  • Transaction Data: Detailed records of financial transactions.
  • Credit Risk Data: Credit ratings, default probabilities, exposure data.
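To make these categories concrete, here is a purely illustrative sketch of what a staged financial instruments table might look like. The column names and values are hypothetical; the actual input schemas expected by ALM on Viya are defined in the product documentation.

data work.instruments_sample;
   length instrument_id $10 instrument_type $20 rate_type $8;
   infile datalines dsd;
   input instrument_id $ instrument_type $ rate_type $ notional interest_rate maturity_date :yymmdd10.;
   format maturity_date date9. notional comma15.2;
   datalines;
INST001,Loan,Fixed,1000000,0.045,2030-06-30
INST002,Deposit,Floating,500000,0.021,2026-12-31
;
run;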

 

2. Preparing Your Data Sources

Ensuring that your data sources are prepared and accessible is an important step in the data ingestion process. This involves understanding the various sources your data can come from and the best practices for preparing these sources for integration with SAS Viya. Some of the most common data sources are:

 

  1. Databases:
    • Relational Databases: These include Oracle, SQL Server, PostgreSQL, MySQL, and more. They are commonly used to store structured data.
    • NoSQL Databases: Databases like MongoDB, Cassandra, and Redis are used for unstructured or semi-structured data. They are highly scalable and can handle large volumes of data.
  2. Flat Files:
    • CSV (Comma-Separated Values): Widely used for data exchange due to its simplicity and compatibility with various tools.
    • Excel Files: Common in business environments for data analysis and reporting.
    • JSON (JavaScript Object Notation): Often used for data interchange between web applications and servers due to its lightweight and human-readable format.
  3. APIs (Application Programming Interfaces):
    • APIs allow access to data from various platforms, including financial data providers, market data feeds, and internal enterprise systems. They are used for real-time data integration and dynamic data retrieval (see the PROC HTTP sketch after this list).
  4. Data Lakes:
    • Data lakes are centralized repositories that allow you to store all your structured and unstructured data at any scale. Technologies like Hadoop and cloud-based data lakes (e.g., AWS S3, Azure Data Lake) are popular choices.
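For the API route, a minimal sketch using PROC HTTP and the JSON libname engine is shown below. The endpoint URL is a hypothetical placeholder; real providers will typically also require authentication headers or tokens.

/* Write the HTTP response to a temporary file */
filename resp temp;

proc http
   url="https://api.example.com/v1/yield-curves"   /* hypothetical endpoint */
   method="GET"
   out=resp;
run;

/* Read the JSON response directly with the JSON libname engine */
libname curves json fileref=resp;

proc print data=curves.alldata (obs=10);
run;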

 

3. Best Practices for Preparing Data Sources

  1. Data Quality Assurance:
    • Validation: Ensure that the data meets the required standards and formats before ingestion. This includes checking for missing values, outliers, and other inconsistencies.
    • Cleansing: Clean the data to remove errors and irrelevant information. This may involve filling missing values, correcting data types, applying business rules, and removing duplicates (a short sketch after this list illustrates these checks).
  2. Data Transformation:
    • Normalization: Standardize the data to a common format or structure. This helps in maintaining consistency and simplifies the analysis.
    • Aggregation: Summarize the data at required levels to meet the needs of ALM models. For instance, aggregate transaction data by date, instrument type, or geographical region.
  3. Data Security:
    • Encryption: Ensure that sensitive data is encrypted both at rest and in transit. This protects the data from unauthorized access and breaches.
    • Access Control: Implement role-based access controls to restrict data access to authorized personnel only.
  4. Documentation:
    • Metadata: Maintain comprehensive metadata for all data sources. This includes information about the data origin, structure, and any transformations applied.
    • Data Lineage: Track the data flow from the source to the final destination. This helps in understanding the data transformations and ensures transparency.
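The following sketch illustrates the validation, cleansing, and aggregation practices above with standard SAS procedures. The table and column names (work.financial_data, work.transactions, txn_date, amount) are hypothetical placeholders.

/* Validation: profile missing values in the staged table */
proc means data=work.financial_data n nmiss;
run;

/* Cleansing: remove exact duplicate records */
proc sort data=work.financial_data out=work.deduped nodupkey;
   by _all_;
run;

/* Aggregation: summarize transactions by date and instrument type */
proc sql;
   create table work.txn_summary as
   select txn_date, instrument_type, sum(amount) as total_amount
   from work.transactions
   group by txn_date, instrument_type;
quit;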

 

4. Using SAS Data Connectors

SAS Viya provides various data connectors that facilitate the integration and ingestion of data from multiple sources. These connectors are essential for importing data into the SAS environment, ensuring compatibility, and enhancing the efficiency of your data workflows. Here is an in-depth look at some of the most commonly used data connectors and how they can be utilized in the context of ALM.

 

Types of SAS Data Connectors

  1. SAS/ACCESS: SAS/ACCESS engines provide seamless read, write, and update access to a variety of relational and non-relational databases. This allows you to interact with your data directly from within the SAS environment. Here are some of the commands you can use to connect to different databases:

[Image: 01_MV_Databases.png — LIBNAME statement examples for common databases]
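As a minimal sketch, here are LIBNAME statements for a few common relational databases. The librefs, credentials, and server names are placeholders, and each engine requires the corresponding SAS/ACCESS product to be licensed and configured.

/* Oracle: path= is the Oracle net service name */
libname orlib oracle user='username' password='password' path='oracleserver';

/* PostgreSQL */
libname pglib postgres server='pgserver' port=5432
   user='username' password='password' database='mydb';

/* Generic ODBC connection (for example, SQL Server) using a configured DSN */
libname odblib odbc dsn='mydsn' user='username' password='password';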

 

  2. SAS Data Connectors: These connectors are designed to work with big data platforms and cloud storage solutions, enabling high-performance data access and manipulation.
    • Hadoop: Hadoop is widely used for storing large datasets in a distributed environment. SAS Viya's Hadoop connector allows you to efficiently read and write data to and from Hadoop.
    • SAP HANA: SAP HANA is an in-memory database and application development platform. The SAS connector for SAP HANA provides high-speed data access.
    • Cloud Storage: Connectors for object storage such as Amazon S3 and Azure Data Lake enable direct access to files stored in the cloud.

 

Setting Up and Using SAS Data Connectors

  1. Install and Configure Connectors: Ensure that the necessary SAS/ACCESS engines and data connector modules are installed and configured in your SAS Viya environment. Configuration details can often be found in the SAS documentation or provided by your database administrator.
  2. Establish Connections: Use the appropriate LIBNAME or FILENAME statements to establish connections to your data sources. Each connector may require specific connection parameters such as server addresses, user credentials, and database names.
  3. Data Access and Manipulation: Once the connection is established, you can use standard SAS procedures to access and manipulate the data. This includes PROC SQL, the DATA step, and other procedures for data transformation, analysis, and reporting. A caslib-based alternative for loading data directly into CAS is sketched after these steps.
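In SAS Viya you can also define a caslib that points directly at the database, so tables load straight into CAS without an intermediate WORK copy. A minimal sketch, assuming a PostgreSQL source (server, credentials, and table names are placeholders):

/* Start a CAS session */
cas mysess;

/* Define a caslib over the PostgreSQL source */
caslib pgcas datasource=(srctype="postgres"
   server="pgserver"
   username="username" password="password"
   database="mydb");

/* Load a source table into CAS memory */
proc casutil incaslib="pgcas" outcaslib="pgcas";
   load casdata="financial_instruments" casout="financial_instruments" replace;
quit;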

 

Example connection strings for different data sources (server names, credentials, and keys are placeholders):

Hadoop:
libname myhadoop hadoop server='hadoopserver' schema='hadoopschema';

SAP HANA:
libname saphana sasiohna user='username' password='password' server='saphanaserver' schema='saphanaschema';

Amazon S3:
filename s3file s3 's3://mybucket/mydata.csv' access_key_id='your_access_key' secret_access_key='your_secret_key';

Azure Data Lake Storage:
libname adls azure storage_account='your_storage_account' storage_key='your_storage_key';
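Once the S3 fileref above is assigned, it can be read like any other external file. A short sketch (the column layout is hypothetical):

data work.s3_data;
   infile s3file dsd firstobs=2;
   input instrument_id :$10. notional interest_rate;
run;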

 

 

Example Workflow with SAS Data Connectors

Here is an example of using a SAS data connector to integrate data from an Oracle database:

 

Step 1: Establish Connection

libname oradb oracle user='username' password='password' path='oracleserver';

 

Step 2: Load Data into SAS Viya

proc sql;
   create table work.financial_data as
   select * from oradb.financial_instruments;
quit;

 

Step 3: Data Transformation and Cleansing

data work.cleaned_data;
   set work.financial_data;
   /* Replace missing interest rates with 0 */
   if interest_rate = . then interest_rate = 0;
   /* Derive a numeric date, assuming date is a character string like '2030-06-30' */
   effective_date = input(date, yymmdd10.);
   format effective_date date9.;
run;

 

Step 4: Store Data in Viya CAS

proc casutil;
   load data=work.cleaned_data casout="financial_data" replace;
quit;
 
Example Workflow with Excel

proc import datafile="/mnt/data/financial_instruments.xlsx"
   out=work.financial_data
   dbms=xlsx
   replace;
   sheet="Sheet1";
run;

 

data work.cleaned_data;
   set work.financial_data;
   /* Apply transformations as needed */
run;

 

proc casutil;
   load data=work.cleaned_data casout="financial_data" replace;
quit;

 

5. Conclusion

Ingesting data into SAS Viya for ALM involves connecting to data sources, loading and transforming the data, and then storing it in CAS for in-memory analytics. This process ensures that your financial and risk management models, business evolution plans, and allocation and segmentation schemes all have accurate, up-to-date data for effective calculation, analysis, and reporting.

 

For more information, please check out the official Asset and Liability Management documentation here.

 

 

Find more articles from SAS Global Enablement and Learning here.
