SAS Model Studio is a low-code/no-code application with an intuitive interface for building and deploying machine learning models. It supports several products, including SAS Visual Data Mining and Machine Learning, SAS Visual Text Analytics, and SAS Visual Forecasting, which makes it a versatile tool for data scientists and analysts.
Although SAS Model Studio simplifies building and deploying machine learning models, like any software it can still encounter issues that are difficult to resolve independently. In this article, we concentrate on the SAS Visual Data Mining and Machine Learning (VDMML) product and explain how to collect the pertinent information and logs to send to SAS Technical Support. We also explore how the hmeq data set can help you reproduce problems so that they can be addressed more quickly.
Getting started with troubleshooting SAS VDMML issues
When troubleshooting SAS VDMML issues with SAS Technical Support, the first step is to collect basic information such as the site number and product version. See Contact SAS Technical Support to learn how to find them. Once you have this information, the next step is to enable debug reporting to generate logs for troubleshooting.
Enabling Debug Reporting in SAS VDMML
The Logging window lets you control how much information is written to the logs. You can open it either for a single project (Settings icon -> Project settings) or for all your projects (User icon -> Settings).
To avoid generating too much content in the logs, unless instructed otherwise, select only "Enable debug reporting" and then "Add time information and headers". Refer to Enable Debug Reporting in the User's Guide for more information.
Collecting relevant information and logs
To troubleshoot issues with SAS Technical Support effectively, it's important to collect relevant information and logs to identify the root cause of the problem. Follow these steps:
Determine if the issue is new or existing to identify potential causes.
Provide a screenshot of the issue, for example, a popup window containing the error message or a pipeline containing the failing node, to help visualize the problem and improve communication.
Collect information such as date, time, job, and user to identify the specific problem area.
Provide a step-by-step guide on how to reproduce the issue.
Identify recent changes to the environment that may be related to the issue. Changes to the system, server, or hardware can sometimes cause unexpected issues, so it's essential to check for any recent changes that may have triggered the problem.
If logs are available, collect and review the appropriate logs to gather more information about the issue. For example, the Project Data Advisor log can provide information about issues with creating a project, while the node log can provide information about issues with running a node. If logs are not available, notify SAS Technical Support.
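The steps above can be captured in a small checklist structure before you contact SAS Technical Support. The following Python sketch is only an illustration: the `IssueReport` class, its field names, and the output layout are hypothetical and are not part of SAS Model Studio or any SAS API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IssueReport:
    """Hypothetical container for the details listed in the steps above."""

    is_new_issue: bool              # new issue, or one seen before?
    occurred_at: str                # date and time of the failure
    job: str                        # job or pipeline involved
    user: str                       # user who hit the problem
    steps_to_reproduce: List[str]
    recent_changes: List[str] = field(default_factory=list)
    screenshots: List[str] = field(default_factory=list)  # file names
    logs: List[str] = field(default_factory=list)         # file names

    def missing_items(self) -> List[str]:
        """Return the checklist items that are still empty."""
        missing = []
        if not self.steps_to_reproduce:
            missing.append("steps to reproduce")
        if not self.screenshots:
            missing.append("screenshot")
        if not self.logs:
            missing.append("logs")
        return missing

    def to_text(self) -> str:
        """Render the report as plain text for a support ticket."""
        lines = [
            f"Issue type: {'new' if self.is_new_issue else 'existing'}",
            f"When: {self.occurred_at}",
            f"Job: {self.job}",
            f"User: {self.user}",
            "Steps to reproduce:",
        ]
        lines += [f"  {i}. {step}"
                  for i, step in enumerate(self.steps_to_reproduce, 1)]
        lines.append("Recent changes: "
                     + ("; ".join(self.recent_changes) or "none known"))
        lines.append("Screenshots: " + (", ".join(self.screenshots) or "none"))
        lines.append("Logs: " + (", ".join(self.logs) or "not available"))
        return "\n".join(lines)
```

Calling `missing_items()` before you send the report is a quick way to see which details still need to be gathered. If the logs entry stays empty, the rendered text says so explicitly, which matches the advice to tell SAS Technical Support when logs are not available.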
To find the logs of a SAS VDMML project, open the project, click the Settings button in the upper right corner of the window, and then click Project logs.
The types of logs required will vary depending on the issue at hand. Here are some examples of when and where to find different logs.
Troubleshooting issues with creating a project: If a project fails to be created, and the Data tab does not appear on the project page, then the Project Data Advisor log can provide insight into the creation of the data source's metadata such as variable names, labels, type, role, and level assignments.
Troubleshooting issues with running the Data node: If the Data tab (that contains metadata information) has been created, but the Data node is running indefinitely or fails to run, then the Project Partitioning log can provide more information about the creation of (train, validate, and test) data partitions.
Troubleshooting issues with running a node: If a project is created successfully, but a node fails to run, then the node log can provide information about the SAS code that runs behind the scenes. Right-click the node and select Log from the pull-down menu to get the node's log.
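The three cases above amount to a lookup from symptom to log. The sketch below encodes that mapping in Python for illustration only. The `Symptom` enum and `suggest_log` function are hypothetical names, and the log names are taken from the descriptions above.

```python
from enum import Enum


class Symptom(Enum):
    """Hypothetical labels for the three failure cases described above."""
    PROJECT_NOT_CREATED = "project fails to be created; Data tab missing"
    DATA_NODE_STUCK = "Data node runs indefinitely or fails to run"
    NODE_FAILS = "project created, but a node fails to run"


# Which log to collect first for each symptom, per the text above.
LOG_FOR_SYMPTOM = {
    Symptom.PROJECT_NOT_CREATED: "Project Data Advisor log",
    Symptom.DATA_NODE_STUCK: "Project Partitioning log",
    Symptom.NODE_FAILS: "node log (right-click the node and select Log)",
}


def suggest_log(symptom: Symptom) -> str:
    """Return the log that the article recommends for this symptom."""
    return LOG_FOR_SYMPTOM[symptom]
```

For example, `suggest_log(Symptom.DATA_NODE_STUCK)` returns `"Project Partitioning log"`.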
Additional logs might be available from Project logs: Log for Project Retraining, Log for Project Batch Retraining, Log for Initiating Project Batch Retraining, and Log for Scoring and Assessing. These logs can provide more information about specific operations related to each log.
If a retraining job was run through Batch API code, then the Project Batch Retraining and the Project Retraining logs are generated.
If the project data source is replaced, then the Initiating Project Batch Retraining log is generated.
After scoring holdout data, you might find multiple logs prefixed by "Log for Scoring and Assessing". For instance, let's assume there are two pipelines in your project where the winning model from the first pipeline is the Gradient Boosting node and the winning model from the second pipeline is the Logistic Regression node. The following three logs are generated:
Reproducing SAS VDMML issues with sample data
The sample home equity (hmeq) data set is an excellent data source for testing SAS VDMML issues. This data set includes information about 5,960 fictitious mortgages, each representing an applicant for a home equity loan. The binary target BAD indicates whether an applicant in the training data eventually defaulted or was ever seriously delinquent. When you can reproduce the issue using the sample hmeq data, it suggests that the issue is less likely to be data-dependent and more likely related to the environment. When you cannot reproduce the issue using the sample hmeq data, additional logs from the Viya services and CAS server might be required to diagnose the problem further. You can download the hmeq data from the Example Data Sets for the SAS® Viya® Platform page.
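Before you use the downloaded file to reproduce an issue, it can help to confirm that the copy is intact. The following Python sketch is a minimal check, assuming the data has been saved as a CSV file with a header row that includes the BAD column. The function name `summarize_hmeq` and the constant `EXPECTED_ROWS` are hypothetical, and the expected row count comes from the description above.

```python
import csv
from collections import Counter
from typing import Dict, Iterable, Tuple

EXPECTED_ROWS = 5960  # number of mortgages stated in the description above


def summarize_hmeq(lines: Iterable[str]) -> Tuple[int, Dict[str, int]]:
    """Count the data rows and tally the values of the BAD target.

    `lines` is any iterable of CSV text lines, such as an open file.
    """
    reader = csv.DictReader(lines)
    if reader.fieldnames is None or "BAD" not in reader.fieldnames:
        raise ValueError("expected a header row containing the BAD column")
    counts = Counter(row["BAD"] for row in reader)
    return sum(counts.values()), dict(counts)
```

A possible usage, assuming the file was saved locally as `hmeq.csv` (the file name is an assumption): open it with `open("hmeq.csv", newline="")`, pass the file object to `summarize_hmeq`, and compare the returned row count with `EXPECTED_ROWS`. A mismatch suggests a truncated or altered copy, which would make the reproduction less reliable.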
In summary, SAS Model Studio is a low-code/no-code application that simplifies the process of building and deploying machine learning models, and it supports products including SAS Visual Data Mining and Machine Learning (VDMML). However, issues may still arise, and collecting relevant information and logs is essential to troubleshoot them effectively. To do so, determine if the issue is new or existing, provide a screenshot of the problem, collect information such as date and time, provide a step-by-step guide on how to reproduce the issue, identify recent changes to the environment that may be related to the issue, and gather the appropriate logs. The sample home equity (hmeq) data set is an excellent resource for testing SAS VDMML issues.