
Outliers Example of Integration of SAS Visual Analytics with SAS Jobs via Data-Driven Content-Part 5

Started 09-21-2020
Modified 06-28-2023

This article presents another example of a use case that applies the technique you have learned to integrate SAS Visual Analytics (VA), Data-Driven Content (DDC) objects, and SAS jobs. As always, it builds on top of assets and examples discussed previously, in other articles of this series.

 

In this use case you want to eliminate outliers from your time series chart. Spikes in the last period, caused by incomplete data for that period or by other factors, force the Y axis scale to stretch to accommodate them, and that degrades the visualization of every other data point.

 

Picture 1- Chart with outliers

 

As you can see in the screenshot, if it were not for the spikes at the end of the time series in two of the products, the data points in this chart would not be squeezed at the bottom of the chart, and the visualization would be much better.

 

 

Time Series Outliers Example

 

The data used in this example was fabricated by transforming the table PRDSALE from the SASHELP library and provided to you as a SAS dataset named OUTLIER in GitHub. It contains KPI information for five different products on a monthly basis.

 

The goal is to use a SAS job to detect the outliers and remove them. The rules used for defining outliers and what “remove them” really means can vary, but you will keep them very simple in this example:

  • an outlier exists whenever the difference between KPI for consecutive months is greater than 10%
  • the rule to remove the outlier will be setting the outlier KPI to missing.

The time series is grouped by product, so the analysis must be performed individually for each product and the rule should only be applied for the last month in each series.
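The rules above can be sketched in a few lines. This is not the SAS job code from the GitHub project, only a minimal Python illustration of the logic. It assumes the 10% difference is measured relative to the previous month's KPI, that months are stored as sortable "YYYY-MM" strings, and that rows are (product, month, KPI) tuples; the function name and the sample values are made up for the example.

```python
from typing import Optional

# Hypothetical row layout: (product, month as "YYYY-MM", kpi)
Row = tuple[str, str, Optional[float]]

THRESHOLD = 0.10  # the 10% rule described above


def remove_last_month_outliers(rows: list[Row], threshold: float = THRESHOLD) -> list[Row]:
    """Return a copy of rows where, per product, the last month's KPI is set
    to None (missing) if it differs from the previous month by more than
    `threshold` (assumed to be relative to the previous month's KPI)."""
    # Group row positions by product so each series is analyzed on its own.
    by_product: dict[str, list[int]] = {}
    for i, (product, _month, _kpi) in enumerate(rows):
        by_product.setdefault(product, []).append(i)

    cleaned = list(rows)
    for indexes in by_product.values():
        # Sort each series chronologically ("YYYY-MM" strings sort correctly).
        indexes.sort(key=lambda i: rows[i][1])
        if len(indexes) < 2:
            continue  # nothing to compare against
        prev_kpi = rows[indexes[-2]][2]
        product, month, last_kpi = rows[indexes[-1]]
        if prev_kpi is None or last_kpi is None or prev_kpi == 0:
            continue  # assumption: skip when the change cannot be computed
        if abs(last_kpi - prev_kpi) / abs(prev_kpi) > threshold:
            cleaned[indexes[-1]] = (product, month, None)
    return cleaned


# Made-up sample data for two products.
sample = [
    ("Chair", "2020-01", 100.0),
    ("Chair", "2020-02", 104.0),
    ("Chair", "2020-03", 250.0),  # spike in the last month
    ("Desk", "2020-01", 80.0),
    ("Desk", "2020-02", 95.0),    # large jump, but not in the last month
    ("Desk", "2020-03", 99.0),
]
cleaned = remove_last_month_outliers(sample)
```

Only the last Chair row is changed to missing. The jump in the middle of the Desk series is left alone, because the rule is applied to the last month of each series only.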

 

 

Data-Driven Content Code

 

The DDC used in this example is the same one presented in the previous article, without any changes. Enough enhancements were made to that DDC to make it reusable, so you will just leverage it here. The name of the DDC is ProxyDDCForVAJobCASIntegration.

 

 

SAS Job Code

 

As with the previous use case, the job code used in the HelloCASWorld example is a good starting point for implementing this solution. You only need to add the highlighted code in the “Main Processing” block:

 

Picture 2- Changes needed in the “Main Processing” block

 

 

SAS Visual Analytics Report

 

The report was also kept simple. The time series chart at the top displays the data coming from the original data source called OUTLIER, which was loaded in memory in the Public CAS library. The time series at the bottom contains the improved visualization, based on the table created on the fly by the SAS job. This table was named OUTLIER_CLEAN. The DDC object is in the blank space at the top right of the report. The three data items applied in the time series Roles tab were also applied to the DDC: Month, Product, KPI. The slider control object has a filter action to the DDC and the first time series.

 

Whenever you modify the selection in the slider, the DDC receives new data, and responds by executing the job, which in turn loads in memory a new table OUTLIER_CLEAN without the outliers. The time series chart at the bottom has automatic refresh turned on, so at the specified interval it reads the new data and displays the results.

 

Picture 3- VA sample report

 

When designing a report like this from scratch, you must remember a few things discussed in the previous articles, such as:

  1. The job’s output table in the CASUSER library doesn’t exist until you execute the job successfully at least once. Only after the table is available in the CASUSER library can you add it to the report and use it as the source of other VA report objects, like the second time series chart.
  2. Tables from the CASUSER library that are added to VA reports keep a reference to the report author. This reference must be removed before the report is shared with a broader audience. As the report designer, you do that by editing the report BIRD XML:
      1. Save the VA report
      2. Hit Ctrl+Alt+B to open the SAS Visual Analytics Diagnostics window
      3. Make sure the BIRD tab across the top and the XML button are selected (these are the defaults)
      4. Click on the Data icon on the left
      5. Scroll down searching for the reference to the CASUSER library
      6. Remove (<userid>) that is appended to the CASUSER library

         

        Picture 4- Removing userid from CASUSER

         

      7. Click the Load button on the top left
      8. Save the VA report

    If you have the right privileges for exporting and importing content in SAS Environment Manager (menu option Manage Environment), you can replace all the steps above by exporting the report and then reimporting it. During the import process, replace the target CASUSER(<userid>) library with just CASUSER in the GUI.

  3. You must set automatic refresh in the objects that depend on the table loaded in the CASUSER library, otherwise they will not refresh as the data changes with each job execution. You do that in the object’s Options tab by turning on the option labeled either “Automatically refresh object” or “Periodically reload object data”, depending on the VA release. In this example, the object that needs automatic refresh is the time series chart at the bottom of the report. Remember that automatic refresh only works when the report is in view mode.
  4. You must define at least the parameters _job_name and _job_output_cas_table in the VA report and assign a value to them. The parameter _job_executing_message is optional. All of them are character parameters and their names are case sensitive. For this example, the following values were assigned:

    _job_output_cas_table: "OUTLIER_CLEAN"

    _job_executing_message: "Removing outliers..."

    _job_name: "/Public/Jobs/SAS Communities/Outlier"

  5. VA parameters are only passed in the JSON message to the DDC if the parameters affect the data that is being passed to the DDC. This is further explained in the article Using parameters with Data-Driven Content in SAS Visual Analytics, and in this example you guarantee that the parameters are being passed to the DDC by setting a dummy advanced filter in the DDC object that looks like this:

 

Picture 5- Dummy advanced filter applied to the DDC object to guarantee the parameters are being passed in the VA message
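The parameter requirements in items 4 and 5 can be illustrated with a small check. The sketch below is in Python purely for illustration (the DDC itself runs in the browser), and it assumes the VA message carries a `parameters` list of objects with `label` and `value` keys. That layout and the helper name are assumptions for this example, so compare them with the message your own DDC actually receives.

```python
import json

# Parameter names from this article; they are case sensitive.
REQUIRED = ("_job_name", "_job_output_cas_table")
OPTIONAL = ("_job_executing_message",)


def extract_job_parameters(message_json: str) -> dict[str, str]:
    """Return the job-related parameters found in a VA message, or raise
    ValueError if a required one is missing (hypothetical helper)."""
    message = json.loads(message_json)
    # Assumed layout: "parameters": [{"label": ..., "value": ...}, ...]
    found = {p["label"]: p["value"] for p in message.get("parameters", [])}
    missing = [name for name in REQUIRED if name not in found]
    if missing:
        raise ValueError("Missing required parameters: " + ", ".join(missing))
    return {name: found[name] for name in REQUIRED + OPTIONAL if name in found}


# Example message using the values assigned in this article.
message = json.dumps({
    "parameters": [
        {"label": "_job_name", "value": "/Public/Jobs/SAS Communities/Outlier"},
        {"label": "_job_output_cas_table", "value": "OUTLIER_CLEAN"},
        {"label": "_job_executing_message", "value": "Removing outliers..."},
    ]
})
params = extract_job_parameters(message)
```

If the dummy advanced filter is not in place, the parameters do not reach the DDC at all, which in this sketch corresponds to an empty or absent `parameters` list and therefore a `ValueError`.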

 

 

Deploying This Example

 

All files and data used in this example are available for download from the GitHub project sas-visualanalytics-thirdpartyvisualizations, under the folder samples/IntegrationWithSASJobs.

 

The example requires the dataset OUTLIER, provided via GitHub, to be loaded in memory in the Public CAS library.

 

Deployment steps:

  1. Download the GitHub project
  2. Unzip it into your Web Server document folder. These instructions assume you have unzipped the GitHub project in a folder called github in the Web Server document root folder. You will only need the content of folders github/utils and github/samples/IntegrationWithSASJobs.
  3. While logged in to the SAS Job Execution Web application, do the following:
    1. Create a new job in a folder of your preference (e.g. /Public/Jobs/SAS Communities) and name it Outlier (see the first article for more details if needed)
    2. Open the job for editing, then copy and paste the content of file github/samples/IntegrationWithSASJobs/5.OutliersUseCase/Outliers.sas
    3. Save the job
    4. Add job parameter _action=FORM as discussed in the first article
    5. Add another job parameter _form=/Public/Jobs/SAS Communities/

       

      Picture 6- SAS job fixed parameters

    6. If you have already deployed the Pareto example from the previous article, skip to #4, otherwise:
      1. Create a new job form as a separate content, as explained in the first article, and name it ProxyDDCForVAJobCASIntegration
      2. Open the job form for editing, then copy and paste the content of file github/samples/IntegrationWithSASJobs/4.ParetoUseCase/ProxyDDCForVAJobCASIntegration.html (1). Yes, this is the same file used in the previous use case, as it was designed to be reusable.
      3. Make changes to the host name (search for your.host.name and replace it accordingly) and path of src on lines 20, 21, and 22 (1):

         

        Picture 7- Modify src appropriately

         

      4. Save the job form
  4. While logged in to SAS Visual Analytics, do the following:
    1. Create a new empty report
    2. With the report open, hit Ctrl+Alt+B to bring up the SAS Visual Analytics Diagnostics window.
    3. With the BIRD tab selected across the top (that’s the default), replace the BIRD XML content with the content of the file github/samples/IntegrationWithSASJobs/5.OutliersUseCase/VA-DDC-Job Outliers use case.xml
    4. Hit Load (this will close the diagnostics window)
    5. With the DDC object at the top right of the report selected, go to the Options pane on the right and fix the URL entry: replace your.host.name accordingly and make sure the path to the job is correct (same value entered in #3a)
    6. Change the value assigned to the parameter _job_name if necessary, so that it matches your environment (same value entered in #3a)
    7. Save the report

Note: (1) Starting with release 2023.06, the examples that used to inherit jQuery from their parent (the SAS Visual Analytics web application) no longer work, so we have provided replacement files. These replacement files have the same names as the originals, but they end with .v4. Because of that, some references to line numbers may not exactly match in those .v4 files.
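Several of the steps above ask you to search for your.host.name and replace it with the host of your environment. If you prefer to do that before pasting the content, a minimal Python sketch could look like the one below. The function name, the sample `<script>` line, the file name `example.js`, and the host `va.example.com` are all hypothetical; only the placeholder string comes from the files themselves.

```python
PLACEHOLDER = "your.host.name"


def replace_host(text: str, host: str) -> tuple[str, int]:
    """Replace every occurrence of the placeholder with `host` and return
    the new text together with the number of replacements made."""
    if not host or host == PLACEHOLDER:
        raise ValueError("Provide the real host name of your environment")
    count = text.count(PLACEHOLDER)
    return text.replace(PLACEHOLDER, host), count


# Hypothetical line similar to the src references in the job form.
original = '<script src="https://your.host.name/github/utils/example.js"></script>'
updated, replaced = replace_host(original, "va.example.com")
```

A replacement count of zero is a useful warning sign: it usually means the file was already edited, or that you are looking at the wrong file.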

 

We could have exported all the content as a package, but this would require special privileges in order to import it. Sharing the example as standalone files will give you the opportunity to better explore the SAS Job Execution Web application, familiarize yourself with the content, and understand how they are connected.

 

Note: The first time a user opens this report it fails because the CASUSER table doesn’t exist yet. Reopening the report will work, as well as any subsequent access to the report. This will happen to any other report that depends on a table that is being dynamically loaded in memory when the report opens.

 

 

Next Steps

 

This is the last article planned for this series. I’ve learned a lot while working on the examples used here, and I refer to them very often whenever I need a template for a new use case. I hope you have found them useful and I hope they can be used as a foundation for your needs as well. As this is an evolving area, I expect to be back soon with a different approach to share with you. Meanwhile, I recommend exploring the references below to learn more about Data-Driven Content and SAS Jobs.

 

 

References

 

 

 
