Pareto Example of Integration of SAS Visual Analytics with SAS Jobs via Data-Driven Content-Part 4

2 Likes

In this article, you are going to apply what you have learned about the integration between SAS Visual Analytics (VA), Data-Driven Content (DDC) objects, and SAS jobs to solve a business use case.

It’s very straight forward to create a compounded chart like the one in the screenshot below, showing bars for each country GDP and a line for the cumulative GDP as percentage, but the challenging part is to be able to filter that chart to only display the countries that are responsible for a certain percentage of the total, let’s say 80%, to create a Pareto chart.

Picture 1- Pareto chart for GDP

GTP (cumulative % of total) is an aggregated measure that must be computed from the highest to the lowest GDP, which is easily done by sorting the chart by GDP. An attempt to show only countries where the GTP (cumulative % of total) is less or equal to a certain value (e.g. list of countries that are responsible for 80% of the total GDP) will produce the wrong results if you try to use ranks, or it will not be possible if you try to use filters. Filtering on aggregated measures is not supported in this case and ranking would return a percentage of the total number of countries based on their GTP, not the countries responsible for a percentage of the total GDP. Using "New data from aggregation" to create an aggregated data source will not work either because of the sequential nature of the cumulative calculation, and while that sequence can be easily defined in a chart, it doesn’t exist for aggregated tables computed in VA. The solution is to pass the data to a job that will create a new table on the fly with the percentages accumulated in the right order. That table can then be filtered because the cumulative percentage is no longer an aggregated measure calculated in VA.

Pareto with Ranking Example

This example uses data downloaded from The World Bank. The table contains GDP information for countries from years 1960 to 2019, and it was exported as an Excel spreadsheet.

Picture 2- GDP source data from The World Bank

In SAS Data Explorer (menu option Manage Data), make sure you select the correct cells from the spreadsheet while importing it, as in the next screenshot below, as well as the other values highlighted in yellow. I suggest opening the Excel file to confirm that those values remain the same.

Picture 3- Import options for Excel data

By the time the data used in the report was downloaded, year 2019 didn’t have any value, so this column was not used. Also, that original table was transposed, so that the final table had years as rows instead of columns:

Picture 4- Transposed input table

The table can be transposed in SAS Data Studio (menu option Prepare Data). Just use the Transpose transformation with the following column assignment:

ID Columns: Indicator Name
Transpose Columns: all year columns from 1960 to 2018 (do not include 2019 as it is empty)
Group By Columns: Country Name, Country Code

Don’t forget to rename the _NAME_ column to Year:

Picture 5- SAS Data Studio transpose transformation

The final transposed table was saved in the Public library and called GDP_COUNTRY_YEAR_TRANSPOSED. This is the source data for your report.

Data-Driven Content Code

As explained in the previous article, enhancements were made to the HelloCASWorld example to make the DDC code reusable with any job that comply with the interface defined:

Job receives a parameter castab from the DDC
Job returns a JSON message back to the DDC with the following format: {success: …, retcode: …, message: …}

This means the DDC created in the previous article can be used without any modification in this example.

Because this DDC is to be used with many different jobs, in the deployment section below you will see that you are going to be saving this code as a separated job form with a generic name: ProxyDDCForVAJobCASIntegration, indicating that this DDC is to be used as a proxy to connect VA with jobs, and it’s not attached to a particular job.

SAS Job Code

Starting with the template job code from the previous article example, you will only need to add the highlighted code in the “Main Processing” block:

Picture 6- Changes needed in the “Main Processing” block

The PROC CONTENTS and PROC SQL are used to acquire the categorical and numeric column names necessary for PROC FREQ, which is what calculates the cumulative values, percentages, and cumulative percentages. The data step after the PROC FREQ is to adjust the percentages as numbers between 0-1, so that a VA percentage format can be applied.

Notice that the job is not applying business names to the columns generated by PROC FREQ and saved in memory in the output table. It’s your responsibility, as the report designer, to do so.

SAS Visual Analytics Report

The report in this example contains a drop-down object to select the year. This drop-down object filters the DDC object, which is the blank space on the top right. Both objects use data coming from the main source table called GDP_COUTRY_YEAR_TRANSPOSED. The role assignment in the DDC object are Country Name and GDP (current US$). The bar-line chart and the list table below it are fed by the table generated by the job, named PARETO_RANK. The slider object has a single value numeric VA parameter formatted as percentage, with min and max values set to 0 and 1 respectively, which selects the percentage to be used as the filter. The parameter is used in an advanced filter to subset the PARETO_RANK data in both the bar-line chart and the list table. Notice that only when the year drop-down object is changed, the DDC and the job are re-executed. The slider object filters the job’s output table PARETO_RANK without having to re-create the table.

Picture 7- VA sample report

When designing a report like that from the scratch you must remember of a few things discussed in the previous article, such as:

The job’s output table in the CASUSER library doesn’t exist until you execute the job successfully at least once. Only after the table is available in the CASUSER library, you can add it into the report and use it as the source of other VA report objects, like the bar-line chart and the list table.
Tables from CASUSER library that are added to VA reports keep reference to the report author. This reference must be removed before the report is shared with a broader audience. As the report designer, you do that by editing the report BIRD XML:

a. Save the VA report

b. Hit Ctrl+Alt+B to open the SAS Visual Analytics Diagnostics window

c. Make sure the BIRD tab across the top and the XML button are selected (these are the defaults)

d. Click on the Data icon on the left

e. Scroll down searching for the reference to the CASUSER library

f. Remove (<userid>) that is appended to the CASUSER library

Picture 8- Removing userid from CASUSER

g. Click the Load button on the top left

h. Save the VA report

If you have the right privileges for exporting and importing content in SAS Environment Manager (menu option Manage Environment), you can replace all the steps above by exporting the report and them reimporting it. You just need to replace the target CASUSER(<userid>) library with just CASUSER in the GUI during the import process.

You must set automatic refresh in the objects that depend on the table loaded in the CASUSER library, otherwise they will not refresh as the data changes with the job execution. You do that in the object’s Option tab by turning on the option labeled either “Automatically refresh object” or “Periodically reload object data” depending on the VA release. In this example, the objects that need automatic refresh are the bar-line chart and the list table at the bottom of the report. Remember that automatic refresh only works when the report is in view mode.
You must define at least the parameters _job_name and _job_output_cas_table in the VA report and assign a value to them. The parameter _job_executing_message is optional. All of them are character parameters and their names are case sensitive. For this example, the following values were assigned:

_job_output_cas_table: "PARETO_RANK"

_job_executing_message: "Calculating Pareto..."

_job_name: "/Public/Jobs/SAS Communities/Pareto”

VA parameters are only passed in the JSON message to the DDC if the parameters affect the data that is being passed to the DDC. This is further explained in the article Using parameters with Data-Driven Content in SAS Visual Analytics, and in this example you guarantee that the parameters are being passed to the DDC by setting a dummy advanced filter in the DDC object that looks like this:

Picture 9- Dummy advanced filter applied to the DDC object to guarantee the parameters are being passed in the VA message

In addition to those, after the job’s output table is added to the report for the very first time, you must give the data items a more business-oriented labels. As mentioned previously, the job used in the example doesn’t apply user-friendly labels to the columns calculated in PROC FREQ. Knowing what each column represents and labeling them appropriately is important. In this example, the output columns from PROC FREQ were labeled in the VA report as follows:

COUNT: “GDP (current US$)”
PERCENT: “GDP (% of total)”
CUM_FREQ: “GDP (cumulative total)”
CUM_PCT: “GDP (cumulative % total)”

Deploying This Example

All files used in this example are available for downloading from the GitHub project sas-visualanalytics-thirdpartyvisualizations, under folder called samples/IntegrationWithSASJobs.

It requires the data downloaded from The World Bank, transformed and loaded into memory in the Public caslib as explained in the beginning of this article.

Deployment steps:

Download the GitHub project
Unzip it into your Web Server document folder. These instructions assume you have unzipped the GitHub project in a folder called github in the Web Server document root folder. You will only need the content of folders github/utils and github/samples/IntegrationWithSASJobs.
Logged into SAS Job Execution Web application, do the following:
1. Create a new job in a folder of your preference (e.g. /Public/Jobs/SAS Communities) and name it Pareto (see first article for more details if needed)
2. Open the job for edition, copy & paste the content of file github/samples/IntegrationWithSASJobs/4.ParetoUseCase/Pareto.sas
3. Save the job
4. Add job parameter _action=FORM as discussed in the first article
5. Add another job parameter _form=/Public/Jobs/SAS Communities/ProxyDDCForVAJobCASIntegration
  This tells the job to transfer execution to this form, whenever the _action parameter is set to FORM
  
  Picture 10- SAS job fixed parameters
6. Create a new job form as a separate content, as explained in the first article, and name it ProxyDDCForVAJobCASIntegration
7. Open the job form for edition, copy & paste the content of file github/samples/IntegrationWithSASJobs/4.ParetoUseCase/ ProxyDDCForVAJobCASIntegration.html ⁽¹⁾
8. Make changes to the host name (search for your.host.name and replace it accordingly) and path of src on lines 20, 21, and 22 ⁽¹⁾:
  
  Picture 11- Modify src appropriately
9. Save the job form
Logged into SAS Visual Analytics, do the following:
1. Create a new empty report
2. With the report opened, hit Ctrl+Alt+B to bring the SAS Visual Analytics Diagnostics window.
3. With the BIRD tab selected across the top (that’s the default), replace the BIRD XML content with the content of the file github/samples/IntegrationWithSASJobs/4.ParetoUseCase/VA-DDC-Job Pareto use case.xml
4. Hit Load (this will close the diagnostics window)
5. With the DDC object at the top right of the report selected, go to the Options pane on the right and fix the URL entry: replace your.host.name accordingly and make sure the path to the job is correct (same value entered in #3a)
6. Make changes to the values assigned to the parameter _job_name if necessary, so it matches with your environment (same value entered in #3a)
7. Save the report

Note: (1) Starting with release 2023.06, the examples that used to inherit jQuery from their parent (the SAS Visual Analytics web application) no longer work, so we have provided replacement codes. These replacement files have the same names as their original, but they end with .v4. Because of that, some references to line numbers may not exactly match on those .v4 files.

We could have exported all the content as a package, but this would require special privileges in order to import it. Sharing the example as standalone files will give you the opportunity to better explore the SAS Job Execution Web application, familiarize yourself with the content, and understand how they are connected.

Note: The first time a user opens this report it fails because the CASUSER table doesn’t exist yet. Reopening the report will work, as well as any subsequent access to the report. This will happen to any other report that depends on a table that is being dynamically loaded in memory when the report opens.

Next Steps

In the next article you will be exploring another use case: detect and remove outliers from time series to prevent Y axis scale to be too high and compromise the clarity of the information displayed in the chart.

References

SAS Job Execution Web Application 2.2: User’s Guide documentation
JSON Libname Engine documentation
Reading data with the SAS JSON libname engine documentation
Data-Driven Content: leveraging third-party visualizations in SAS Visual Analytics (part 1 of 2) by SAS’s Renato Luppi
Data-Driven Content: leveraging third-party visualizations in SAS Visual Analytics (part 2 of 2) by SAS’s Renato Luppi
Using parameters with Data-Driven Content in SAS Visual Analytics by SAS’s Renato Luppi
sas-visualanalytics-thirdpartyvisualizations GitHub project and examples

Learn More…

acordes · ‎10-31-2020

@Renato_sas that's absolutely fantastic!

I followed your well described steps and it works.

If I hadn't dedicated so much time to VA and tried so desperately to solve the task at hand with job execution solely, it would have been somehow intimidating.

I'm super happy. Now I can apply it to my needs.

Once again thank you.

Renato_sas · ‎11-02-2020

Hi @acordes,

You are very welcome! I'm glad to know that you've found it helpful.

Pareto Example of Integration of SAS Visual Analytics with SAS Jobs via Data-Driven Content-Part 4

Register Today!

Free course: Data Literacy Essentials

Get Started