In this article, you are going to apply what you have learned about the integration between SAS Visual Analytics (VA), Data-Driven Content (DDC) objects, and SAS jobs to solve a business use case.
It’s very straight forward to create a compounded chart like the one in the screenshot below, showing bars for each country GDP and a line for the cumulative GDP as percentage, but the challenging part is to be able to filter that chart to only display the countries that are responsible for a certain percentage of the total, let’s say 80%, to create a Pareto chart.
GTP (cumulative % of total) is an aggregated measure that must be computed from the highest to the lowest GDP, which is easily done by sorting the chart by GDP. An attempt to show only countries where the GTP (cumulative % of total) is less or equal to a certain value (e.g. list of countries that are responsible for 80% of the total GDP) will produce the wrong results if you try to use ranks, or it will not be possible if you try to use filters. Filtering on aggregated measures is not supported in this case and ranking would return a percentage of the total number of countries based on their GTP, not the countries responsible for a percentage of the total GDP. Using "New data from aggregation" to create an aggregated data source will not work either because of the sequential nature of the cumulative calculation, and while that sequence can be easily defined in a chart, it doesn’t exist for aggregated tables computed in VA. The solution is to pass the data to a job that will create a new table on the fly with the percentages accumulated in the right order. That table can then be filtered because the cumulative percentage is no longer an aggregated measure calculated in VA.
Pareto with Ranking Example
This example uses data downloaded from The World Bank. The table contains GDP information for countries from years 1960 to 2019, and it was exported as an Excel spreadsheet.
In SAS Data Explorer (menu option Manage Data), make sure you select the correct cells from the spreadsheet while importing it, as in the next screenshot below, as well as the other values highlighted in yellow. I suggest opening the Excel file to confirm that those values remain the same.
By the time the data used in the report was downloaded, year 2019 didn’t have any value, so this column was not used. Also, that original table was transposed, so that the final table had years as rows instead of columns:
The table can be transposed in SAS Data Studio (menu option Prepare Data). Just use the Transpose transformation with the following column assignment:
Don’t forget to rename the _NAME_ column to Year:
The final transposed table was saved in the Public library and called GDP_COUNTRY_YEAR_TRANSPOSED. This is the source data for your report.
Data-Driven Content Code
As explained in the previous article, enhancements were made to the HelloCASWorld example to make the DDC code reusable with any job that comply with the interface defined:
This means the DDC created in the previous article can be used without any modification in this example.
Because this DDC is to be used with many different jobs, in the deployment section below you will see that you are going to be saving this code as a separated job form with a generic name: ProxyDDCForVAJobCASIntegration, indicating that this DDC is to be used as a proxy to connect VA with jobs, and it’s not attached to a particular job.
SAS Job Code
Starting with the template job code from the previous article example, you will only need to add the highlighted code in the “Main Processing” block:
The PROC CONTENTS and PROC SQL are used to acquire the categorical and numeric column names necessary for PROC FREQ, which is what calculates the cumulative values, percentages, and cumulative percentages. The data step after the PROC FREQ is to adjust the percentages as numbers between 0-1, so that a VA percentage format can be applied.
Notice that the job is not applying business names to the columns generated by PROC FREQ and saved in memory in the output table. It’s your responsibility, as the report designer, to do so.
SAS Visual Analytics Report
The report in this example contains a drop-down object to select the year. This drop-down object filters the DDC object, which is the blank space on the top right. Both objects use data coming from the main source table called GDP_COUTRY_YEAR_TRANSPOSED. The role assignment in the DDC object are Country Name and GDP (current US$). The bar-line chart and the list table below it are fed by the table generated by the job, named PARETO_RANK. The slider object has a single value numeric VA parameter formatted as percentage, with min and max values set to 0 and 1 respectively, which selects the percentage to be used as the filter. The parameter is used in an advanced filter to subset the PARETO_RANK data in both the bar-line chart and the list table. Notice that only when the year drop-down object is changed, the DDC and the job are re-executed. The slider object filters the job’s output table PARETO_RANK without having to re-create the table.
When designing a report like that from the scratch you must remember of a few things discussed in the previous article, such as:
a. Save the VA report
b. Hit Ctrl+Alt+B to open the SAS Visual Analytics Diagnostics window
c. Make sure the BIRD tab across the top and the XML button are selected (these are the defaults)
d. Click on the Data icon on the left
e. Scroll down searching for the reference to the CASUSER library
f. Remove (<userid>) that is appended to the CASUSER library
g. Click the Load button on the top left
h. Save the VA report
If you have the right privileges for exporting and importing content in SAS Environment Manager (menu option Manage Environment), you can replace all the steps above by exporting the report and them reimporting it. You just need to replace the target CASUSER(<userid>) library with just CASUSER in the GUI during the import process.
_job_output_cas_table: "PARETO_RANK"
_job_executing_message: "Calculating Pareto..."
_job_name: "/Public/Jobs/SAS Communities/Pareto”
In addition to those, after the job’s output table is added to the report for the very first time, you must give the data items a more business-oriented labels. As mentioned previously, the job used in the example doesn’t apply user-friendly labels to the columns calculated in PROC FREQ. Knowing what each column represents and labeling them appropriately is important. In this example, the output columns from PROC FREQ were labeled in the VA report as follows:
Deploying This Example
All files used in this example are available for downloading from the GitHub project sas-visualanalytics-thirdpartyvisualizations, under folder called samples/IntegrationWithSASJobs.
It requires the data downloaded from The World Bank, transformed and loaded into memory in the Public caslib as explained in the beginning of this article.
Deployment steps:
This tells the job to transfer execution to this form, whenever the _action parameter is set to FORM
Note: (1) Starting with release 2023.06, the examples that used to inherit jQuery from their parent (the SAS Visual Analytics web application) no longer work, so we have provided replacement codes. These replacement files have the same names as their original, but they end with .v4. Because of that, some references to line numbers may not exactly match on those .v4 files.
We could have exported all the content as a package, but this would require special privileges in order to import it. Sharing the example as standalone files will give you the opportunity to better explore the SAS Job Execution Web application, familiarize yourself with the content, and understand how they are connected.
Note: The first time a user opens this report it fails because the CASUSER table doesn’t exist yet. Reopening the report will work, as well as any subsequent access to the report. This will happen to any other report that depends on a table that is being dynamically loaded in memory when the report opens.
Next Steps
In the next article you will be exploring another use case: detect and remove outliers from time series to prevent Y axis scale to be too high and compromise the clarity of the information displayed in the chart.
References
Learn More…
@Renato_sas that's absolutely fantastic!
I followed your well described steps and it works.
If I hadn't dedicated so much time to VA and tried so desperately to solve the task at hand with job execution solely, it would have been somehow intimidating.
I'm super happy. Now I can apply it to my needs.
Once again thank you.
Hi @acordes,
You are very welcome! I'm glad to know that you've found it helpful.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.