We’ll use fictitious data (attached below), SAS Viya & SAS Studio Long Term Support 2020.1 to quickly determine if/how purchasing behaviors differ by gender, geography, item category, and time.
These instructions assume you’re already comfortable navigating SAS Studio, and that SAS Viya and SAS Studio are already installed and configured. Since SAS Studio on SAS version 9.4 and on SAS Viya 3.5 is similar, you can use these same instructions as guidelines in those environments too, although the exact number of clicks and screens will differ a bit. Your exact click count will also depend on how and where you store the fictitious data. Note that you’ll likely need to horizontally squeeze and stretch the panes to emphasize the new results pane on the far right. Stretching, scrolling, and squeezing panes don’t count towards our click count because they’re optional and only improve legibility.
First we’ll save, upload and load our fictitious data into CAS. These preparation steps do not count towards our bid-a-click total of 20 or fewer clicks because your keystrokes will depend upon where/how you store and upload the fictitious data.
Log on to SAS Viya & SAS Studio Long Term Support 2020.1.
/* START A CAS SESSION */
cas mySession sessopts=(caslib=casuser timeout=1800 locale="en_US");

/* SET UP LIBRARY REFERENCES */
caslib _all_ list;
caslib _all_ assign;
libname mysas "directory where you have read/write permissions";
libname demodata cas caslib=casuser;

/* LOAD CUSTOMERS, TRANSACTIONS, PRODUCTS DATASETS FROM HOME PATH LIBREF TO CAS */
proc casutil;
   load data=mysas.all_custs_trans_prods outcaslib='casuser' casout="all_custs_trans_prods" replace;
quit;
Review ALL_CUSTS_TRANS_PRODS by scrolling horizontally and vertically through its observations (152) and variables (22). Note its variety of variables, named in easily understood layman’s terms.
These variables are well-suited to meet our goal of quickly determining if/how purchasing behaviors differ by gender, geography, item category, and time.
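If you prefer code to scrolling, the same review can be sketched in a few lines. This is only an illustration, and it assumes the DEMODATA libref and the ALL_CUSTS_TRANS_PRODS table from the setup code above are available in your session:

```sas
/* List the variables (name, type, length) in creation order */
proc contents data=demodata.all_custs_trans_prods varnum;
run;

/* Print the first 10 observations for a quick look at the values */
proc print data=demodata.all_custs_trans_prods(obs=10);
run;
```

The observation and variable counts in the PROC CONTENTS output should match what you see in the table viewer.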
Begin counting clicks.
Click on Tasks in the left margin:
Click on the 4th icon in the left margin -> Prepare Data -> Examine Data -> Characterize Data (double-click). That makes 5 clicks.
Click on the Select a table icon (highlighted below) and navigate to the ALL_CUSTS_TRANS_PRODS dataset in CASUSER or DEMODATA (which may cause an error). If this is the last dataset you used, you don’t need another click. This step takes 0 to several clicks, depending on where you’ve stored the demo data.
Click on the plus sign in the upper right corner alongside Automatic Characterization (highlighted below).
CTRL-left-click to choose the check boxes along the left side of the quantity and UnitCost variables (highlighted below). You may need to scroll down to see and choose UnitCost. Click OK. You’ve now conducted 9-10 clicks.
You may want to squeeze the middle pane to make it narrower and widen the right pane, where the code and log tabs have now been refreshed.
Notice that PROC MEANS and PROC UNIVARIATE code has been generated from your specifications in the middle pane. You’ve conducted 9-10 clicks and the report’s specifications are nearly complete. Not only is this report created in 20 or fewer clicks, it also generates a complete SAS program, as seen in the right pane.
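To give a feel for what this kind of program does, here is a minimal hand-written sketch. It is not the code the Characterize Data task generates, which will differ in options and layout; it only assumes the DEMODATA libref from the setup code and the two variables chosen above:

```sas
/* Summary statistics for the two numeric variables chosen in the task */
proc means data=demodata.all_custs_trans_prods n nmiss min mean median max std;
   var quantity UnitCost;
run;

/* Distribution analysis: suppress the tables, keep one histogram per variable */
proc univariate data=demodata.all_custs_trans_prods noprint;
   var quantity UnitCost;
   histogram quantity UnitCost;
run;
```

Compare this with the code in the right pane to see which statistics and plots the task requests on your behalf.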
Scroll down in the middle pane – you may want to stretch it to the left to become wider.
Click the > alongside Custom Characterization. Stretching and scrolling don't count towards our clicks since they will vary by your preferences to improve legibility.
Click on the plus sign in the upper right corner alongside Categorical Variables (highlighted below).
CTRL-left-click to choose the check boxes along the left side of Gender and CategoryName (highlighted below). You may need to scroll down to see and choose CategoryName. Click OK. You have likely conducted 14-15 clicks by now.
Scroll down in the right pane – you may want to stretch it to the left to become wider so you can more easily review its generated code.
Notice that PROC FREQ has been added to our PROC MEANS and PROC UNIVARIATE code with just a handful of additional clicks in the middle pane. We’ve conducted 14-15 clicks and nearly completed our report’s specifications.
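As a rough illustration of the frequency step (again a hand-written sketch, not the task's generated code), a one-way frequency table with a bar chart for each categorical variable could look like this, assuming the DEMODATA libref from the setup code:

```sas
/* One-way frequencies for the two categorical variables, most frequent value first */
proc freq data=demodata.all_custs_trans_prods order=freq;
   tables Gender CategoryName / nocum plots=freqplot;
run;
```

The ORDER=FREQ option sorts each table by descending count, which makes the most common gender and category easy to spot.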
Scroll down in the middle pane until you see Date variables (highlighted).
Let’s also gather the minimum and maximum dates so we can confirm that the scope of this input data will meet our needs. Choose the purchase date by clicking on the plus sign in the upper right corner alongside Date Variables, then click the check box along the left side of date (highlighted). Click OK. Your click count is 18-19 now.
If you decide not to specify date, you can create this report in 14-15 clicks.
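For reference, the date range check can also be sketched by hand. This example assumes the purchase date variable is named date and holds SAS date values (not datetimes); adjust the name and format if your copy of the data differs:

```sas
/* Earliest and latest purchase dates, displayed as readable dates */
proc sql;
   select min(date) as first_date format=date9.,
          max(date) as last_date  format=date9.
   from demodata.all_custs_trans_prods;
quit;
```

PROC SQL is used here because it lets the DATE9. format be applied directly to the minimum and maximum values.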
Stretch and scroll down slowly in the right pane.
Notice how PROC FREQ, PROC MEANS, and PROC UNIVARIATE now form a complete program that gathers descriptive statistics, all from 18-19 clicks of specifications in the middle pane. Now you’re ready for our final click, Run, which executes your code and delivers your output report.
You’ll likely need to horizontally squeeze the panes to emphasize the new results pane on the far right. Scroll down to show the characterization’s intelligence: distribution analysis and visualizations of quantity, unit cost, gender, and product/item category, plus the minimum and maximum dates.
What have we accomplished with our 20 (or fewer) clicks?
We’ve quickly produced insight by gathering and visualizing the distinct values for some of our key variables: gender, product categories, quantity, and unit cost, plus the minimum and maximum dates.
Characterization is a great way to begin analyzing your data because it reveals how variables’ distinct values are distributed. These distributions let us see whether the data makes sense and whether it is valuable or viable for further reporting, modeling, and analytics tasks. It can also confirm, deny, or reveal facts such as which gender has conducted more transactions/purchases (the gents here) and fewer transactions/purchases (the ladies here), which categories of items have more and fewer purchases (beverages represent 50% over the other 2 categories here), the range of dates in our data (November 1 – December 22, 2020 – is that narrow timeframe expected?), and how much of any item is purchased (1 is the most popular value here, with $5 items being the most often purchased and $25 items being the least often purchased).
Now that you’ve experienced how quickly and easily you can produce one report and intelligence in SAS Studio, I hope you’re inspired to try more tasks in your SAS environment.
You can log out or save your work, whatever makes sense for you.
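If you want to release the CAS session before logging out, one option is the statement below. It assumes the session is still named mySession, as in the setup code above:

```sas
/* End the CAS session started during setup */
cas mySession terminate;
```

Session-scope tables are dropped when the session ends, so you may need to reload the data next time.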
Fun facts: Bid-a-Note has typically been the final head-to-head round of the Name That Tune show, except from 1978 to 1981 and during the 1984-85 tournaments, when it was the next-to-last/penultimate round. In the 2021 episodes, bid-a-note is the first round; 10 is the maximum number of notes that can be bid.