11-10-2015 12:14 PM - edited 11-10-2015 12:22 PM
I want to compare overall data with a subset.
I was able to do this for continuous data by overlaying histograms, thanks to these great answers.
Can the same be done for nominal and ordinal data, in a grouped pie chart or bar chart?
11-10-2015 12:56 PM
If your data is grouped, and you want to compare the values side by side by Category, you can use the Proc SGPLOT VBAR statement or the Proc GCHART VBAR statement.
proc sgplot data=sashelp.cars;
  /* one cluster of bars per origin, one bar per vehicle type, showing mean city MPG */
  vbar origin / response=mpg_city group=type groupdisplay=cluster stat=mean;
run;
11-10-2015 03:38 PM
Thanks for answering.
This is not exactly what I'm looking for. I want to compare the distribution of a subgroup with the overall distribution.
Let's take an example.
Assuming I have different clusters of customers, I want to see what the characteristics of each one are.
This can be done by comparing the distribution of each cluster with the overall distribution.
For example, for cluster1, I compare the proportion of each country within the cluster to the proportion of each country in the whole dataset.
A single pie chart that groups both sets of proportions can be easier to compare visually than two side-by-side pie charts.
I hope my example is clear.
11-10-2015 04:26 PM
11-10-2015 04:35 PM
Regardless of whether you use SGPLOT or GTL/SGRENDER, the data manipulation is the same:
1. Create a dataset with the subset data extracted from the original data, keeping only the columns necessary for the chart.
2. Rename the kept columns in the new dataset to something different from the original data column names.
3. Create a new dataset that is a "straight" merge (not "match-merge") of the subset data and the original data.
4. Use this merged dataset in either SGPLOT or SGRENDER.
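The four steps above can be sketched as follows. This is only an illustration under assumptions that are not from the thread: it uses sashelp.cars, treats origin = 'Asia' as the subset, and the names subset, merged, type_sub and overlay_bars are made up. It uses GTL with SGRENDER for the last step because, as far as I know, SGPLOT expects overlaid summarized bar charts to share the same category variable.

```sas
/* Steps 1 and 2: extract the subset, keep only the chart column, rename it.
   The filter origin = 'Asia' and the name type_sub are illustrative assumptions. */
data subset;
  set sashelp.cars(keep=origin type);
  where origin = 'Asia';
  rename type = type_sub;
  drop origin;
run;

/* Step 3: "straight" (one-to-one) merge, with no BY statement.
   type_sub is missing on the rows beyond the end of the subset. */
data merged;
  merge sashelp.cars(keep=type) subset;
run;

/* Step 4: overlay the two bar charts, each showing percentages of its own column */
proc template;
  define statgraph overlay_bars;
    begingraph;
      layout overlay / cycleattrs=true
                       xaxisopts=(label='Vehicle type')
                       yaxisopts=(label='Percent');
        barchart x=type / stat=pct name='all' legendlabel='All origins'
                          datatransparency=0.4;
        barchart x=type_sub / stat=pct name='sub' legendlabel='Asia only'
                              barwidth=0.5;
        discretelegend 'all' 'sub';
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=merged template=overlay_bars;
run;
```

The narrower bars for the subset sit inside the wider, semi-transparent bars for the full data, so each category can be compared directly.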
Hope this helps!
11-10-2015 05:15 PM
One benefit of GTL is that you can use the EVAL feature with the IFC and IFN functions (among other ways) to subset the data on the fly inside the template.
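A minimal sketch of that idea, assuming sashelp.cars and origin = 'Asia' as the subset (the template name eval_subset and the labels are hypothetical). The second bar chart gets its category from an EVAL expression that returns a blank for rows outside the subset, and blank categories should be dropped as missing by default:

```sas
proc template;
  define statgraph eval_subset;
    begingraph;
      layout overlay / cycleattrs=true;
        /* distribution of vehicle type over all rows */
        barchart x=type / stat=pct name='all' legendlabel='All origins'
                          datatransparency=0.4;
        /* same column, but blank (missing) unless the row belongs to the subset */
        barchart x=eval(ifc(origin = 'Asia', type, ' ')) /
                 stat=pct name='sub' legendlabel='Asia only' barwidth=0.5;
        discretelegend 'all' 'sub';
      endlayout;
    endgraph;
  end;
run;

/* No data preparation step: the original table is rendered directly */
proc sgrender data=sashelp.cars template=eval_subset;
run;
```

Compared with the merge approach, this needs no intermediate dataset, at the cost of putting the filter condition inside the template.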
11-17-2015 05:30 AM
In the end I used a butterfly plot to compare nominal variables. I used the EVAL function in GTL to keep only the values for a specific level of a filter variable, and then compared them with the overall values.
The butterfly plot turned out to be clearer and easier to compare than a pie chart.
I created a macro to automate all this so that it can be reusable.
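The macro is not shown in this thread. For readers who want a starting point, here is a rough sketch of one way to build a butterfly plot. Note that it does not use the GTL EVAL approach described above: it precomputes the percentages with PROC FREQ and draws them with SGPLOT. The data (sashelp.cars), the subset (origin = 'Asia') and all dataset, variable and format names are assumptions for illustration.

```sas
/* Overall distribution of vehicle type */
proc freq data=sashelp.cars noprint;
  tables type / out=pct_all(keep=type percent rename=(percent=pct_all));
run;

/* Distribution within the subset (illustrative filter) */
proc freq data=sashelp.cars noprint;
  where origin = 'Asia';
  tables type / out=pct_sub(keep=type percent rename=(percent=pct_sub));
run;

/* Both outputs are ordered by type, so they can be match-merged */
data butterfly;
  merge pct_all pct_sub;
  by type;
  if missing(pct_sub) then pct_sub = 0;  /* category absent from the subset */
  pct_all_neg = -pct_all;                /* overall bars extend to the left of zero */
run;

/* Picture format that prints negative values without their sign */
proc format;
  picture abspct (round)
    low -< 0 = '009.9'
    0 - high = '009.9';
run;

proc sgplot data=butterfly;
  format pct_all_neg pct_sub abspct.;
  hbarparm category=type response=pct_all_neg / legendlabel='All origins';
  hbarparm category=type response=pct_sub     / legendlabel='Asia only';
  xaxis label='Percent of cars';
  yaxis label='Vehicle type';
run;
```

Wrapping these steps in a macro with the dataset, the category variable and the filter condition as parameters would make them reusable, as described above.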