I am having trouble making a frequency table with multiple variables. My goal is to have a basic frequency table to make a stacked bar chart of nominees and winners by year.
Here is the R code I want to emulate in SAS.
library(tidyverse)
nominees <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-21/nominees.csv')
nominees %>% count(year, type)
This is what the output looks like. Winner and nominee counts by year.
# A tibble: 122 x 3
    year type        n
   <dbl> <chr>   <int>
 1  1957 Nominee    16
 2  1957 Winner      4
 3  1958 Nominee    32
 4  1958 Winner      8
 5  1959 Nominee    16
 6  1959 Winner      4
 7  1961 Nominee     7
 8  1961 Winner      3
 9  1962 Nominee     8
10  1962 Winner      2
Here is what I have in SAS so far:
* Get data;
filename test1234 temp;
proc http
url="https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-21/nominees.csv"
method="GET"
out=test1234;
run;
proc import out=nominees datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
* Count;
proc freq data=nominees;
tables year * type;
run;
This mostly accomplishes what I want, but there is way too much in the output. Here is a picture of the output with areas I have no need for crossed out.
How can I get the SAS output to be much more minimal than it currently is? Or is SAS able to make a stacked bar chart by year with nominee and winner from this frequency table?
Of course you can control the output and you can also just get the summary table.
To look for options, go to the PROC FREQ documentation, find the TABLES statement, and look at the options available.
* Count;
proc freq data=nominees;
tables year * type / nopercent norow nocol;
run;
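If the goal is printed output shaped like the R tibble (one row per year and type), the LIST option may get closer than the crosstab grid. A minimal sketch, assuming the same nominees data set as above:

```sas
* One row per year and type combination instead of a two-way grid;
proc freq data=nominees;
  tables year * type / list nocum nopercent;
run;
```

LIST switches the two-way table to a list layout, and NOCUM and NOPERCENT drop the cumulative and percent columns so that only the frequency remains.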
To send the counts to a data set instead, which is the closer equivalent of this R code:
summary <- nominees %>% count(year, type)
proc freq data=nominees;
tables year * type / out=summary;
run;
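On the second half of the question, the summary data set can itself feed a stacked bar chart. This is a sketch that relies on COUNT being the default frequency variable that OUT= writes; NOPRINT and the DROP= data set option are additions here, used only to suppress the printed table and the PERCENT column:

```sas
* Summarize without printing, keeping only the counts;
proc freq data=nominees noprint;
  tables year * type / out=summary(drop=percent);
run;

* Stack the pre-computed counts by type within each year;
proc sgplot data=summary;
  vbar year / response=count group=type groupdisplay=stack;
run;
```

Year and type combinations with zero observations are not written to the output data set unless the SPARSE option is added to the TABLES statement.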
But you can also graph it directly, no need to pre-summarize.
filename test1234 url "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-09-21/nominees.csv";
proc import out=nominees datafile=test1234 dbms=csv replace;
guessingrows = max;
run;
proc freq data=nominees;
tables year * type;
run;
ods html5 file = '/home/fkhurshed/graph1.html' style=meadow gpath='/home/fkhurshed/';
proc sgplot data=nominees;
vbar year / group = type stat=freq groupdisplay=stack;
run;
proc sgplot data=nominees;
vbar year / group = type stat=percent groupdisplay=stack;
run;
ods html5 close;
FYI - you can read directly from the URL if desired.
I don't know how to set ODS theme and gpath outside of the ODS statement to the main output in SAS Studio.....so I piped it to a different file. The default graphs are kinda ugly as heck though.
You may have to provide examples, or at least a better description of what to keep.
PROC FREQ is for simple, more exploratory summaries.
PROC REPORT and PROC TABULATE provide much more control over what is displayed and, to some extent, where. Both have different limitations on how crosses of columns are displayed. For instance, it appears that you want the column with Type = Nominee to show only counts for some years but include the percentages for other years?
If your desire is not to show the rows for year = 6 and 8, then you could filter the data with a WHERE statement like:
Where year not in (6 8);
(or possibly clean the data of possibly bad year values).
If you only want counts with freq:
proc freq data=nominees;
where year not in (6 8);
tables year * type / norow nocol nopercent nocum;
run;
Otherwise describe what you want more clearly. I am afraid that crossing out a few items doesn't clearly convey what is wanted.
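For comparison, here is a minimal PROC TABULATE sketch that shows counts only. It assumes year and type are the variable names in the imported data, and the WHERE statement is optional, carried over from the filter suggested above:

```sas
* Years down the rows, type across the columns, N in each cell;
proc tabulate data=nominees;
  where year not in (6 8);
  class year type;
  table year, type * n;
run;
```

Both variables go on the CLASS statement, and the N statistic in the TABLE statement requests a plain frequency for each cell.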