Dear all,
I need to plot a histogram for a sample dataset like this using sgplot (rel. frequencies per year for each group):
Year of Diagnosis | Group | Number of Cases | Relative Frequencies per Year |
2015 | A | 10 | |
2015 | B | 20 | |
2016 | A | 15 | |
2016 | B | 5 | |
2017 | A | 16 | |
2017 | B | 30 | |
2017 | C | 50 | |
2018 | B | 13 | |
2018 | C | 5 | |
My first question: can I use proc sql or proc means to calculate the relative frequencies per year? If so, how?
If I use proc freq, it calculates the percentages over all years, but I need the relative frequency of each group within each year. For example, for 2015 the relative frequency should be 10/30*100 for group A and 20/30*100 for group B.
The sample plot will look like this; in my case the groups will be A, B and C.
I am not quite sure what you are asking, because the relative frequency as you define it is just each group's percentage of the year's total, so the stacks will always add up to 100%. Anyway, here is my attempt:
data have;
infile datalines;
input year $ group $ nbr_of_cases;
datalines;
2015 A 10
2015 B 20
2016 A 15
2016 B 5
2017 A 16
2017 B 30
2017 C 50
2018 B 13
2018 C 5
;
run;
/* Total number of cases per year; the input must be HAVE, since WANT does not exist yet */
proc means data=have noprint nway;
class year;
var nbr_of_cases;
output out=have_sum (drop=_TYPE_ _FREQ_) sum= / autoname;
run;
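data want;
/* have and have_sum are both in year order, so they can be merged by year */
merge have have_sum;
by year;
format rel_freq 8.2;
/* percentage of the year's total, matching the 10/30*100 definition in the question */
rel_freq = 100 * divide(nbr_of_cases,nbr_of_cases_sum);
drop nbr_of_cases nbr_of_cases_sum;
run;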
proc sgplot data=want;
vbar year / response=rel_freq group=group;
run;
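Since the question also asked about proc sql: here is a minimal sketch of the same calculation in one step, assuming the dataset `have` created above. The output name `want_sql` and the table alias `h` are my own choices for illustration. Selecting detail columns alongside `sum()` with a `group by` makes proc sql remerge the yearly total onto every row, which it reports as a note in the log.

```sas
/* Relative frequency (in percent) of each group within its year */
proc sql;
  create table want_sql as
  select h.year,
         h.group,
         /* sum() is evaluated per year because of the GROUP BY below */
         100 * h.nbr_of_cases / sum(h.nbr_of_cases) as rel_freq format=8.2
  from have as h
  group by h.year
  order by h.year, h.group;
quit;

/* Same stacked bar chart as before, now from the proc sql result */
proc sgplot data=want_sql;
  vbar year / response=rel_freq group=group;
run;
```

For 2015 this should give 10/30*100 = 33.33 for group A and 20/30*100 = 66.67 for group B, as described in the question.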