Hey,
I am analysing a questionnaire with multiple choice questions.
My data looks this way:
Var1 Var2 Var3 Var4
1 0 0 1
1 0 1 0
0 1 1 1
….
Each row is one respondent (proband), and column X is 1 if the respondent ticked field X in the question and 0 otherwise.
I want to plot a histogram/bar plot showing the absolute counts, i.e. how often each field was ticked.
So in the above example I would like to get a histogram like:
2 1 2 2
so simply the column sums plotted as a histogram.
How can I do this?
You can use PROC MEANS to sum the columns. Then transpose the sums from wide form to long form. (See "Reshaping data from wide to long format")
You can then create a bar chart of the result, as follows:
data Have;
input Var1 Var2 Var3 Var4;
datalines;
1 0 0 1
1 0 1 0
0 1 1 1
0 0 0 1
;
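/* Sum each column; the output data set Sum has a single row */
proc means data=Have;
   output out=Sum(drop=_FREQ_) sum=;
run;
/* https://blogs.sas.com/content/iml/2011/01/31/reshaping-data-from-wide-to-long-format.html */
/* Transpose the sums to long form: one row per variable */
proc transpose data=Sum out=Want(rename=(Col1=Count));
   by _TYPE_;
run;
/* Bar chart of the count for each variable */
proc sgplot data=Want;
   vbar _Name_ / response=Count;
run;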
A small additional question:
If my data now looks this way:
data Have;
input Groupvar Var1 Var2 Var3 Var4;
datalines;
1 1 0 0 1
1 1 0 1 0
2 0 1 1 1
2 0 0 0 1
2 0 0 1 1
3 1 0 0 1
;
so that I have an additional group variable. How would I implement this in your suggested code?
I was thinking about a GROUP or BY statement, but I can't figure out how to do this.
Thanks in advance 🙂
In the code I provided, no changes are needed. However, in PROC SGPLOT you would need
by groupvar;
to get separate bar charts for each level of GROUPVAR.
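If you would rather stay with the PROC MEANS and PROC TRANSPOSE approach, the sums have to be computed per group first. Here is a rough sketch, assuming the Have data set with Groupvar from the question above. The CLASS statement, the NWAY option, and the GROUP= option are my assumptions about what you want, not the only way to do it:

/* Sum each indicator column within each level of Groupvar.
   NWAY keeps only the per-group rows (no overall total). */
proc means data=Have noprint nway;
   class Groupvar;
   var Var1-Var4;
   output out=Sum(drop=_TYPE_ _FREQ_) sum=;
run;
/* Wide to long: one row per combination of Groupvar and variable */
proc transpose data=Sum out=Want(rename=(Col1=Count));
   by Groupvar;
   var Var1-Var4;
run;
/* All groups on one chart, with side-by-side bars per variable */
proc sgplot data=Want;
   vbar _Name_ / response=Count group=Groupvar groupdisplay=cluster;
run;

If you want one chart per group instead, drop the GROUP= and GROUPDISPLAY= options and add a BY Groupvar statement to PROC SGPLOT.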
What kind of graph do you want now? Do you want all groups on the same graph, or three separate graphs, one for each group?
First, I can't see a histogram working with this data, but a vertical bar chart would work.
I think that to get the var1 var2 var3 etc. on the bottom of a plot, you need to reorganize your data into the long format.
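data re_organize;
   set have;
   array v var1-var4;          /* the indicator columns in HAVE */
   do i=1 to dim(v);
      variable=vname(v(i));    /* name of the original column */
      value=v(i);              /* its 0/1 response */
      output;
   end;
   drop var1-var4 i;
run;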
And then you can get the bar chart you want from PROC SGPLOT
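As a minimal sketch of that last step, assuming re_organize holds the original column name in VARIABLE and the 0/1 answer in VALUE, something like this should give the counts as bar heights (the axis label is just my choice):

/* Each bar is the sum of the 0/1 values, i.e. how often the field was ticked */
proc sgplot data=re_organize;
   vbar variable / response=value stat=sum;
   yaxis label="Count";
run;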