Hello all,
I have an original dataset with 5 variables (alcohol_br1-alcohol_br5). I transposed the responses from all 5 variables into one column (COL1) in a new dataset. I ran PROC FREQ on this new column to see what values it contains, and I noticed that the frequency output sometimes lists the same value multiple times, each time with a different frequency. I then ran a cross tab of the COL1 values by alcohol_br1-alcohol_br5; the output below is an example of what I get in SAS. Is there a reason why SAS would list the same value twice but with different frequencies? This occurs multiple times, and I can't find any difference in the actual values in my dataset.
 | Molson | Bombay | Stella artois | Molson |
alcohol_br1 | 2 | 7 | 65 | 599 |
alcohol_br2 | 1 | 0 | 23 | 4 |
alcohol_br3 | 0 | 0 | 90 | 0 |
alcohol_br4 | 0 | 3 | 2 | 2 |
alcohol_br5 | 0 | 6 | 0 | 1 |
You don't show anything resembling "COL1", so I'm not sure what the table above represents in terms of your question.
If the "value" you mention appears multiple times, it is possibly a character variable with leading blanks. The PROC FREQ output layout reformats such values so that they appear the same. See below for an example:
data example;
   length value $ 10;
   value = 'abc';
   do i = 1 to 5;
      output;
      value = cat(' ', value);   /* prepend one more leading blank */
   end;
   output;
run;

proc freq data=example;
   tables value;
run;
If you look at the data set Example created above, you will see that the variable VALUE has a different number of leading spaces in front of 'abc' in each observation, so PROC FREQ counts them as different values. However, the output table generator left-justifies the values, so they all look the same. PROC PRINT will do the same thing. To see the difference you need another procedure, such as PROC REPORT, where the style override ASIS stops the leading spaces from being removed.
proc report data=example;
   column value;
   define value / display style=[asis=yes];
run;
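If leading blanks do turn out to be the cause, something along these lines may help confirm it and clean it up. This is only a sketch: BRANDS is a made-up stand-in for the transposed data set, and N_LEADING and COL1_CLEAN are hypothetical variable names; only COL1 comes from the question. The $HEX format is included because it would also expose characters that are not ordinary blanks, although nothing in the thread confirms those are present.

```sas
/* Hypothetical stand-in for the transposed data set from the question */
data brands;
   length col1 $ 20;
   col1 = 'Molson';  output;
   col1 = ' Molson'; output;   /* same text with one leading blank */
   col1 = 'Bombay';  output;
run;

data brands_checked;
   set brands;
   /* LEFT() moves leading blanks to the end of the value, and LENGTHN()
      ignores trailing blanks, so the difference is the number of
      leading blanks in COL1 */
   n_leading  = lengthn(col1) - lengthn(left(col1));
   /* Left-aligned copy to use for the real frequency counts */
   col1_clean = left(col1);
run;

/* How many observations carry leading blanks? */
proc freq data=brands_checked;
   tables n_leading;
run;

/* Raw bytes of each value: 20 is the hex code of an ordinary blank */
proc freq data=brands_checked;
   tables col1;
   format col1 $hex40.;
run;

/* Frequencies after left-aligning the values */
proc freq data=brands_checked;
   tables col1_clean;
run;
```

In the last table the two 'Molson' rows should collapse into one. If they still appear separately in real data, the hex listing is the place to look for what differs.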
We need to see your original data (instructions for providing data; do not provide data as a file attachment or as a screen capture), and your code.
Please explain in more detail:
"...and I noticed that the frequency output will sometimes give me the same value multiple times but with different frequencies."
Or just look at the plain old text output, which keeps the leading spaces, instead of the "pretty" ODS output:

The FREQ Procedure

                                      Cumulative   Cumulative
value         Frequency     Percent     Frequency      Percent
-------------------------------------------------------------
     abc             1       16.67             1        16.67
    abc              1       16.67             2        33.33
   abc               1       16.67             3        50.00
  abc                1       16.67             4        66.67
 abc                 1       16.67             5        83.33
abc                  1       16.67             6       100.00
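For anyone unsure how to get that text output, one possible way is to open the LISTING destination around the procedure. This sketch assumes the EXAMPLE data set from the earlier reply has already been created; where the listing appears depends on the SAS interface being used.

```sas
ods listing;                /* open the plain-text LISTING destination */

proc freq data=example;     /* EXAMPLE is the data set built earlier in the thread */
   tables value;
run;

ods listing close;          /* close it again when done */
```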