BookmarkSubscribeRSS Feed
SASuser4321
Obsidian | Level 7

I have the following table with me:

Data1 Data2 Data3
1    
3 2  
6 3 2
4    
5  

 

....

The table has thousands of rows.

I would like to know how many times each number occurs in this table. So for example, for the above table, the output of my code should look like this:

 

Data_value Frequency
1 1
2 2
3 2
4 1
5 1
6 1

Help would be appreciated. Thank you.


13 REPLIES 13
PaigeMiller
Diamond | Level 26

Transpose the data 

 

data transposed;
    set have;
    array d data1-data3;
    do i=1 to 3; 
         if not missing(d(i)) then do;
             data_value=d(i);
             output;
         end;
     end;
    drop data1-data3 i;
run;


The run PROC FREQ on the TRANSPOSED data set.

--
Paige Miller
SASuser4321
Obsidian | Level 7

Thank you for your comment. I appreciate your effort and have upvoted your comment. How do I output the result into one single table? I don't want to see my output in the "results" section of SAS but rather in the "Output Data" section. And I would like the table to look like the one in the OP. The above code doesn't accomplish that task.

PaigeMiller
Diamond | Level 26

@SASuser4321 wrote:
Thank you for your comment. How do I output the result into one single table? I don't want to see my output in the "results" section of SAS but rather in the "Output Data" section.

I think you mean: "How do I output the result into one single data set?" Use the OUT= option in the TABLES statement of PROC FREQ

--
Paige Miller
SASuser4321
Obsidian | Level 7
yes, I tried that. But when I tried something like "proc freq data=transposed; tables out=data;run;", it didn't work.

It says that there is no variable called OUT and no variable called DATA. I would like my output data set to be called 'data'.
PaigeMiller
Diamond | Level 26

The syntax of the TABLES statement

 

TABLES variablename / out = datasetname;

--
Paige Miller
SASuser4321
Obsidian | Level 7
What should I enter for variablename? As shown in the table in the OP, I have three variables: Data1 Data2 Data3
PaigeMiller
Diamond | Level 26

@SASuser4321 wrote:
What should I enter for variablename? As shown in the table in the OP, I have three variables: Data1 Data2 Data3

I'll give you a hint ... you are running PROC FREQ on data set TRANSPOSED so you need to use a variable that is in data set TRANSPOSED

--
Paige Miller
SASuser4321
Obsidian | Level 7
Thank you. It works now. Could you explain what the following two lines in your code mean: if not missing(d(i)) then do;
data_value=d(i);

What is d(i)?
PaigeMiller
Diamond | Level 26

@SASuser4321 wrote:
Thank you. It works now. Could you explain what the following two lines in your code mean: if not missing(d(i)) then do;
data_value=d(i);

What is d(i)?

I used an array which then allows me to refer to data1 as d(1), data2 as d(2), data3 as d(3) and then run a loop where the index i varies from 1 to 3. Arrays are much more helpful if you have 50 or 100 variables, but they work with 3 variables as well.

--
Paige Miller
SASuser4321
Obsidian | Level 7

Hi, thank you for your comment but I don't understand the intuition behind the code. Why do we transpose? What does the i column in my output data mean? I know it means index but how is it relevant in the output table (picture attached with this comment). Why do we get the data_value column in this way and what does it mean? So in the picture with this comment, the i column is the fourth one. It says 1,2,3. How should I interpret this section of the table? I understand the index 3 to refer to the third column and the then the fourth column takes the value in the index. But the overall intuition behind the code is lost to me. I'm wondering if you can come up with a simpler code without do loops. I don't need there to be any consideration of missing values in the code since I'm not looking for any percentages but rather just the absolute number of occurrences. So the code could be simpler? Thank you.

dec7.PNG

PaigeMiller
Diamond | Level 26

The easiest way to count things is if they are all in one variable. So that's why we transpose.

 

Why do we get the data_value column in this way and what does it mean?

Because you asked for a variable named data_value in your original message.

 

The variable I should have been removed from the output data set. And, I don't understand what that table is that you show. Please never show us tables where the columns are not identified by variable name.

 

Did you run PROC FREQ on the transposed data as I said?

--
Paige Miller
ballardw
Super User

@SASuser4321 wrote:
yes, I tried that. But when I tried something like "proc freq data=transposed; tables out=data;run;", it didn't work.

It says that there is no variable called OUT and no variable called DATA. I would like my output data set to be called 'data'.

READ the LOG.

The "didn't work" will be explained by the ERROR message.

 

In general Doesn't work or similar is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the "</>" to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.

webart999ARM
Quartz | Level 8

This might be what you want

proc freq data=your_data_table;
   tables data1 data2 data3 / nocum;
run;

This will generate a frequency table that shows the number of times each value appears in each of the three columns (data1, data2, and data3).

If you want to combine the frequencies from all three columns into a single table, you can use the output out= option to write the results to a new dataset, and then use a data step to merge the results from the three columns into a single table. Here is an example of how you can do that:

 

/* Create a frequency table for each column */
proc freq data=your_data_table;
   tables data1 / nocum output out=freq1;
   tables data2 / nocum output out=freq2;
   tables data3 / nocum output out=freq3;
run;

/* Merge the frequency tables into a single table */
data all_data;
   merge freq1(in=a) freq2(in=b) freq3(in=c);
   by data_value;
   if a then freq=freq1;
   else if b then freq=freq2;
   else if c then freq=freq3;
run;

This will create a new dataset called all_data that contains a single table with the combined frequencies for all three columns. The data_value column will contain the unique values from all three columns, and the freq column will contain the total number of times each value appears in the table.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 3132 views
  • 4 likes
  • 4 in conversation