I'm a new SAS programmer, currently using PROC FREQ to calculate the count and percent for two variables in a dataset, and it works great. However, I need to display only the events where the percent is greater than 15. I assume I should use PROC SQL to create a table and then use that dataset to write if 'variable' lt .15 delete, or something of that nature. However, I am not very familiar with PROC SQL, or PROC FREQ for that matter, and maybe there is a way to do this in PROC FREQ, but I am not sure how. I would prefer to use PROC FREQ over PROC SQL because I understand it better, but that is just my preference. My current syntax reads:
proc freq data=table2;
table SOC CODE / nocum;
run;
Any ideas?
Maybe simpler though less complete:
proc freq data=table2 noprint;
tables SOC / nocum out=SOCFreq; /*<= creates output summary set*/
tables CODE / nocum out=CodeFreq;
run;
proc print data= SOCFreq noobs;
where percent ge 15; /*<= Filter on percent*/
run;
proc print data= CodeFreq noobs;
where percent ge 15; /*<= Filter on percent*/
run;
Note that the OUT= option in PROC FREQ creates an output data set only for the last table request on each TABLES statement, so there are two TABLES statements, one to create each output data set.
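If the unfiltered summary is not needed afterwards, the filter can probably be folded into the PROC FREQ step itself with a WHERE= data set option on OUT=. This is only a sketch: the demo data set and its values are made up to stand in for table2, and the 15 cutoff assumes PERCENT is on a 0-100 scale.

```sas
/* Hypothetical demo data standing in for table2 */
data demo;
  input SOC :$20. CODE :$8.;
  datalines;
Cardiac A01
Cardiac A02
Cardiac A01
Gastro B01
Gastro B02
Nervous C01
Skin D01
Skin D01
Skin D02
Eye E01
;
run;

proc freq data=demo noprint;
  /* WHERE= is applied as rows are written to each OUT= data set, */
  /* so only categories at or above 15 percent are kept           */
  tables SOC  / out=SOCFreq(where=(percent ge 15));
  tables CODE / out=CodeFreq(where=(percent ge 15));
run;

proc print data=SOCFreq noobs;
run;

proc print data=CodeFreq noobs;
run;
```

The percentages are still computed from all nonmissing observations, because the WHERE= option only filters the rows of the summary, not the input data.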
Either way you're going to go through the same process - create a table with results and then filter and print it out. I think it's a bit simpler in proc freq since you already have the calculations completed.
I have some code here that processes a PROC FREQ table and generates formatted output; you can add a WHERE clause to the last table to filter your results.
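For comparison, since PROC SQL came up in the question, the same create-then-filter idea might look roughly like this. It is a sketch, not a drop-in replacement: it uses sashelp.cars and its Type variable as stand-ins for table2 and SOC, the output name type_over15 is made up, and missing values are excluded to mimic the PROC FREQ default.

```sas
proc sql;
  /* One row per category, with count and percent of all nonmissing rows */
  create table type_over15 as
  select Type,
         count(*) as count,
         100 * count(*) / (select count(*)
                           from sashelp.cars
                           where Type is not missing) as percent
  from sashelp.cars
  where Type is not missing
  group by Type
  /* Keep only categories above the 15 percent cutoff */
  having calculated percent > 15;
quit;

proc print data=type_over15 noobs;
run;
```

PROC FREQ does the counting and percent calculation for you, which is why the PROC FREQ route is usually the shorter one here.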
Here's a simplified version of what should happen. It assumes that the percents should be based on the total count of all observations, not on the subset of those with a percent > 15%.
proc freq data=have;
tables soc / noprint out=soc_stats;
tables code / noprint out=code_stats;
run;
proc print data=soc_stats;
where percent > 15;
var soc count percent;
run;
/* Looks like I was 34 seconds slower than ballardw. */
proc print data=code_stats;
where percent > 15;
var code count percent;
run;
Double-check the cutoff for the WHERE statement. PROC FREQ stores PERCENT in the OUT= data set on a 0-100 scale, so 15 should be correct rather than 0.15.
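One quick way to settle the 15 versus 0.15 question is to print an OUT= data set and look at the values. The sketch below uses sashelp.class, which ships with SAS; the output name sex_stats is made up for illustration.

```sas
proc freq data=sashelp.class noprint;
  tables Sex / out=sex_stats;   /* one-way summary with COUNT and PERCENT */
run;

proc print data=sex_stats noobs;
  var Sex count percent;
run;
```

Sex splits the class roughly in half, so PERCENT values near 50 rather than near 0.5 indicate the 0-100 scale.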