- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello!
I have a dataset (simplified example) with only one variable.
Data have;
input class$;
A
A
A
B
B
C
C
C
C
D
I would like to calculate the percentages and counts of each variable in my dataset using PROC SQL.
Output would be similiar to proc freq:
Desired output should be ;
class |
Frequency |
Percent |
A |
3 |
30 |
B |
2 |
20 |
C |
4 |
40 |
D |
1 |
10 |
How can I do this with PROC SQL?
Thanks!
- Tags:
- Percentages
- sql
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @Chris_Loke I copy pasted from online for you 🙂
Data have;
input class$;
cards;
A
A
A
B
B
C
C
C
C
D
;
proc sql;
create table want as
select class,count(class) as frequency,calculated frequency/(select count(*) from have) as pct format=percent.
from have
group by class;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @Chris_Loke I copy pasted from online for you 🙂
Data have;
input class$;
cards;
A
A
A
B
B
C
C
C
C
D
;
proc sql;
create table want as
select class,count(class) as frequency,calculated frequency/(select count(*) from have) as pct format=percent.
from have
group by class;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks!!!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
How about like that:
proc sql;
select a.class, count(a.class) as countclass, (calculated countclass/b.all)*100 as percent
from
have as a, (select count(class) as all from have) as b
group by a.class
;
quit;
All the best
Bart
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug
"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings
SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Why not use PROC FREQ which does the calculations by default, and doesn't have the drawback that missing values will cause the WRONG percentages to be calculated by PROC SQL (unless missing is a valid category). PROC FREQ gives you a choice how to handle missings, PROC SQL does not. PROC FREQ easily generalizes to two-way tables while PROC SQL does not. There are so many benefits to using PROC FREQ, I would not recommend PROC SQL for this purpose.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Agree with @PaigeMiller
Here is a quick example. I added missing values in the original data set and included the out= option to create new tables if you need them.
Data have;
infile datalines missover;
input class$;
datalines;
A
A
A
B
B
C
C
C
C
D
;
run;
/*Ignore missing in frequency. Will show how many are missing in the output table but won't count them in the percentage*/
title "Ignore Missing in Frequency";
proc freq data=have;
tables class /out=newtbl1(where=(class ne " ")) /*create a new table and remove the missing row from it*/
nocum /*remove cumulative freq*/;
run;
/*Include missing in frequency and will include missing values in the percentage*/
title "Include Missing in Frequency";
proc freq data=have;
tables class /out=newtbl2 /*create a new table*/
missing /*remove cumulative freq*/
nocum /*include missing values in frequency*/;
run;
title;
HTML Results
- Peter
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for replying!
The thing is that I have ten different datasets for ten different years that I want to perform my counts on. I want join all the years togheter in to one table and the export it to excel.
If this is possible to do with Proc Freq I would be happy to learn it!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Both FREQ and SQL take only one input so it doesn't' matter which you use. Unless you create a view first and then run the SQL or FREQ on the view. Either way, FREQ would still be more efficient in terms of processing time, but you already have a solution so now it's most efficient to use that.
@Chris_Loke wrote:
Thank you for replying!
The thing is that I have ten different datasets for ten different years that I want to perform my counts on. I want join all the years togheter in to one table and the export it to excel.
If this is possible to do with Proc Freq I would be happy to learn it!