BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RoddyJ
Obsidian | Level 7

Hi all,

 

I'm trying to use proc SQL to calculate a percentage of entries in a dataset.

 

I'm using the following code:

proc SQL;
create table percents as
select *, 
count(pta = "Yes")/count(*) as pct_pta,
count(pta = "Yes") as count_pta from d1; quit;

The count(pta = "Yes") is counting every entry in the dataset, the pta = "Yes" condition isn't working properly.

 

The dataset has a column pta where the value is either "Yes" or missing, I tired changing it to "Yes" or "No" and 0 or 1, but still no luck.

 

Can anyone help me out? 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Using your own logic, a quick fix could be this

 

data d1;
pta='Yes'; output;
pta='No'; output;
pta='Yes'; output;
pta='Yes'; output;
run;

proc SQL;
create table percents as
select *, 
sum(pta = "Yes")/count(*) as pct_pta,
sum(pta = "Yes") as count_pta
from d1;
quit;

View solution in original post

3 REPLIES 3
PeterClemmensen
Tourmaline | Level 20

Using your own logic, a quick fix could be this

 

data d1;
pta='Yes'; output;
pta='No'; output;
pta='Yes'; output;
pta='Yes'; output;
run;

proc SQL;
create table percents as
select *, 
sum(pta = "Yes")/count(*) as pct_pta,
sum(pta = "Yes") as count_pta
from d1;
quit;
ed_sas_member
Meteorite | Level 14

Hi @RoddyJ 

 

Depending on the output you want, you can also remove *, in you code. It will output only the percentage and frequency you want.

Otherwise, the method proposed by @PeterClemmensen  will output as many rows as input dataset with duplicate information regarding percentage and frequency.

gamotte
Rhodochrosite | Level 12

Hello,

 

count function gives you the number of observations for the variable in the resulting dataset, no matter

what particular values the variable takes.

 

data have;
input dummy;
cards;
0
1
1
0
0
0
1
0
;
quit;

proc sql;
SELECT count(dummy) FROM have;
quit;

Your condition (pta = "Yes") creates an implicit dummy variable as the one in the example above.

When pta <> "Yes" the observation will be zero but will be counted anyway.

 

=> You have to use the sum function as @PeterClemmensen suggests.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1680 views
  • 1 like
  • 4 in conversation