BookmarkSubscribeRSS Feed
pmpradhan
Quartz | Level 8

My data is as follow:

office idstaff id
U790452363
U797452354
U797452363
U790452356
U790452363
U790452336
U797452346
U797452383
U100452363

 

I want to perform count proc freq and get table as:

 

office idcount of staffpercentage of staff id for each office id
U100111.11%
U790444.44%
U7974

44.44%

 

 

I did proc freq data=have; table staff_id; by office_id; run; I got a weird result.

 

thanks

7 REPLIES 7
Reeza
Super User

Your code looks correct. What is your 'weird result'?

 


@pmpradhan wrote:

My data is as follow:

office id staff id
U790 452363
U797 452354
U797 452363
U790 452356
U790 452363
U790 452336
U797 452346
U797 452383
U100 452363

 

I want to perform count proc freq and get table as:

 

office id count of staff percentage of staff id for each office id
U100 1 11.11%
U790 4 44.44%
U797 4

44.44%

 

 

I did proc freq data=have; table staff_id; by office_id; run; I got a weird result.

 

thanks


 

pmpradhan
Quartz | Level 8

the result that I got was something like:

 

U100 452363 1

U790 452363 2

U790 452356 1 

U790 452336 1

and so forth

 

If proc sql allows me to create the percentage column that is fine too.

 

novinosrin
Tourmaline | Level 20

Hi @pmpradhan I'm afraid for your expected result you do not need by group

 

data have;
input (officeid	staffid) ($);
cards;
U790	452363
U797	452354
U797	452363
U790	452356
U790	452363
U790	452336
U797	452346
U797	452383
U100	452363
;
proc freq data=have;
tables officeid/out=want;
run;
pmpradhan
Quartz | Level 8

Thanks for the code. How does this handle the duplicates counts of staffid?

novinosrin
Tourmaline | Level 20

Can you please clarify the results you want for your sample? 

pmpradhan
Quartz | Level 8
something along the lines of
proc sql;
create table x as
select distinct officeid,
count distinct staffid,
(calculated officeid/calculated staffid)*100 from have;
quit;

the code might not be accurate but something along those lines should work.
Reeza
Super User

It depends partly on how you define your percentages but I'm a fan of the double proc freq for that type of work. 

SQL with distinct works as well.

 

I illustrate both options here:

https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas

 

*Double PROC FREQ;
proc freq data=sashelp.cars noprint;
table origin * make / out=origin_make;
run;

proc freq data=origin_make noprint;
table origin / out= distinct_freq;
run;

title 'PROC FREQ';
proc print data=distinct_freq;
run;
title 'PROC SQL';
proc print data=distinct_sql;
run;

 

 

If you want the percentages in the output data set make sure to specify OUTPCT or OUTPERCENT (can't recall which is correct) to capture the percents.  

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1973 views
  • 2 likes
  • 3 in conversation