My data is as follow:
office id | staff id |
U790 | 452363 |
U797 | 452354 |
U797 | 452363 |
U790 | 452356 |
U790 | 452363 |
U790 | 452336 |
U797 | 452346 |
U797 | 452383 |
U100 | 452363 |
I want to perform count proc freq and get table as:
office id | count of staff | percentage of staff id for each office id |
U100 | 1 | 11.11% |
U790 | 4 | 44.44% |
U797 | 4 | 44.44% |
I did proc freq data=have; table staff_id; by office_id; run; I got a weird result.
thanks
Your code looks correct. What is your 'weird result'?
@pmpradhan wrote:
My data is as follow:
office id staff id U790 452363 U797 452354 U797 452363 U790 452356 U790 452363 U790 452336 U797 452346 U797 452383 U100 452363
I want to perform count proc freq and get table as:
office id count of staff percentage of staff id for each office id U100 1 11.11% U790 4 44.44% U797 4 44.44%
I did proc freq data=have; table staff_id; by office_id; run; I got a weird result.
thanks
the result that I got was something like:
U100 452363 1
U790 452363 2
U790 452356 1
U790 452336 1
and so forth
If proc sql allows me to create the percentage column that is fine too.
Hi @pmpradhan I'm afraid for your expected result you do not need by group
data have;
input (officeid staffid) ($);
cards;
U790 452363
U797 452354
U797 452363
U790 452356
U790 452363
U790 452336
U797 452346
U797 452383
U100 452363
;
proc freq data=have;
tables officeid/out=want;
run;
Thanks for the code. How does this handle the duplicates counts of staffid?
Can you please clarify the results you want for your sample?
It depends partly on how you define your percentages but I'm a fan of the double proc freq for that type of work.
SQL with distinct works as well.
I illustrate both options here:
https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas
*Double PROC FREQ; proc freq data=sashelp.cars noprint; table origin * make / out=origin_make; run; proc freq data=origin_make noprint; table origin / out= distinct_freq; run; title 'PROC FREQ'; proc print data=distinct_freq; run; title 'PROC SQL'; proc print data=distinct_sql; run;
If you want the percentages in the output data set make sure to specify OUTPCT or OUTPERCENT (can't recall which is correct) to capture the percents.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.