Re: proc freq

pmpradhan · Posted 08-28-2018 01:33 PM

My data is as follow:

office id	staff id
U790	452363
U797	452354
U797	452363
U790	452356
U790	452363
U790	452336
U797	452346
U797	452383
U100	452363

I want to perform count proc freq and get table as:

office id	count of staff	percentage of staff id for each office id
U100	1	11.11%
U790	4	44.44%
U797	4	44.44%

I did proc freq data=have; table staff_id; by office_id; run; I got a weird result.

thanks

Reeza · Posted 08-28-2018 01:59 PM

Your code looks correct. What is your 'weird result'?

@pmpradhan wrote:

My data is as follow:

office id staff id

U790 452363

U797 452354

U797 452363

U790 452356

U790 452363

U790 452336

U797 452346

U797 452383

U100 452363

I want to perform count proc freq and get table as:

office id count of staff percentage of staff id for each office id

U100 1 11.11%

U790 4 44.44%

U797 4
44.44%

I did proc freq data=have; table staff_id; by office_id; run; I got a weird result.

thanks

pmpradhan · Posted 08-28-2018 05:56 PM

the result that I got was something like:

U100 452363 1

U790 452363 2

U790 452356 1

U790 452336 1

and so forth

If proc sql allows me to create the percentage column that is fine too.

novinosrin · Posted 08-28-2018 02:27 PM

Hi @pmpradhan I'm afraid for your expected result you do not need by group

data have;
input (officeid	staffid) ($);
cards;
U790	452363
U797	452354
U797	452363
U790	452356
U790	452363
U790	452336
U797	452346
U797	452383
U100	452363
;
proc freq data=have;
tables officeid/out=want;
run;

pmpradhan · Posted 08-28-2018 05:51 PM

Thanks for the code. How does this handle the duplicates counts of staffid?

novinosrin · Posted 08-28-2018 05:58 PM

Can you please clarify the results you want for your sample?

pmpradhan · Posted 08-28-2018 06:04 PM

something along the lines of
proc sql;
create table x as
select distinct officeid,
count distinct staffid,
(calculated officeid/calculated staffid)*100 from have;
quit;

the code might not be accurate but something along those lines should work.

Reeza · Posted 08-28-2018 06:31 PM

It depends partly on how you define your percentages but I'm a fan of the double proc freq for that type of work.

SQL with distinct works as well.

I illustrate both options here:

https://github.com/statgeek/SAS-Tutorials/blob/master/count_distinct_by_group.sas

*Double PROC FREQ;
proc freq data=sashelp.cars noprint;
table origin * make / out=origin_make;
run;

proc freq data=origin_make noprint;
table origin / out= distinct_freq;
run;

title 'PROC FREQ';
proc print data=distinct_freq;
run;
title 'PROC SQL';
proc print data=distinct_sql;
run;

If you want the percentages in the output data set make sure to specify OUTPCT or OUTPERCENT (can't recall which is correct) to capture the percents.

Registration is open

SAS Training: Just a Click Away