I am trying to generate a summary table, but I can't seem to think of how to execute it. What I need is to add a column with distinct count of ID by DeptID (or name) and by month of service. The 'hours' and 'amount' columns would need to be summed accordingly. I currently have a table but I would like it to contain distinct count of ID.
proc means data = services_fortab noprint nway sum;
var hours paid;
class vendorid vendorname svc_month Fiscal_Year SvcLocDesc ;
output out = services_fortab2(drop = _TYPE_ _FREQ_) sum=;
run;
data welcome;
infile datalines dlm=",";
input ID $ DeptID $ Name :$20. svcdate :yymmdd10. Fiscal_Year Hours Amount ;
format svcdate yymmddd10.;
datalines;
1234, 7, East, 2019-01-01, 2019, 1.50, 109.7
1234, 7, East, 2019-01-01, 2019, 7.50, 509.7
6543, 7, East, 2019-01-1, 2019, 2.75, 340.82
7847, 8, West, 2019-02-01, 2019, 5, 7000
5818, 10, North, 2019-03-01, 2019, 6, 1032
0844, 3, South, 2019-04-01, 2019, 40, 855
4499, 5, West, 2018-09-01, 2019, 10, 790
1234, 7, East, 2019-09-01, 2019, 6.50, 1900.7
3000, 2, North, 2018-12-01, 2019, 11.75, 874.79
3785, 27, West, 2018-11-01, 2019, 20.3, 450.64
;
run;
I'd like the output to look something like this:
DeptID | Name | svcdate | Fiscal Year | Hours | Amount | Count |
7 | East | 1/1/2019 | 2019 | 10 | 5324 | 50 |
7 | East | 3/1/2019 | 2019 | 6 | 756 | 3 |
8 | West | 2/1/2019 | 2019 | 67 | 2366 | 34 |
8 | West | 4/1/2019 | 2019 | 98 | 7899 | 467 |
10 | North | 2/1/2019 | 2019 | 33 | 956 | 64 |
10 | North | 3/1/2019 | 2019 | 100 | 5899 | 223 |
3 | South | 9/1/2018 | 2019 | 93 | 578 | 11 |
3 | South | 2/1/2019 | 2019 | 25 | 235 | 176 |
I know the solution is likely obvious, but I can't figure it out. Thank you!
You can try below code
data welcome;
infile datalines dlm=",";
input ID $ DeptID $ Name :$20. svcdate :yymmdd10. Fiscal_Year Hours Amount ;
format svcdate yymmddd10.;
datalines;
1234, 7, East, 2019-01-01, 2019, 1.50, 109.7
1234, 7, East, 2019-01-01, 2019, 7.50, 509.7
6543, 7, East, 2019-01-1, 2019, 2.75, 340.82
7847, 8, West, 2019-02-01, 2019, 5, 7000
5818, 10, North, 2019-03-01, 2019, 6, 1032
0844, 3, South, 2019-04-01, 2019, 40, 855
4499, 5, West, 2018-09-01, 2019, 10, 790
1234, 7, East, 2019-09-01, 2019, 6.50, 1900.7
3000, 2, North, 2018-12-01, 2019, 11.75, 874.79
3785, 27, West, 2018-11-01, 2019, 20.3, 450.64
;
run;
proc sql;
create table want as select DeptID, Name, svcdate, Fiscal_Year , sum(Hours) as hours, sum(amount) as amount, count(id) as count
from welcome group by DeptID, Name, svcdate, Fiscal_Year ;
quit;
Hello Jag,
Thank you for the quick response! The count I am getting is not a sum of distinct ID. I want to show how many individuals were seen at the department X during month XX-YYYY.
Thank you!
Sadly it isn't. There isn't a default function to do distinct counts in SAS so you need to calculate it externally and then merge it in with your other table.
There are two methods to do this that I recommend, PROC SQL or double PROC FREQ. Examples for both are below. To scale it for multiple variables add you extra variables to the GROUP BY or TABLE statement.
/*This demonstrates how to count the number of unique occurences of a variable
across groups. It uses the SASHELP.CARS dataset which is available with any SAS installation.
The objective is to determine the number of unique car makers by origin/
Note: The SQL solution can be off if you have a large data set and these are not the only two ways to calculate distinct counts.
If you're dealing with a large data set other methods may be appropriate.*/
*Count distinct IDs;
proc sql;
create table distinct_sql as
select origin, count(distinct make) as n_make
from sashelp.cars
group by origin;
quit;
*Double PROC FREQ;
proc freq data=sashelp.cars noprint;
table origin * make / out=origin_make;
run;
proc freq data=origin_make noprint;
table origin / out= distinct_freq;
run;
title 'PROC FREQ';
proc print data=distinct_freq;
run;
title 'PROC SQL';
proc print data=distinct_sql;
run;
@colabear wrote:
I am trying to generate a summary table, but I can't seem to think of how to execute it. What I need is to add a column with distinct count of ID by DeptID (or name) and by month of service. The 'hours' and 'amount' columns would need to be summed accordingly. I currently have a table but I would like it to contain distinct count of ID.
proc means data = services_fortab noprint nway sum;
var hours paid;
class vendorid vendorname svc_month Fiscal_Year SvcLocDesc ;
output out = services_fortab2(drop = _TYPE_ _FREQ_) sum=;
run;
data welcome;
infile datalines dlm=",";
input ID $ DeptID $ Name :$20. svcdate :yymmdd10. Fiscal_Year Hours Amount ;
format svcdate yymmddd10.;
datalines;
1234, 7, East, 2019-01-01, 2019, 1.50, 109.7
1234, 7, East, 2019-01-01, 2019, 7.50, 509.7
6543, 7, East, 2019-01-1, 2019, 2.75, 340.82
7847, 8, West, 2019-02-01, 2019, 5, 7000
5818, 10, North, 2019-03-01, 2019, 6, 1032
0844, 3, South, 2019-04-01, 2019, 40, 855
4499, 5, West, 2018-09-01, 2019, 10, 790
1234, 7, East, 2019-09-01, 2019, 6.50, 1900.7
3000, 2, North, 2018-12-01, 2019, 11.75, 874.79
3785, 27, West, 2018-11-01, 2019, 20.3, 450.64;
run;
I'd like the output to look something like this:
DeptID Name svcdate Fiscal Year Hours Amount Count 7 East 1/1/2019 2019 10 5324 50 7 East 3/1/2019 2019 6 756 3 8 West 2/1/2019 2019 67 2366 34 8 West 4/1/2019 2019 98 7899 467 10 North 2/1/2019 2019 33 956 64 10 North 3/1/2019 2019 100 5899 223 3 South 9/1/2018 2019 93 578 11 3 South 2/1/2019 2019 25 235 176 I know the solution is likely obvious, but I can't figure it out. Thank you!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.