BookmarkSubscribeRSS Feed
colabear
Obsidian | Level 7

I am trying to generate a summary table, but I can't seem to think of how to execute it. What I need is to add a column with distinct count of ID by DeptID (or name) and by month of service. The 'hours' and 'amount' columns would need to be summed accordingly. I currently have a table but I would like it to contain distinct count of ID. 

 

proc means data = services_fortab noprint nway sum;
var hours paid;
class vendorid vendorname svc_month Fiscal_Year SvcLocDesc ;
output out = services_fortab2(drop = _TYPE_ _FREQ_) sum=;
run;

 

data welcome;
infile datalines dlm=",";
input ID $ DeptID $ Name :$20. svcdate :yymmdd10. Fiscal_Year Hours Amount ;
format svcdate yymmddd10.;
datalines;
1234, 7, East, 2019-01-01, 2019, 1.50, 109.7
1234, 7, East, 2019-01-01, 2019, 7.50, 509.7
6543, 7, East, 2019-01-1, 2019, 2.75, 340.82
7847, 8, West, 2019-02-01, 2019, 5, 7000
5818, 10, North, 2019-03-01, 2019, 6, 1032
0844, 3, South, 2019-04-01, 2019, 40, 855
4499, 5, West, 2018-09-01, 2019, 10, 790
1234, 7, East, 2019-09-01, 2019, 6.50, 1900.7
3000, 2, North, 2018-12-01, 2019, 11.75, 874.79
3785, 27, West, 2018-11-01, 2019, 20.3, 450.64

;
run;

 

I'd like the output to look something like this:

DeptIDNamesvcdateFiscal YearHoursAmountCount
7East1/1/2019201910532450
7East3/1/2019201967563
8West2/1/2019201967236634
8West4/1/20192019987899467
10North2/1/201920193395664
10North3/1/201920191005899223
3South9/1/201820199357811
3South2/1/2019201925235176

I know the solution is likely obvious, but I can't figure it out. Thank you! 

3 REPLIES 3
Jagadishkatam
Amethyst | Level 16

You can try below code

 

data welcome;
infile datalines dlm=",";
input ID $ DeptID $ Name :$20. svcdate :yymmdd10. Fiscal_Year Hours Amount ;
format svcdate yymmddd10.;
datalines;
1234, 7, East, 2019-01-01, 2019, 1.50, 109.7
1234, 7, East, 2019-01-01, 2019, 7.50, 509.7
6543, 7, East, 2019-01-1, 2019, 2.75, 340.82
7847, 8, West, 2019-02-01, 2019, 5, 7000
5818, 10, North, 2019-03-01, 2019, 6, 1032
0844, 3, South, 2019-04-01, 2019, 40, 855
4499, 5, West, 2018-09-01, 2019, 10, 790
1234, 7, East, 2019-09-01, 2019, 6.50, 1900.7
3000, 2, North, 2018-12-01, 2019, 11.75, 874.79
3785, 27, West, 2018-11-01, 2019, 20.3, 450.64

;
run;

proc sql;
create table want as select DeptID, Name, svcdate, Fiscal_Year , sum(Hours) as hours, sum(amount) as amount, count(id) as count 
from welcome group by DeptID, Name, svcdate, Fiscal_Year ;
quit;
Thanks,
Jag
colabear
Obsidian | Level 7

Hello Jag,

Thank you for the quick response! The count I am getting is not a sum of distinct ID. I want to show how many individuals were seen at the department X during month XX-YYYY. 

Thank you! 

Reeza
Super User

Sadly it isn't. There isn't a default function to do distinct counts in SAS so you need to calculate it externally and then merge it in with your other table. 

 

There are two methods to do this that I recommend, PROC SQL or double PROC FREQ. Examples for both are below. To scale it for multiple variables add you extra variables to the GROUP BY or TABLE statement. 

 

/*This demonstrates how to count the number of unique occurences of a variable
across groups. It uses the SASHELP.CARS dataset which is available with any SAS installation.
The objective is to determine the number of unique car makers by origin/

Note: The SQL solution can be off if you have a large data set and these are not the only two ways to calculate distinct counts.
If you're dealing with a large data set other methods may be appropriate.*/

*Count distinct IDs;
proc sql;
create table distinct_sql as
select origin, count(distinct make) as n_make
from sashelp.cars
group by origin;
quit;

*Double PROC FREQ;
proc freq data=sashelp.cars noprint;
table origin * make / out=origin_make;
run;

proc freq data=origin_make noprint;
table origin / out= distinct_freq;
run;

title 'PROC FREQ';
proc print data=distinct_freq;
run;
title 'PROC SQL';
proc print data=distinct_sql;
run;

@colabear wrote:

I am trying to generate a summary table, but I can't seem to think of how to execute it. What I need is to add a column with distinct count of ID by DeptID (or name) and by month of service. The 'hours' and 'amount' columns would need to be summed accordingly. I currently have a table but I would like it to contain distinct count of ID. 

 

proc means data = services_fortab noprint nway sum;
var hours paid;
class vendorid vendorname svc_month Fiscal_Year SvcLocDesc ;
output out = services_fortab2(drop = _TYPE_ _FREQ_) sum=;
run;

 

data welcome;
infile datalines dlm=",";
input ID $ DeptID $ Name :$20. svcdate :yymmdd10. Fiscal_Year Hours Amount ;
format svcdate yymmddd10.;
datalines;
1234, 7, East, 2019-01-01, 2019, 1.50, 109.7
1234, 7, East, 2019-01-01, 2019, 7.50, 509.7
6543, 7, East, 2019-01-1, 2019, 2.75, 340.82
7847, 8, West, 2019-02-01, 2019, 5, 7000
5818, 10, North, 2019-03-01, 2019, 6, 1032
0844, 3, South, 2019-04-01, 2019, 40, 855
4499, 5, West, 2018-09-01, 2019, 10, 790
1234, 7, East, 2019-09-01, 2019, 6.50, 1900.7
3000, 2, North, 2018-12-01, 2019, 11.75, 874.79
3785, 27, West, 2018-11-01, 2019, 20.3, 450.64

;
run;

 

I'd like the output to look something like this:

DeptID Name svcdate Fiscal Year Hours Amount Count
7 East 1/1/2019 2019 10 5324 50
7 East 3/1/2019 2019 6 756 3
8 West 2/1/2019 2019 67 2366 34
8 West 4/1/2019 2019 98 7899 467
10 North 2/1/2019 2019 33 956 64
10 North 3/1/2019 2019 100 5899 223
3 South 9/1/2018 2019 93 578 11
3 South 2/1/2019 2019 25 235 176

I know the solution is likely obvious, but I can't figure it out. Thank you! 


 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 17091 views
  • 0 likes
  • 3 in conversation