Hello all,
So I have a social service dataset that is looking at clients uses of various services and it looks something like this:
Client_ID Service_Code Service_Text Flg_plcm
001 1 Foster Care 1
001 3 Transportation 0
002 1 Foster Care 1
003 3 Transportation 1
003 2 Wellness 0
003 1 Foster Care 1
004 1 Foster Care 1
004 3 Transportation 1
004 3 Transportation 0
004 3 Transportation 1
004 3 Transportation 0
I want to count the number of distinct clients that make use of each type of service. Also, I want to distinguish those that use it during a placement (Thus Flg_plcm) so that the output looks like this:
Service_Code Service_Text cnt_child cnt_child_plcm
1 Foster Care 4 4
2 Wellness 1 0
3 Transportation 3 2
Does that make sense? I appreciate any help I can get. I am pretty new to SAS.
Best,
Marc
@ModeratelyWise Is below doing what you're after?
data have;
infile datalines dsd truncover;
input (Client_ID Service_Code Service_Text Flg_plcm) (:$20.);
datalines;
001,1,Foster Care,1
001,3,Transportation,0
002,1,Foster Care,1
003,3,Transportation,1
003,2,Wellness,0
003,1,Foster Care,1
004,1,Foster Care,1
004,3,Transportation,1
004,3,Transportation,0
004,3,Transportation,1
004,3,Transportation,0
;
proc sort data=have out=inter nodupkey;
by Service_Code Client_ID descending Flg_plcm;
run;
data want;
set inter;
by Service_Code Client_ID;
if first.client_id then
do;
cnt_child+1;
cnt_child_plcm+(Flg_plcm='1');
end;
if last.service_code then
do;
output;
call missing(cnt_child,cnt_child_plcm);
end;
run;
proc print data=want;
var Service_Code Service_Text cnt_child cnt_child_plcm;
run;
Or here a SQL alternative
proc sql;
select
Service_Code,
Service_Text,
count(Client_ID) as cnt_child,
sum(max_plcm) as cnt_child_plcm
from
(
select
Service_Code,
max(Service_Text) as Service_Text,
Client_ID,
max(case when Flg_plcm='1' then 1 else 0 end) as max_plcm
from have
group by
Service_Code,
Client_ID
)
group by
Service_Code,
Service_Text
;
quit;
Do any of your client_id have multiple observations with the same Service code? With the same Service code and different Flg_plcm codes?
The questions relate to how distinctness might be defined for those types of cases. If there aren't any such then not much difficulty but there may be some trickyness involved if you have a client with multiple records with the same Service code and different Plcm. If you have such you might want to show a couple examples and rules for reducing to "unique".
Thanks for the quick response ballardw,
Yes clients can have multiple observations with the same service code. Even multiple while in placement and out of placement too.
I've edited the post to reflect the answer to your question.
Your data appear to be grouped by ID and Service_Code/Service_Text. If so, then you can create an intermediate data set with on observation for each ID/Service_Code/Service_Text, with one new variable: maxflg. maxflg will be a zero if no corresponding observations have flg_plcm=1, or it will be a one if any flg_plcm=1.
The you can run a proc summary asking for the number of intermediate observations for each service_code/service_text and asking for the sum of maxflg for each service_code/service_text:
Untested, in the absence of a working data step with sample data:
data vneed / view=vneed;
set have;
by client_id service_code service_text notsorted;
retain max_flg;
if first.service_text then max_flg=flg_plcm;
else max_flg=max(max_flg,flg_plcm);
if last.service_text;
run;
proc summary data=vneed nway ;
class service_code service_text;
var max_flg;
output out=want (drop=_:) n=cnt_child sum=cnt_child_plcm;
run;
Here one way to get the desired result using the sample data you've posted.
data have;
infile datalines dsd truncover;
input (Client_ID Service_Code Service_Text Flg_plcm) (:$20.);
datalines;
001,1,Foster Care,1
001,3,Transportation,0
002,1,Foster Care,1
003,3,Transportation,1
003,2,Wellness,0
003,1,Foster Care,1
004,1,Foster Care,1
004,3,Transportation,1
004,3,Transportation,0
004,3,Transportation,1
004,3,Transportation,0
;
proc sql;
select distinct
Service_Code,
Service_Text,
count(distinct Client_ID) as cnt_child,
sum(
case
when Flg_plcm='1' then 1
else 0
end
)
as cnt_child_placement
from
( select distinct * from have)
group by
Service_Code,
Service_Text
;
quit;
Thanks Patrick!
So this almost works (See below). There are only 16 individuals in this study so the max should only be 16 for either column. The cnt_child column should be the total number of individual kids that receive the service while cnt_child_placement is a subset of that total who are also in placement. So cnt_child_placement should be at most equal to cnt_child. It seems like it is counting the number of times kids in placement used the service rather that the number distinct kids that use the service while in placement. Does that make sense?
@ModeratelyWise Is below doing what you're after?
data have;
infile datalines dsd truncover;
input (Client_ID Service_Code Service_Text Flg_plcm) (:$20.);
datalines;
001,1,Foster Care,1
001,3,Transportation,0
002,1,Foster Care,1
003,3,Transportation,1
003,2,Wellness,0
003,1,Foster Care,1
004,1,Foster Care,1
004,3,Transportation,1
004,3,Transportation,0
004,3,Transportation,1
004,3,Transportation,0
;
proc sort data=have out=inter nodupkey;
by Service_Code Client_ID descending Flg_plcm;
run;
data want;
set inter;
by Service_Code Client_ID;
if first.client_id then
do;
cnt_child+1;
cnt_child_plcm+(Flg_plcm='1');
end;
if last.service_code then
do;
output;
call missing(cnt_child,cnt_child_plcm);
end;
run;
proc print data=want;
var Service_Code Service_Text cnt_child cnt_child_plcm;
run;
Or here a SQL alternative
proc sql;
select
Service_Code,
Service_Text,
count(Client_ID) as cnt_child,
sum(max_plcm) as cnt_child_plcm
from
(
select
Service_Code,
max(Service_Text) as Service_Text,
Client_ID,
max(case when Flg_plcm='1' then 1 else 0 end) as max_plcm
from have
group by
Service_Code,
Client_ID
)
group by
Service_Code,
Service_Text
;
quit;
Thanks Patrick!
This works like a charm.
Best,
Marc
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.