Solved: Conditionally count the number of distinct individuals who used a part...

ModeratelyWise · Posted 07-23-2021 07:09 PM

Hello all,

So I have a social service dataset that is looking at clients uses of various services and it looks something like this:

Client_ID Service_Code Service_Text Flg_plcm

001 1 Foster Care 1

001 3 Transportation 0

002 1 Foster Care 1

003 3 Transportation 1

003 2 Wellness 0

003 1 Foster Care 1

004 1 Foster Care 1

004 3 Transportation 1

004 3 Transportation 0

004 3 Transportation 1

004 3 Transportation 0

I want to count the number of distinct clients that make use of each type of service. Also, I want to distinguish those that use it during a placement (Thus Flg_plcm) so that the output looks like this:

Service_Code Service_Text cnt_child cnt_child_plcm

1 Foster Care 4 4

2 Wellness 1 0

3 Transportation 3 2

Does that make sense? I appreciate any help I can get. I am pretty new to SAS.

Best,

Marc

Patrick · Posted 07-26-2021 06:56 PM

@ModeratelyWise Is below doing what you're after?

data have;
  infile datalines dsd truncover;
  input (Client_ID Service_Code Service_Text Flg_plcm) (:$20.);
  datalines;
001,1,Foster Care,1
001,3,Transportation,0
002,1,Foster Care,1
003,3,Transportation,1
003,2,Wellness,0
003,1,Foster Care,1
004,1,Foster Care,1
004,3,Transportation,1
004,3,Transportation,0
004,3,Transportation,1
004,3,Transportation,0
;

proc sort data=have out=inter nodupkey;
  by Service_Code Client_ID descending Flg_plcm;
run;

data want;
  set inter;
  by Service_Code Client_ID;

  if first.client_id then
    do;
      cnt_child+1;
      cnt_child_plcm+(Flg_plcm='1');
    end;

  if last.service_code then
    do;
      output;
      call missing(cnt_child,cnt_child_plcm);
    end;
run;

proc print data=want;
  var Service_Code Service_Text cnt_child cnt_child_plcm;
run;

Or here a SQL alternative

proc sql;
  select
    Service_Code,
    Service_Text,
    count(Client_ID) as cnt_child,
    sum(max_plcm) as cnt_child_plcm
  from
    (
      select
        Service_Code,
        max(Service_Text) as Service_Text,
        Client_ID,
        max(case when Flg_plcm='1' then 1 else 0 end) as max_plcm    
      from have
      group by 
        Service_Code,
        Client_ID
    )
  group by
    Service_Code,
    Service_Text
  ;
quit;

View solution in original post

ballardw · Posted 07-23-2021 07:17 PM

Do any of your client_id have multiple observations with the same Service code? With the same Service code and different Flg_plcm codes?

The questions relate to how distinctness might be defined for those types of cases. If there aren't any such then not much difficulty but there may be some trickyness involved if you have a client with multiple records with the same Service code and different Plcm. If you have such you might want to show a couple examples and rules for reducing to "unique".

ModeratelyWise · Posted 07-23-2021 07:22 PM

Thanks for the quick response ballardw,

Yes clients can have multiple observations with the same service code. Even multiple while in placement and out of placement too.

I've edited the post to reflect the answer to your question.

mkeintz · Posted 07-23-2021 10:55 PM

Your data appear to be grouped by ID and Service_Code/Service_Text. If so, then you can create an intermediate data set with on observation for each ID/Service_Code/Service_Text, with one new variable: maxflg. maxflg will be a zero if no corresponding observations have flg_plcm=1, or it will be a one if any flg_plcm=1.

The you can run a proc summary asking for the number of intermediate observations for each service_code/service_text and asking for the sum of maxflg for each service_code/service_text:

Untested, in the absence of a working data step with sample data:

data vneed / view=vneed;
  set have;
  by client_id service_code service_text notsorted;
  retain max_flg;
  if first.service_text then max_flg=flg_plcm;
  else max_flg=max(max_flg,flg_plcm);
  if last.service_text;
run;

proc summary data=vneed nway ;
  class service_code service_text;
  var max_flg;
  output out=want (drop=_:) n=cnt_child sum=cnt_child_plcm;
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Patrick · Posted 07-23-2021 10:57 PM

Here one way to get the desired result using the sample data you've posted.

data have;
infile datalines dsd truncover;
input (Client_ID Service_Code Service_Text Flg_plcm) (:$20.);
datalines;
001,1,Foster Care,1
001,3,Transportation,0
002,1,Foster Care,1
003,3,Transportation,1
003,2,Wellness,0
003,1,Foster Care,1
004,1,Foster Care,1
004,3,Transportation,1
004,3,Transportation,0
004,3,Transportation,1
004,3,Transportation,0
;

proc sql;
  select distinct
    Service_Code,
    Service_Text,
    count(distinct Client_ID) as cnt_child,
    sum(
        case 
          when Flg_plcm='1' then 1
          else 0    
          end
      )
      as cnt_child_placement

  from 
    ( select distinct * from have)
  group by
    Service_Code,
    Service_Text
  ;
quit;

ModeratelyWise · Posted 07-26-2021 04:49 PM

Thanks Patrick!

So this almost works (See below). There are only 16 individuals in this study so the max should only be 16 for either column. The cnt_child column should be the total number of individual kids that receive the service while cnt_child_placement is a subset of that total who are also in placement. So cnt_child_placement should be at most equal to cnt_child. It seems like it is counting the number of times kids in placement used the service rather that the number distinct kids that use the service while in placement. Does that make sense?

Patrick · Posted 07-26-2021 06:56 PM

@ModeratelyWise Is below doing what you're after?

data have;
  infile datalines dsd truncover;
  input (Client_ID Service_Code Service_Text Flg_plcm) (:$20.);
  datalines;
001,1,Foster Care,1
001,3,Transportation,0
002,1,Foster Care,1
003,3,Transportation,1
003,2,Wellness,0
003,1,Foster Care,1
004,1,Foster Care,1
004,3,Transportation,1
004,3,Transportation,0
004,3,Transportation,1
004,3,Transportation,0
;

proc sort data=have out=inter nodupkey;
  by Service_Code Client_ID descending Flg_plcm;
run;

data want;
  set inter;
  by Service_Code Client_ID;

  if first.client_id then
    do;
      cnt_child+1;
      cnt_child_plcm+(Flg_plcm='1');
    end;

  if last.service_code then
    do;
      output;
      call missing(cnt_child,cnt_child_plcm);
    end;
run;

proc print data=want;
  var Service_Code Service_Text cnt_child cnt_child_plcm;
run;

Or here a SQL alternative

proc sql;
  select
    Service_Code,
    Service_Text,
    count(Client_ID) as cnt_child,
    sum(max_plcm) as cnt_child_plcm
  from
    (
      select
        Service_Code,
        max(Service_Text) as Service_Text,
        Client_ID,
        max(case when Flg_plcm='1' then 1 else 0 end) as max_plcm    
      from have
      group by 
        Service_Code,
        Client_ID
    )
  group by
    Service_Code,
    Service_Text
  ;
quit;

ModeratelyWise · Posted 07-26-2021 08:30 PM

Thanks Patrick!

This works like a charm.

Best,

Marc

Conditionally count the number of distinct individuals who used a particular service

Re: Conditionally count the number of distinct individuals who used a particular service

Re: Conditionally count the number of distinct individuals who used a particular service

Re: Conditionally count the number of distinct individuals who used a particular service

Re: Conditionally count the number of distinct individuals who used a particular service

Re: Conditionally count the number of distinct individuals who used a particular service

Re: Conditionally count the number of distinct individuals who used a particular service

Re: Conditionally count the number of distinct individuals who used a particular service

Re: Conditionally count the number of distinct individuals who used a particular service

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!