BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ModeratelyWise
Obsidian | Level 7

Hello all,

So I have a social service dataset that is looking at clients uses of various services and it looks something like this:

 

Client_ID  Service_Code   Service_Text      Flg_plcm

001                  1                     Foster Care          1

001                  3                    Transportation      0

002                  1                     Foster Care           1

003                  3                     Transportation      1

003                  2                      Wellness              0

003                  1                      Foster Care          1

004                  1                      Foster Care          1

004                  3                     Transportation      1

004                  3                     Transportation      0

004                  3                     Transportation      1

004                  3                     Transportation      0

 

I want to count the number of distinct clients that make use of each type of service.  Also, I want to distinguish  those that use it during a placement (Thus Flg_plcm) so that the output looks like this:

 

Service_Code     Service_Text     cnt_child     cnt_child_plcm

             1              Foster Care             4                   4

             2              Wellness                 1                    0

             3              Transportation        3                   2

 

Does that make sense?  I appreciate any help I can get.  I am pretty new to SAS.

 

Best,

Marc

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@ModeratelyWise Is below doing what you're after?

data have;
  infile datalines dsd truncover;
  input (Client_ID Service_Code Service_Text Flg_plcm) (:$20.);
  datalines;
001,1,Foster Care,1
001,3,Transportation,0
002,1,Foster Care,1
003,3,Transportation,1
003,2,Wellness,0
003,1,Foster Care,1
004,1,Foster Care,1
004,3,Transportation,1
004,3,Transportation,0
004,3,Transportation,1
004,3,Transportation,0
;

proc sort data=have out=inter nodupkey;
  by Service_Code Client_ID descending Flg_plcm;
run;

data want;
  set inter;
  by Service_Code Client_ID;

  if first.client_id then
    do;
      cnt_child+1;
      cnt_child_plcm+(Flg_plcm='1');
    end;

  if last.service_code then
    do;
      output;
      call missing(cnt_child,cnt_child_plcm);
    end;
run;

proc print data=want;
  var Service_Code Service_Text cnt_child cnt_child_plcm;
run;

Patrick_0-1627340271264.png

 

Or here a SQL alternative

proc sql;
  select
    Service_Code,
    Service_Text,
    count(Client_ID) as cnt_child,
    sum(max_plcm) as cnt_child_plcm
  from
    (
      select
        Service_Code,
        max(Service_Text) as Service_Text,
        Client_ID,
        max(case when Flg_plcm='1' then 1 else 0 end) as max_plcm    
      from have
      group by 
        Service_Code,
        Client_ID
    )
  group by
    Service_Code,
    Service_Text
  ;
quit;

View solution in original post

7 REPLIES 7
ballardw
Super User

Do any of your client_id have multiple observations with the same Service code? With the same Service code and different Flg_plcm codes?

 

The questions relate to how distinctness might be defined for those types of cases. If there aren't any such then not much difficulty but there may be some trickyness involved if you have a client with multiple records with the same Service code and different Plcm. If you have such you might want to show a couple examples and rules for reducing to "unique".

ModeratelyWise
Obsidian | Level 7

Thanks for the quick response ballardw, 

 

Yes clients can have multiple observations with the same service code. Even multiple while in placement and out of placement too.

 

I've edited the post to reflect the answer to your question.

mkeintz
PROC Star

Your data appear to be grouped by ID and Service_Code/Service_Text.  If so, then you can create an intermediate data set with on observation for each ID/Service_Code/Service_Text, with one new variable: maxflgmaxflg will be a zero if no corresponding observations have flg_plcm=1, or it will be a one if any flg_plcm=1.

 

The you can run a proc summary asking for the number of intermediate observations for each service_code/service_text and asking for the sum of maxflg for each service_code/service_text:

 

Untested, in the absence of a working data step with sample data:

 

data vneed / view=vneed;
  set have;
  by client_id service_code service_text notsorted;
  retain max_flg;
  if first.service_text then max_flg=flg_plcm;
  else max_flg=max(max_flg,flg_plcm);
  if last.service_text;
run;

proc summary data=vneed nway ;
  class service_code service_text;
  var max_flg;
  output out=want (drop=_:) n=cnt_child sum=cnt_child_plcm;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Patrick
Opal | Level 21

Here one way to get the desired result using the sample data you've posted.

data have;
infile datalines dsd truncover;
input (Client_ID Service_Code Service_Text Flg_plcm) (:$20.);
datalines;
001,1,Foster Care,1
001,3,Transportation,0
002,1,Foster Care,1
003,3,Transportation,1
003,2,Wellness,0
003,1,Foster Care,1
004,1,Foster Care,1
004,3,Transportation,1
004,3,Transportation,0
004,3,Transportation,1
004,3,Transportation,0
;

proc sql;
  select distinct
    Service_Code,
    Service_Text,
    count(distinct Client_ID) as cnt_child,
    sum(
        case 
          when Flg_plcm='1' then 1
          else 0    
          end
      )
      as cnt_child_placement

  from 
    ( select distinct * from have)
  group by
    Service_Code,
    Service_Text
  ;
quit;

 Patrick_0-1627095479297.png

 

ModeratelyWise
Obsidian | Level 7

Thanks Patrick!

So this almost works (See below). There are only 16 individuals in this study so the max should only be 16 for either column. The cnt_child column should be the total number of individual kids that receive the service while cnt_child_placement is a subset of that total who are also in placement. So cnt_child_placement should be at most equal to cnt_child.  It seems like it is counting the number of times kids in placement used the service rather that the number distinct kids that use the service while in placement.  Does that make sense?

 

ModeratelyWise_0-1627331889572.png

 

Patrick
Opal | Level 21

@ModeratelyWise Is below doing what you're after?

data have;
  infile datalines dsd truncover;
  input (Client_ID Service_Code Service_Text Flg_plcm) (:$20.);
  datalines;
001,1,Foster Care,1
001,3,Transportation,0
002,1,Foster Care,1
003,3,Transportation,1
003,2,Wellness,0
003,1,Foster Care,1
004,1,Foster Care,1
004,3,Transportation,1
004,3,Transportation,0
004,3,Transportation,1
004,3,Transportation,0
;

proc sort data=have out=inter nodupkey;
  by Service_Code Client_ID descending Flg_plcm;
run;

data want;
  set inter;
  by Service_Code Client_ID;

  if first.client_id then
    do;
      cnt_child+1;
      cnt_child_plcm+(Flg_plcm='1');
    end;

  if last.service_code then
    do;
      output;
      call missing(cnt_child,cnt_child_plcm);
    end;
run;

proc print data=want;
  var Service_Code Service_Text cnt_child cnt_child_plcm;
run;

Patrick_0-1627340271264.png

 

Or here a SQL alternative

proc sql;
  select
    Service_Code,
    Service_Text,
    count(Client_ID) as cnt_child,
    sum(max_plcm) as cnt_child_plcm
  from
    (
      select
        Service_Code,
        max(Service_Text) as Service_Text,
        Client_ID,
        max(case when Flg_plcm='1' then 1 else 0 end) as max_plcm    
      from have
      group by 
        Service_Code,
        Client_ID
    )
  group by
    Service_Code,
    Service_Text
  ;
quit;
ModeratelyWise
Obsidian | Level 7

Thanks Patrick!

 

This works like a charm.

 

Best,

 

Marc

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1459 views
  • 0 likes
  • 4 in conversation