DATA Step, Macro, Functions and more

counting across observations and variables by group

Reply
Occasional Contributor
Posts: 7

counting across observations and variables by group

I have a data set with multiple observations per subject. I am trying to assign each subject to one of three categories depending on their health care utilization patterns.
1. office visits only
2. office visits and other services (physical therapy, imaging, etc.)
3. other services only

I would like to create a new variable that summarizes information across observations for each subject.  Thus far I’ve created two binary variables-office visit (0/1) and other service (0/1) for each observation.  I’d like to create a new categorical variable that uses the counts of these binaries, but within the subject group.  I think that sql is the right way to go, but I haven’t been able to make it work yet and wonder if there is an easier way.

 The data currently has the variables bene_id, office_visit and other_service.  I would like to add utilization_cat.

bene_id  office_visit other_service    utilization_cat
1        1            0            1    
1        1            0            1
1        1            0            1
1        1            0            1
1        1            0            1
1        1            0            1        
1        1            0            1        
2        1            0            2
2        0            1            2    
2        1            0            2
2        1            0            2
2        1            0            2
2        0            1            2
3        0            1            3
3        0            1            3
3        0            1            3
3        0            1            3

I’ve tried the following code (as well as RETAIN in a data step), without success.

proc sql;
    create view sum_view as
    select sum(other_service) as sum_other_service,
           sum(office_visit) as sum_office_visit,
    from claim_data;
    group by bene_id;
quit;

Any ideas?

THANK YOU!

Super User
Posts: 17,831

Re: counting across observations and variables by group

SQL is a good approach, but if you want the full dataset include an * in your SELECT statement and then it will automatically reemerge the data - but leaves a note on the log to that effect. 

 

You can add a CASE statement to categorize based on the calculated sums. 

 

Case when calculated sum_other_visit=0 and calculated sum_office_visit > 0 then 'Office Only'

        When <condition> then 'Other only'

        When .... then 'Both'

      Else 'checkme' 

end as Utilization_Category

 

 

 

Occasional Contributor
Posts: 10

Re: counting across observations and variables by group

try keeping the bene_id identifier

 

proc sql;
    create view sum_view as
    select

bene_id

,sum(other_service) as sum_other_service,
           sum(office_visit) as sum_office_visit,
    from claim_data;
    group by bene_id;
quit;

Ask a Question
Discussion stats
  • 2 replies
  • 124 views
  • 0 likes
  • 3 in conversation