BookmarkSubscribeRSS Feed
moreilly
Calcite | Level 5

I have a data set with multiple observations per subject. I am trying to assign each subject to one of three categories depending on their health care utilization patterns.
1. office visits only
2. office visits and other services (physical therapy, imaging, etc.)
3. other services only

I would like to create a new variable that summarizes information across observations for each subject.  Thus far I’ve created two binary variables-office visit (0/1) and other service (0/1) for each observation.  I’d like to create a new categorical variable that uses the counts of these binaries, but within the subject group.  I think that sql is the right way to go, but I haven’t been able to make it work yet and wonder if there is an easier way.

 The data currently has the variables bene_id, office_visit and other_service.  I would like to add utilization_cat.

bene_id  office_visit other_service    utilization_cat
1        1            0            1    
1        1            0            1
1        1            0            1
1        1            0            1
1        1            0            1
1        1            0            1        
1        1            0            1        
2        1            0            2
2        0            1            2    
2        1            0            2
2        1            0            2
2        1            0            2
2        0            1            2
3        0            1            3
3        0            1            3
3        0            1            3
3        0            1            3

I’ve tried the following code (as well as RETAIN in a data step), without success.

proc sql;
    create view sum_view as
    select sum(other_service) as sum_other_service,
           sum(office_visit) as sum_office_visit,
    from claim_data;
    group by bene_id;
quit;

Any ideas?

THANK YOU!

2 REPLIES 2
Reeza
Super User

SQL is a good approach, but if you want the full dataset include an * in your SELECT statement and then it will automatically reemerge the data - but leaves a note on the log to that effect. 

 

You can add a CASE statement to categorize based on the calculated sums. 

 

Case when calculated sum_other_visit=0 and calculated sum_office_visit > 0 then 'Office Only'

        When <condition> then 'Other only'

        When .... then 'Both'

      Else 'checkme' 

end as Utilization_Category

 

 

 

Logie
Fluorite | Level 6

try keeping the bene_id identifier

 

proc sql;
    create view sum_view as
    select

bene_id

,sum(other_service) as sum_other_service,
           sum(office_visit) as sum_office_visit,
    from claim_data;
    group by bene_id;
quit;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 1089 views
  • 0 likes
  • 3 in conversation