BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
righcoastmike
Quartz | Level 8

Hi All,

 

I have a dataset that looks something like this:  

 

data have;
input studyID ED_type1 ED_type2 ED_type3;
cards;
1 . 1 1
2 . . 1 
3 1 . .
4 1 . 1
5 . 1 1
;

What I am trying to do is develop a summary variable for each record based on the ED_type variables. Essentially the ed_types are hierarchical (ED_type 1 being the top of the hierarchy, ED_type2 below it, and ED_type3 below that).  The summary variable needs to reflect whichever the "highest" ED type visit is for a given record. 

 

Clear as mud right? It might be easier with an example, in the end I want my new dataset to look like this:

 

data have2;
input studyID ED_type1 ED_type2 ED_type3 ED_Summary;
cards;
1 . 1 1 2
2 . . 1 3
3 1 . . 1
4 1 . 1 1
5 . 1 1 2
;

Since ID 1 has (1) in both ed visits in categories 2 and 3 it get's summarized as a 2 (because 2 falls above 3 on the hierarchy) 

Similarly, since ID 4 has visits in categories 1 and 3, then it get's a 1 because 1 is higher in the hierarchy than 3.

 

Thanks so much in advance folks, you are always a huge help and any thoughts would be much appreciated. 

 

Rightcoast

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

 

data have2; 
input studyID ED_type1 ED_type2 ED_type3; 
if ed_type1=1 then ED_Summary=1; 
else if ed_type2=1 then ED_Summary=2; 
else if ed_type3=1 then ED_Summary=3; 
cards;
 ... 
;

 

 

 

--
Paige Miller

View solution in original post

7 REPLIES 7
PaigeMiller
Diamond | Level 26

 

data have2; 
input studyID ED_type1 ED_type2 ED_type3; 
if ed_type1=1 then ED_Summary=1; 
else if ed_type2=1 then ED_Summary=2; 
else if ed_type3=1 then ED_Summary=3; 
cards;
 ... 
;

 

 

 

--
Paige Miller
righcoastmike
Quartz | Level 8

Thanks so much for your help everyone. 

 

This solution took a bit more coding than the "whichn" method, but ultimately it was the only one I could make work. I think a lot of it had to do with the fact that I'm not entirely sure how the data was created so I couldn't get the syntax to fit. That being said, I learned about a new tool to use today, and I look forward to playing with it a bit and figuring it out. 

 

In the meantime, thanks so much Paige! That code worked perfectly. 

 

As always, all your help is much appreciated. 

 

Thanks again

 

rightcoast

FreelanceReinh
Jade | Level 19

Hi @righcoastmike,

 

Why not use the WHICHN function?

data want;
set have;
ED_Summary=whichn(1,of ED_type:);
run;
ballardw
Super User

@FreelanceReinh wrote:

Hi @righcoastmike,

 

Why not use the WHICHN function?

data want;
set have;
ED_Summary=whichn(1,of ED_type:);
run;

Just need to be real sure that the sort order of the variable names is the actual hierarchy. Which this example is but you never know when going with actual data that the names might be actually needed in  Ed_typeAr Ed_typeBc Ed_typeA order.

In which case list the hierarchy from left to right in the WHICHN instead of using the "of list:" construct.

FreelanceReinh
Jade | Level 19

Thanks @ballardw, I was aware of the subtleties regarding variable order. But for less experienced readers of this thread it's good that you pointed this out.

 

The ED_type: shortcut abbreviates the list of variables in the PDV (program data vector) whose names start with "ED_type" -- in the order in which they are stored in the PDV, not in the alphabetical order of their names. In dataset HAVE (as created in the initial post) the three variables in question were created as a result of the INPUT statement, hence in the appropriate order for the WHICHN function to produce the desired results.

ballardw
Super User

@FreelanceReinh wrote:

Thanks @ballardw, I was aware of the subtleties regarding variable order. But for less experienced readers of this thread it's good that you pointed this out.

 

The ED_type: shortcut abbreviates the list of variables in the PDV (program data vector) whose names start with "ED_type" -- in the order in which they are stored in the PDV, not in the alphabetical order of their names. In dataset HAVE (as created in the initial post) the three variables in question were created as a result of the INPUT statement, hence in the appropriate order for the WHICHN function to produce the desired results.


Right. I so seldom let anything else name my variables and I would create them in a numeric suffixed list I forget about the creation order.

righcoastmike
Quartz | Level 8

HI All, 

 

Thanks so much for your thoughts on this. It's much appreciated. I'd never heard of the WHICHN function, but I think it will be really useful going forward.  Unfortunately, the dataset I am using was created by someone else, so I'm not sure how things are ordered in the PDV. They are all ordered sequentially though (ED_1, ED_2 etc.) I'll give both the options a try and report back. 

 

Stay tuned! 

 

Rightcoast. 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 778 views
  • 4 likes
  • 4 in conversation