BookmarkSubscribeRSS Feed
Jest
Obsidian | Level 7

HI,

 

I have a longitudinal dataset with a variable that describes disease severity (mild, moderate, severe). Since this is longitudinal subjects may have at some point been mild then became moderate or severe throughout the study. Some the opposite. I would like to pick the most severe for each subject regardless of when. I am pasting below how the data appears:

 

ID       Severity

1           mild

1           mild

1           moderate

1           moderate

2           moderate 

2           mild

2           severe

2           severe

3           mild

3           mild

3           mild

3           mild

 

How I want the data to look like:

 

ID         Severity

1           moderate 

2           severe

3           mild

 

What is the best way to achieve this?

 

Thanks!

6 REPLIES 6
novinosrin
Tourmaline | Level 20

Untested, let me know if this doesn't work:

 

data have;
input ID Severity : $10.;
datalines;
1 severe
1 mild
1 moderate
1 mild
2 moderate
2 mild
2 severe
2 mild
3 mild
3 mild
3 moderate
3 mild
;

data want;
set have;
by id;
retain t;
if first.id then t=severity;
else if t ne 'severe' and severity ne 'mild' then t=severity;
if last.id;
run;

Jest
Obsidian | Level 7

Hi novinosrin,

 

It didn't work. I am trying to play with it. One idea is to do a two step approach. What do you think if I first made the data look like this by making all ID values into picking the most severe. then running a if last.id. How would I make it change all the values per ID to the most severe...

 

ID     severity

1       moderate 

1       moderate

1       moderate

1       moderate

2       severe

2       severe

2       severe

2       severe

 

ballardw
Super User

@Jest wrote:

HI,

  

What is the best way to achieve this?

 

Thanks!


"Best" would depend on what you are measuring. Best by shortest execution time, shortest code, easiest to understand/ maintain code could all be used.

 

Since you have a variable that has an order, if this were my data I would have read the data so that I had and ordinal numeric variable with the "least severe" or what ever term would apply with the value of 1 to a larger integer for the "most severe". There would also be an associated display format to display the desired text when used.

Which would make my code for your request look like:

 

proc summary data=have nway;

   class id;

   var severity;

   output out=idsummary max= ;

run;

Which would have the maximum value of that numeric severity variable for each ID value.

 

There are a number of other things possible with numeric ordinals that aren't possible with most character variables.

Jest
Obsidian | Level 7

Thank you Ballardw. Proc summary worked after converting the variable to numeric. 

However, is there a way to retain the rest of the variables in the dataset. They don't very across time.

 

thanks,

 

ballardw
Super User

@Jest wrote:

Thank you Ballardw. Proc summary worked after converting the variable to numeric. 

However, is there a way to retain the rest of the variables in the dataset. They don't very across time.

 

thanks,

 


Show the code you used to add or replace the severity variable. If done correctly it shouldn't have affected any of the other variables.

 Or if you are talking about adding the single result in the summary back to the other data matching the id variable?

proc sql;
   create table severityadded as
   select a.*, b.severity as maxseverity
   from have as a left join
        idsummary as b
        on a.id=b.id;
run;
Jest
Obsidian | Level 7

I just ran this code after converting the variable to numeric.

 

proc summary data=have nway;
class id;
var severe;
output out=data.idsummary max= ;
run;

 

so the output dataset picks the most severe per ID over time.. It includes ID, _Type_, _FREQ_ and the variable severe. If possible I would like the output dataset to also include all the remainder variable in the original dataset (have) as well. I could technically go back and run an if last.ID on the original dataset and then merge it with the idsummary one which includes the severity variable....but is there a way to just do that in the proc summary code above....

 

Thanks

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 945 views
  • 0 likes
  • 3 in conversation