BookmarkSubscribeRSS Feed
Jest
Obsidian | Level 7

HI,

 

I have a longitudinal dataset with a variable that describes disease severity (mild, moderate, severe). Since this is longitudinal subjects may have at some point been mild then became moderate or severe throughout the study. Some the opposite. I would like to pick the most severe for each subject regardless of when. I am pasting below how the data appears:

 

ID       Severity

1           mild

1           mild

1           moderate

1           moderate

2           moderate 

2           mild

2           severe

2           severe

3           mild

3           mild

3           mild

3           mild

 

How I want the data to look like:

 

ID         Severity

1           moderate 

2           severe

3           mild

 

What is the best way to achieve this?

 

Thanks!

6 REPLIES 6
novinosrin
Tourmaline | Level 20

Untested, let me know if this doesn't work:

 

data have;
input ID Severity : $10.;
datalines;
1 severe
1 mild
1 moderate
1 mild
2 moderate
2 mild
2 severe
2 mild
3 mild
3 mild
3 moderate
3 mild
;

data want;
set have;
by id;
retain t;
if first.id then t=severity;
else if t ne 'severe' and severity ne 'mild' then t=severity;
if last.id;
run;

Jest
Obsidian | Level 7

Hi novinosrin,

 

It didn't work. I am trying to play with it. One idea is to do a two step approach. What do you think if I first made the data look like this by making all ID values into picking the most severe. then running a if last.id. How would I make it change all the values per ID to the most severe...

 

ID     severity

1       moderate 

1       moderate

1       moderate

1       moderate

2       severe

2       severe

2       severe

2       severe

 

ballardw
Super User

@Jest wrote:

HI,

  

What is the best way to achieve this?

 

Thanks!


"Best" would depend on what you are measuring. Best by shortest execution time, shortest code, easiest to understand/ maintain code could all be used.

 

Since you have a variable that has an order, if this were my data I would have read the data so that I had and ordinal numeric variable with the "least severe" or what ever term would apply with the value of 1 to a larger integer for the "most severe". There would also be an associated display format to display the desired text when used.

Which would make my code for your request look like:

 

proc summary data=have nway;

   class id;

   var severity;

   output out=idsummary max= ;

run;

Which would have the maximum value of that numeric severity variable for each ID value.

 

There are a number of other things possible with numeric ordinals that aren't possible with most character variables.

Jest
Obsidian | Level 7

Thank you Ballardw. Proc summary worked after converting the variable to numeric. 

However, is there a way to retain the rest of the variables in the dataset. They don't very across time.

 

thanks,

 

ballardw
Super User

@Jest wrote:

Thank you Ballardw. Proc summary worked after converting the variable to numeric. 

However, is there a way to retain the rest of the variables in the dataset. They don't very across time.

 

thanks,

 


Show the code you used to add or replace the severity variable. If done correctly it shouldn't have affected any of the other variables.

 Or if you are talking about adding the single result in the summary back to the other data matching the id variable?

proc sql;
   create table severityadded as
   select a.*, b.severity as maxseverity
   from have as a left join
        idsummary as b
        on a.id=b.id;
run;
Jest
Obsidian | Level 7

I just ran this code after converting the variable to numeric.

 

proc summary data=have nway;
class id;
var severe;
output out=data.idsummary max= ;
run;

 

so the output dataset picks the most severe per ID over time.. It includes ID, _Type_, _FREQ_ and the variable severe. If possible I would like the output dataset to also include all the remainder variable in the original dataset (have) as well. I could technically go back and run an if last.ID on the original dataset and then merge it with the idsummary one which includes the severity variable....but is there a way to just do that in the proc summary code above....

 

Thanks

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 586 views
  • 0 likes
  • 3 in conversation