BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
corkee
Calcite | Level 5

Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:

 

proc sql;
    create table test as
    select ID, max(date_var) from dataset
    group by 1;
quit;

Any help would be great. Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Proc sort and a data step will work easily as well. 

 

Proc sort data=have;
By Id variable;
Run;

Data want;
Set have;
By Id;
If last.id;
Run;

However, if there's the possibility that there may be duplicates of the value SQL is easier. You can use the HAVING clause to keep only the rows of interest. 

 

Proc sql;
Create table want as
Select *, max(variable) as max_value
From have
Group by Id 
Having max(variable)=variable;
Quit;

 


@corkee wrote:

Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:

 

proc sql;
    create table test as
    select ID, max(date_var) from dataset
    group by 1;
quit;

Any help would be great. Thanks.



 

View solution in original post

2 REPLIES 2
Reeza
Super User

Proc sort and a data step will work easily as well. 

 

Proc sort data=have;
By Id variable;
Run;

Data want;
Set have;
By Id;
If last.id;
Run;

However, if there's the possibility that there may be duplicates of the value SQL is easier. You can use the HAVING clause to keep only the rows of interest. 

 

Proc sql;
Create table want as
Select *, max(variable) as max_value
From have
Group by Id 
Having max(variable)=variable;
Quit;

 


@corkee wrote:

Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:

 

proc sql;
    create table test as
    select ID, max(date_var) from dataset
    group by 1;
quit;

Any help would be great. Thanks.



 

Patrick
Opal | Level 21

@corkee

The Proc Sort / Data Step option @Reeza posted will likely perform better than the SQL.

 

Should it be possible that you have more than one record with a max date_var in an ID then the SQL will return all rows with a max value but the Proc Sort / Data Step will only return one row. Your pick!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 35578 views
  • 1 like
  • 3 in conversation