# Finding the maximum value of each unique ID while keeping all other columns

Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:

proc sql;
create table test as
select ID, max(date_var) from dataset
group by 1;
quit;

Any help would be great. Thanks.

## Re: Finding the maximum value of each unique ID while keeping all other columns

Proc sort and a data step will work easily as well.

Proc sort data=have;
By Id variable;
Run;

Data want;
Set have;
By Id;
If last.id;
Run;

However, if the maximum value might occur more than once within an ID and you want to keep all of those rows, SQL is easier. You can use the HAVING clause to keep only the rows of interest.

Proc sql;
Create table want as
Select *, max(variable) as max_value
From have
Group by Id
Having max(variable)=variable;
Quit;

## Re: Finding the maximum value of each unique ID while keeping all other columns

@corkee

The Proc Sort / Data Step option @Reeza posted will likely perform better than the SQL.

If an ID can have more than one record with the maximum date_var, the SQL will return all of those rows, whereas the Proc Sort / Data Step will return only one row per ID. Your pick!
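
To make the difference concrete, here is a minimal sketch using made-up sample data. The dataset names (have, sorted, want_one, want_ties) and the note column are illustrative assumptions, not taken from the original post. ID 1 deliberately has two rows tied on the latest date.

```sas
/* Hypothetical sample data: ID 1 has two rows tied on the latest date */
data have;
    input id date_var :date9. note $;
    format date_var date9.;
    datalines;
1 01JAN2020 a
1 15MAR2020 b
1 15MAR2020 c
2 10FEB2020 d
2 05FEB2020 e
;
run;

/* Sort + data step: keeps exactly one row per ID (the last after sorting) */
proc sort data=have out=sorted;
    by id date_var;
run;

data want_one;
    set sorted;
    by id;
    if last.id;
run;

/* SQL with HAVING: keeps every row equal to the group maximum, ties included */
proc sql;
    create table want_ties as
    select *
    from have
    group by id
    having date_var = max(date_var);
quit;
```

With this sample, want_one should contain 2 rows (one per ID) and want_ties should contain 3 rows (both tied rows for ID 1 plus one row for ID 2). PROC SQL writes a note to the log about remerging summary statistics with the original data; that is expected for this kind of query.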
