Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:
proc sql;
create table test as
select ID, max(date_var) from dataset
group by 1;
quit;
Any help would be great. Thanks.
Proc sort and a data step will work easily as well.
Proc sort data=have;
By Id variable;
Run;
Data want;
Set have;
By Id;
If last.id;
Run;
However, if there's the possibility that there may be duplicates of the value SQL is easier. You can use the HAVING clause to keep only the rows of interest.
Proc sql;
Create table want as
Select *, max(variable) as max_value
From have
Group by Id
Having max(variable)=variable;
Quit;
@corkee wrote:
Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:
proc sql; create table test as select ID, max(date_var) from dataset group by 1; quit;
Any help would be great. Thanks.
Proc sort and a data step will work easily as well.
Proc sort data=have;
By Id variable;
Run;
Data want;
Set have;
By Id;
If last.id;
Run;
However, if there's the possibility that there may be duplicates of the value SQL is easier. You can use the HAVING clause to keep only the rows of interest.
Proc sql;
Create table want as
Select *, max(variable) as max_value
From have
Group by Id
Having max(variable)=variable;
Quit;
@corkee wrote:
Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:
proc sql; create table test as select ID, max(date_var) from dataset group by 1; quit;
Any help would be great. Thanks.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.