- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:
proc sql;
create table test as
select ID, max(date_var) from dataset
group by 1;
quit;
Any help would be great. Thanks.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Proc sort and a data step will work easily as well.
Proc sort data=have;
By Id variable;
Run;
Data want;
Set have;
By Id;
If last.id;
Run;
However, if there's the possibility that there may be duplicates of the value SQL is easier. You can use the HAVING clause to keep only the rows of interest.
Proc sql;
Create table want as
Select *, max(variable) as max_value
From have
Group by Id
Having max(variable)=variable;
Quit;
@corkee wrote:
Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:
proc sql; create table test as select ID, max(date_var) from dataset group by 1; quit;
Any help would be great. Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Proc sort and a data step will work easily as well.
Proc sort data=have;
By Id variable;
Run;
Data want;
Set have;
By Id;
If last.id;
Run;
However, if there's the possibility that there may be duplicates of the value SQL is easier. You can use the HAVING clause to keep only the rows of interest.
Proc sql;
Create table want as
Select *, max(variable) as max_value
From have
Group by Id
Having max(variable)=variable;
Quit;
@corkee wrote:
Hi, I am using SAS to try to calculate the maximum value of a variable for each unique person. However, I want to retain the other columns of the dataset for the row that contains that maximum value. I've tried to use PROC SQL to accomplish this, but I have no idea how to make sure that the other columns also transfer over. For example, my dataset has 250,000 rows, but only 32,000 unique individuals. I want to find the maximum date for each unique person, and for the row that contains the maximum, I also want to retain all other columns. This is my code so far:
proc sql; create table test as select ID, max(date_var) from dataset group by 1; quit;
Any help would be great. Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content