I have the following dataset below:
| ID | DATE | AGE | YEAR | max_age | 
| 1 | 5/31/2002 | 21 | 2002 | 21 | 
| 1 | 10/31/2003 | 24 | 2003 | 24 | 
| 1 | 5/31/2003 | 24 | 2003 | 24 | 
| 1 | 12/31/2003 | 24 | 2003 | 24 | 
| 1 | 8/31/2003 | 24 | 2003 | 24 | 
| 1 | 3/31/2003 | 24 | 2003 | 24 | 
| 1 | 1/31/2003 | 23 | 2003 | 24 | 
| 1 | 7/31/2003 | 24 | 2003 | 24 | 
| 2 | 7/31/2004 | 33 | 2004 | 33 | 
| 2 | 5/31/2004 | 33 | 2004 | 33 | 
| 2 | 1/31/2004 | 33 | 2004 | 33 | 
The max_age variable is the max_age by id per year and was created using the
code below:
proc sql;
create table max_age as 
select*, max(age)as max_age
from age2
group by ID, year; 
quit;
I need assistance with the next step I would like my output to have one observation per id per year. My
output should like below. Any assistance would be greatly appreciated.
| ID | DATE | AGE | YEAR | MAX_AGE | 
| 1 | 5/31/2002 | 21 | 2002 | 21 | 
| 1 | 12/31/2003 | 24 | 2003 | 24 | 
| 2 | 7/31/2004 | 33 | 2004 | 33 | 
Hi @luvscandy27 I would take a step back and do the following without having to create a MAX_AGE variable. Sure, you need a subquery-
data have;
input ID	DATE :mmddyy10.	AGE	YEAR;*	max_age;
format date mmddyy10.;
cards;
1	5/31/2002	21	2002	21
1	10/31/2003	24	2003	24
1	5/31/2003	24	2003	24
1	12/31/2003	24	2003	24
1	8/31/2003	24	2003	24
1	3/31/2003	24	2003	24
1	1/31/2003	23	2003	24
1	7/31/2003	24	2003	24
2	7/31/2004	33	2004	33
2	5/31/2004	33	2004	33
2	1/31/2004	33	2004	33
;
proc sql;
create table want as
select *
from (select * from have group by id,year having age=max(age))
group by id, year
having max(date)=date;
quit;
proc print noobs;run;
How did you choose the one observation per ID and year?
What is the criteria for selecting the date (and possibly age) that goes into the output?
The obs observation should be the last observation for the id of the year. So for ID 2 the following obs are present:
| 2 | 7/31/2004 | 33 | 2004 | 33 | 
| 2 | 5/31/2004 | 33 | 2004 | 33 | 
| 2 | 1/31/2004 | 33 | 2004 | 33 | 
7/31/2004 date is the one I want because its the last observation for that id for that year.
@luvscandy27 wrote:
The obs observation should be the last observation for the id of the year. So for ID 2 the following obs are present:
2 7/31/2004 33 2004 33 2 5/31/2004 33 2004 33 2 1/31/2004 33 2004 33 
7/31/2004 date is the one I want because its the last observation for that id for that year.
So you need to add a HAVING clause:
create table max_age as
  select
    *,
    max(age)as max_age
  from age2
  group by ID, year
  having date = max(date)
;Hi @luvscandy27 I would take a step back and do the following without having to create a MAX_AGE variable. Sure, you need a subquery-
data have;
input ID	DATE :mmddyy10.	AGE	YEAR;*	max_age;
format date mmddyy10.;
cards;
1	5/31/2002	21	2002	21
1	10/31/2003	24	2003	24
1	5/31/2003	24	2003	24
1	12/31/2003	24	2003	24
1	8/31/2003	24	2003	24
1	3/31/2003	24	2003	24
1	1/31/2003	23	2003	24
1	7/31/2003	24	2003	24
2	7/31/2004	33	2004	33
2	5/31/2004	33	2004	33
2	1/31/2004	33	2004	33
;
proc sql;
create table want as
select *
from (select * from have group by id,year having age=max(age))
group by id, year
having max(date)=date;
quit;
proc print noobs;run;
Data step is very straightforward here.
proc sort data=have;
by id year date;
run;
data want_data;
set have;
by ID year date;
if last.year;
run;
@luvscandy27 wrote:
I have the following dataset below:
ID DATE AGE YEAR max_age 1 5/31/2002 21 2002 21 1 10/31/2003 24 2003 24 1 5/31/2003 24 2003 24 1 12/31/2003 24 2003 24 1 8/31/2003 24 2003 24 1 3/31/2003 24 2003 24 1 1/31/2003 23 2003 24 1 7/31/2003 24 2003 24 2 7/31/2004 33 2004 33 2 5/31/2004 33 2004 33 2 1/31/2004 33 2004 33 
The max_age variable is the max_age by id per year and was created using the
code below:
proc sql;
create table max_age as
select*, max(age)as max_age
from age2
group by ID, year;
quit;
I need assistance with the next step I would like my output to have one observation per id per year. My
output should like below. Any assistance would be greatly appreciated.
ID DATE AGE YEAR MAX_AGE 1 5/31/2002 21 2002 21 1 12/31/2003 24 2003 24 2 7/31/2004 33 2004 33 
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
