BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
luvscandy27
Quartz | Level 8

I have the following dataset below: 

ID DATE AGE YEAR max_age
1 5/31/2002 21 2002 21
1 10/31/2003 24 2003 24
1 5/31/2003 24 2003 24
1 12/31/2003 24 2003 24
1 8/31/2003 24 2003 24
1 3/31/2003 24 2003 24
1 1/31/2003 23 2003 24
1 7/31/2003 24 2003 24
2 7/31/2004 33 2004 33
2 5/31/2004 33 2004 33
2 1/31/2004 33 2004 33

 

The max_age variable is the max_age by id per year and was created using the 

code below: 

proc sql;
create table max_age as
select*, max(age)as max_age
from age2
group by ID, year;
quit;

 

I need assistance with the next step I would like my output to have one observation per id per year. My

output should like below. Any assistance would be greatly appreciated. 

 

ID DATE AGE YEAR MAX_AGE
1 5/31/2002 21 2002 21
1 12/31/2003 24 2003 24
2 7/31/2004 33 2004 33
1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @luvscandy27  I would take a step back and do the following without having to create a MAX_AGE variable. Sure, you need a subquery-


data have;
input ID	DATE :mmddyy10.	AGE	YEAR;*	max_age;
format date mmddyy10.;
cards;
1	5/31/2002	21	2002	21
1	10/31/2003	24	2003	24
1	5/31/2003	24	2003	24
1	12/31/2003	24	2003	24
1	8/31/2003	24	2003	24
1	3/31/2003	24	2003	24
1	1/31/2003	23	2003	24
1	7/31/2003	24	2003	24
2	7/31/2004	33	2004	33
2	5/31/2004	33	2004	33
2	1/31/2004	33	2004	33
;

proc sql;
create table want as
select *
from (select * from have group by id,year having age=max(age))
group by id, year
having max(date)=date;
quit;

proc print noobs;run;

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

How did you choose the one observation per ID and year?

--
Paige Miller
luvscandy27
Quartz | Level 8

The obs observation should be the last observation for the id of the year. So for ID 2 the following obs are present:

 

2 7/31/2004 33 2004 33
2 5/31/2004 33 2004 33
2 1/31/2004 33 2004 33

 

7/31/2004 date is the one I want because its the last observation for that id for that year. 

Kurt_Bremser
Super User

@luvscandy27 wrote:

The obs observation should be the last observation for the id of the year. So for ID 2 the following obs are present:

 

2 7/31/2004 33 2004 33
2 5/31/2004 33 2004 33
2 1/31/2004 33 2004 33

 

7/31/2004 date is the one I want because its the last observation for that id for that year. 


So you need to add a HAVING clause:

create table max_age as
  select
    *,
    max(age)as max_age
  from age2
  group by ID, year
  having date = max(date)
;
novinosrin
Tourmaline | Level 20

Hi @luvscandy27  I would take a step back and do the following without having to create a MAX_AGE variable. Sure, you need a subquery-


data have;
input ID	DATE :mmddyy10.	AGE	YEAR;*	max_age;
format date mmddyy10.;
cards;
1	5/31/2002	21	2002	21
1	10/31/2003	24	2003	24
1	5/31/2003	24	2003	24
1	12/31/2003	24	2003	24
1	8/31/2003	24	2003	24
1	3/31/2003	24	2003	24
1	1/31/2003	23	2003	24
1	7/31/2003	24	2003	24
2	7/31/2004	33	2004	33
2	5/31/2004	33	2004	33
2	1/31/2004	33	2004	33
;

proc sql;
create table want as
select *
from (select * from have group by id,year having age=max(age))
group by id, year
having max(date)=date;
quit;

proc print noobs;run;
Reeza
Super User

Data step is very straightforward here. 

 

proc sort data=have;
by id year date;
run;

data want_data;
set have;
by ID year date;
if last.year;
run;

@luvscandy27 wrote:

I have the following dataset below: 

ID DATE AGE YEAR max_age
1 5/31/2002 21 2002 21
1 10/31/2003 24 2003 24
1 5/31/2003 24 2003 24
1 12/31/2003 24 2003 24
1 8/31/2003 24 2003 24
1 3/31/2003 24 2003 24
1 1/31/2003 23 2003 24
1 7/31/2003 24 2003 24
2 7/31/2004 33 2004 33
2 5/31/2004 33 2004 33
2 1/31/2004 33 2004 33

 

The max_age variable is the max_age by id per year and was created using the 

code below: 

proc sql;
create table max_age as
select*, max(age)as max_age
from age2
group by ID, year;
quit;

 

I need assistance with the next step I would like my output to have one observation per id per year. My

output should like below. Any assistance would be greatly appreciated. 

 

ID DATE AGE YEAR MAX_AGE
1 5/31/2002 21 2002 21
1 12/31/2003 24 2003 24
2 7/31/2004 33 2004 33

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1408 views
  • 3 likes
  • 5 in conversation