BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
luvscandy27
Quartz | Level 8

I have the following dataset below: 

ID DATE AGE YEAR max_age
1 5/31/2002 21 2002 21
1 10/31/2003 24 2003 24
1 5/31/2003 24 2003 24
1 12/31/2003 24 2003 24
1 8/31/2003 24 2003 24
1 3/31/2003 24 2003 24
1 1/31/2003 23 2003 24
1 7/31/2003 24 2003 24
2 7/31/2004 33 2004 33
2 5/31/2004 33 2004 33
2 1/31/2004 33 2004 33

 

The max_age variable is the max_age by id per year and was created using the 

code below: 

proc sql;
create table max_age as
select*, max(age)as max_age
from age2
group by ID, year;
quit;

 

I need assistance with the next step I would like my output to have one observation per id per year. My

output should like below. Any assistance would be greatly appreciated. 

 

ID DATE AGE YEAR MAX_AGE
1 5/31/2002 21 2002 21
1 12/31/2003 24 2003 24
2 7/31/2004 33 2004 33
1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @luvscandy27  I would take a step back and do the following without having to create a MAX_AGE variable. Sure, you need a subquery-


data have;
input ID	DATE :mmddyy10.	AGE	YEAR;*	max_age;
format date mmddyy10.;
cards;
1	5/31/2002	21	2002	21
1	10/31/2003	24	2003	24
1	5/31/2003	24	2003	24
1	12/31/2003	24	2003	24
1	8/31/2003	24	2003	24
1	3/31/2003	24	2003	24
1	1/31/2003	23	2003	24
1	7/31/2003	24	2003	24
2	7/31/2004	33	2004	33
2	5/31/2004	33	2004	33
2	1/31/2004	33	2004	33
;

proc sql;
create table want as
select *
from (select * from have group by id,year having age=max(age))
group by id, year
having max(date)=date;
quit;

proc print noobs;run;

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

How did you choose the one observation per ID and year?

--
Paige Miller
luvscandy27
Quartz | Level 8

The obs observation should be the last observation for the id of the year. So for ID 2 the following obs are present:

 

2 7/31/2004 33 2004 33
2 5/31/2004 33 2004 33
2 1/31/2004 33 2004 33

 

7/31/2004 date is the one I want because its the last observation for that id for that year. 

Kurt_Bremser
Super User

@luvscandy27 wrote:

The obs observation should be the last observation for the id of the year. So for ID 2 the following obs are present:

 

2 7/31/2004 33 2004 33
2 5/31/2004 33 2004 33
2 1/31/2004 33 2004 33

 

7/31/2004 date is the one I want because its the last observation for that id for that year. 


So you need to add a HAVING clause:

create table max_age as
  select
    *,
    max(age)as max_age
  from age2
  group by ID, year
  having date = max(date)
;
novinosrin
Tourmaline | Level 20

Hi @luvscandy27  I would take a step back and do the following without having to create a MAX_AGE variable. Sure, you need a subquery-


data have;
input ID	DATE :mmddyy10.	AGE	YEAR;*	max_age;
format date mmddyy10.;
cards;
1	5/31/2002	21	2002	21
1	10/31/2003	24	2003	24
1	5/31/2003	24	2003	24
1	12/31/2003	24	2003	24
1	8/31/2003	24	2003	24
1	3/31/2003	24	2003	24
1	1/31/2003	23	2003	24
1	7/31/2003	24	2003	24
2	7/31/2004	33	2004	33
2	5/31/2004	33	2004	33
2	1/31/2004	33	2004	33
;

proc sql;
create table want as
select *
from (select * from have group by id,year having age=max(age))
group by id, year
having max(date)=date;
quit;

proc print noobs;run;
Reeza
Super User

Data step is very straightforward here. 

 

proc sort data=have;
by id year date;
run;

data want_data;
set have;
by ID year date;
if last.year;
run;

@luvscandy27 wrote:

I have the following dataset below: 

ID DATE AGE YEAR max_age
1 5/31/2002 21 2002 21
1 10/31/2003 24 2003 24
1 5/31/2003 24 2003 24
1 12/31/2003 24 2003 24
1 8/31/2003 24 2003 24
1 3/31/2003 24 2003 24
1 1/31/2003 23 2003 24
1 7/31/2003 24 2003 24
2 7/31/2004 33 2004 33
2 5/31/2004 33 2004 33
2 1/31/2004 33 2004 33

 

The max_age variable is the max_age by id per year and was created using the 

code below: 

proc sql;
create table max_age as
select*, max(age)as max_age
from age2
group by ID, year;
quit;

 

I need assistance with the next step I would like my output to have one observation per id per year. My

output should like below. Any assistance would be greatly appreciated. 

 

ID DATE AGE YEAR MAX_AGE
1 5/31/2002 21 2002 21
1 12/31/2003 24 2003 24
2 7/31/2004 33 2004 33

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1465 views
  • 3 likes
  • 5 in conversation