BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aljones1816
Fluorite | Level 6

I have a list of zip codes associated with IDs and years, and the zip codes for a given ID can be different for each year:

 

ID        zip        year

1         2           2018 

1         3           2017

1         2           2018

1         3           2017

 

I need to make it so that the zip code for a given ID is set to the zip code appearing in the most recent year:

 

ID        zip        year

1         2           2018 

1         2           2017

1         2           2018

1         2           2017

 

What I have done is try sort by ID and year, but I am lost as to how to move forward with the rest of the logic. I suspect I need to use the first. function, but beyond that I am having no luck.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

And a Datastep solution

 

data have;
input ID        zip        year	;
cards;
1         2           2018 
1         3           2017
1         2           2018
1         3           2017
;

data want;
 do _n_=1 by 1 until(last.id);
  set have;
  by id;
  _m=_m max year;
  if _m=year then _z=zip;
 end;
 do _n_=1 to _n_;
  set have;
  zip=_z;
  output;
 end;
 drop _:;
run;

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

Hi @aljones1816   Welcome to SAS communities, here is a proc sql welcome for you

data have;
input ID        zip        year	;
cards;
1         2           2018 
1         3           2017
1         2           2018
1         3           2017
;

proc sql;
create table want as
select ID, max(t) as zip,year
from (select ID, zip,year, (year=max(year))*zip as t from have group by id)
group by id;
quit;

novinosrin
Tourmaline | Level 20

And a Datastep solution

 

data have;
input ID        zip        year	;
cards;
1         2           2018 
1         3           2017
1         2           2018
1         3           2017
;

data want;
 do _n_=1 by 1 until(last.id);
  set have;
  by id;
  _m=_m max year;
  if _m=year then _z=zip;
 end;
 do _n_=1 to _n_;
  set have;
  zip=_z;
  output;
 end;
 drop _:;
run;
aljones1816
Fluorite | Level 6

Thanks so much, @novinosrin ! I tried all of your solutions and each worked beautifully. I chose this as the answer only because it was the option I ultimately stuck with in my code. 

novinosrin
Tourmaline | Level 20

Perhaps your thought process was after this?

 

proc sort data=have out=_have(rename=(zip=z) keep=id zip);
by id descending year;
run;

data want;
 merge have(drop=zip) _have;
 by id;
 retain zip;
 if first.id then zip=z;
 drop z;
run;
mkeintz
PROC Star

I see you have multiple observations for the highest year within an ID.  In your sample ID 1 have 2 records for 2018.  They both have zip=2.  Can you assume that the zip code will be constant across all highest year records within an ID?   If not, then what rule would you use to choose a zip value?

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 700 views
  • 0 likes
  • 3 in conversation