Outputting the first and last observation for each ID

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 18
Accepted Solution

Outputting the first and last observation for each ID

I have a dataset that has variables ID, Date, and Value. For each ID that has more than one Value, I want to output the earliest observation into a new column 'First', and the latest observation into a new column 'Last'. For IDs that only have one Value, I want the observation to be ignored. The final aim is to do a scatter plot of 'First' vs 'Last'.

Any help would be appreciated. Smiley Happy Preferably using the UI in SAS EG, but if that isn't possible then a script will do as well.


Accepted Solutions
Solution
‎06-03-2014 11:56 AM
Trusted Advisor
Posts: 1,204

Re: Outputting the first and last observation for each ID

proc sql;
create table one as
select * from have
order by id, date, value;
quit;

data want (keep=id earliest last) ;

set one;
by id;
if first.id and last.id then delete;
retain earliest;
if first.id then earliest=value;
if last.id then last=value;
if last.id;
run;

View solution in original post


All Replies
Super User
Posts: 6,955

Re: Outputting the first and last observation for each ID

First, you need to sort the dataset by ID and Date.

Then you do

data want (keep=ID first_date last_date first_value last_value);

set have;

by ID;

retain first_date last_date first_value last_value counter;

if first.ID

then do;

  counter = 0;

  first_date = Date;

  first_value = Value;

end;

counter + 1;

if last.ID and counter > 1

then do;

  last_date = Date;

  last_value = Value;

  output;

end;

run;

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super Contributor
Posts: 275

Re: Outputting the first and last observation for each ID

proc sql;

  create table first as

  select * from have group by id having date=min(date) and count(*)>1;

  create table last as

  select * from have group by id having date=max(date) and count(*)>1;

quit;

Solution
‎06-03-2014 11:56 AM
Trusted Advisor
Posts: 1,204

Re: Outputting the first and last observation for each ID

proc sql;
create table one as
select * from have
order by id, date, value;
quit;

data want (keep=id earliest last) ;

set one;
by id;
if first.id and last.id then delete;
retain earliest;
if first.id then earliest=value;
if last.id then last=value;
if last.id;
run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 5723 views
  • 6 likes
  • 4 in conversation