I have a dataset that has variables ID, Date, and Value. For each ID that has more than one Value, I want to output the earliest observation into a new column 'First', and the latest observation into a new column 'Last'. For IDs that only have one Value, I want the observation to be ignored. The final aim is to do a scatter plot of 'First' vs 'Last'.
Any help would be appreciated. Preferably using the UI in SAS EG, but if that isn't possible then a script will do as well.
proc sql;
create table one as
select * from have
order by id, date, value;
quit;
data want (keep=id earliest last) ;
set one;
by id;
if first.id and last.id then delete;
retain earliest;
if first.id then earliest=value;
if last.id then last=value;
if last.id;
run;
First, you need to sort the dataset by ID and Date.
Then you do
data want (keep=ID first_date last_date first_value last_value);
set have;
by ID;
retain first_date last_date first_value last_value counter;
if first.ID
then do;
counter = 0;
first_date = Date;
first_value = Value;
end;
counter + 1;
if last.ID and counter > 1
then do;
last_date = Date;
last_value = Value;
output;
end;
run;
proc sql;
create table first as
select * from have group by id having date=min(date) and count(*)>1;
create table last as
select * from have group by id having date=max(date) and count(*)>1;
quit;
proc sql;
create table one as
select * from have
order by id, date, value;
quit;
data want (keep=id earliest last) ;
set one;
by id;
if first.id and last.id then delete;
retain earliest;
if first.id then earliest=value;
if last.id then last=value;
if last.id;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.