DATA Step, Macro, Functions and more

How to get last data available in a variable by group

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

How to get last data available in a variable by group

[ Edited ]

Hello!

I'll try to explain what I mean in the thread title as clear as possible. 

 

I have a database in SAS 9.4 with +1500 variables and around 200,000 observations. I want to group those observations by ID (around 60,000 unique IDs), ordered by a date and get the last data available for each variable and ID. For example, if my data is:

 

ID VAR1 VAR2 DATE (already ordered)

1  2           3          Day1

1           4          Day2

1  .            2          Day3

2  2           3          Day1

3           .           Day2

 

And I want to get:

ID VAR1 VAR2 DATE

1  1           2          Day3

2  3           3          Day2

 

I've been using the sentence BY and last.ID/last.Date since most of the data is available in the last observation, but I don't know if it is possible to get into such detail as (I made up the following code in order to try to explain what I need):

 

from VAR1 to VAR1500
i = 0
while (last-i).ID (VAR1) = . do
i+1
last.ID (VAR1) = (last-i).ID (VAR1)

I believe it is impossible especially bearing in mind that there will be 60.000 groups of IDs and 1.500 variables making the loop work 90 million times, but I'd appreciate any kind of help!

 

Thank you!

 


Accepted Solutions
Highlighted
Solution
‎10-23-2017 02:30 AM
Super User
Posts: 5,516

Re: How to get last data available in a variable by group

Posted in reply to DaniLagetsson

SAS actually has tools to make this incredibly simple.  Assuming you have already completed this step:

 

proc sort data=have;

by id date;

run;

 

All you need to do from that point is:

 

data want;

update have (obs=0) have;

by id;

run;

 

You don't even need to know all the variable names.

View solution in original post


All Replies
Super User
Super User
Posts: 7,988

Re: How to get last data available in a variable by group

Posted in reply to DaniLagetsson

By group processing, its well documented in the manual.  For your issue, sort, then by group:

proc sort data=have;
  by id date;
run;
data want;
  set have;
  by id;
  if last.id;
run;

You could also cheat somewhat by using the way proc sort works:

proc sort data=have out=want nodupkey;
  by id descending date;
run;

Sort keeps the first record in any sequence when nodupkey is on, so descending means last is first record.

New Contributor
Posts: 3

Re: How to get last data available in a variable by group

Hi RW9, thank you for your answer.

Nevertheless that will only get the whole last observation, not minding whether a value is missing or not. 

Super User
Posts: 7,854

Re: How to get last data available in a variable by group

Posted in reply to DaniLagetsson

The basic code for what you want to achieve with one variable is this:

data want;
set have (rename=(var1=_var1));
by id date;
retain var1;
if _var1 ne . then var1 = _var1;
if last.id then output;
run;

With macro processing, we can expand this code to deal with an arbitrary number of similarly named variables:

%macro make_want(numvars);
data want;
set have (rename=(
%do i = 1 %to &numvars;
  var&i.=_var&i.
%end;
));
by id date;
retain
%do i = 1 %to &numvars;
  var&i.
%end;
;
%do i = 1 %to &numvars;
if _var&i. ne . then var&i. = _var&i.;
%end;
if last.id then output;
run;
%mend;
%make_want(1500)
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,988

Re: How to get last data available in a variable by group

Posted in reply to DaniLagetsson

Ah, sorry, I didn't see the value carry forward.  Maybe:

data have;
  input id var1 var2 date $;
datalines;
1 2 3 Day1
1 1 4 Day2
1 . 2 Day3
2 2 3 Day1
2 3 . Day2
;
run;
data want;
  set have;
  by id;
  retain lst_var1 lst_var2;
  if first.id then call missing(lst_var1,lst_var2);
  if var1 ne . then lst_var1=var1;
  if var2 ne . then lst_var2=var2;
  if var1=. then var1=lst_var1;
  if var2=. then var2=lst_var2;
  if last.id;
run;

Could simplfy it a bit.

Frequent Contributor
Posts: 149

Re: How to get last data available in a variable by group

Posted in reply to DaniLagetsson

Assuming that all var* are  numbers:

 

data want;
   set have;
   by id;

   length _var1-_var2 i 8;
   retain _var:;
   drop _var: i;

   array v var1-var2;
   array r _var1-_var2;

   do i = 1 to dim(v);
      if not missing(v[i]) then r[i] = v[i];
   end;

   if last.id then do;
      do i = 1 to dim(v);
         v[i] = r[i];
      end;
      output;
   end;
run;
Highlighted
Solution
‎10-23-2017 02:30 AM
Super User
Posts: 5,516

Re: How to get last data available in a variable by group

Posted in reply to DaniLagetsson

SAS actually has tools to make this incredibly simple.  Assuming you have already completed this step:

 

proc sort data=have;

by id date;

run;

 

All you need to do from that point is:

 

data want;

update have (obs=0) have;

by id;

run;

 

You don't even need to know all the variable names.

Frequent Contributor
Posts: 149

Re: How to get last data available in a variable by group

Posted in reply to Astounding

@Astounding: NOOOOO! This can't be  that simple ... ok, have to find something else to increase job security Smiley Wink

New Contributor
Posts: 3

Re: How to get last data available in a variable by group

Posted in reply to DaniLagetsson

Thank you all! Alas, each of the 1500 variables have a different name so most of the solutions would take a lot of space to develop, although they'd work. The one from Astounding makes it so simple. 

New Contributor
Posts: 4

Re: How to get last data available in a variable by group

Posted in reply to DaniLagetsson

This can do the job.

 


proc sort data=konsort out=konsort;by gruppe;

 

data last; set konsort;by gruppe;
if last.gruppe then output;
run;

 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 335 views
  • 8 likes
  • 6 in conversation