BookmarkSubscribeRSS Feed
deleted_user
Not applicable
I have:

ID VALUE
12 .
12 .
12 4
17 .
17 1
37 .
37 .
37 .
37 2

I want:

ID VALUE
12 4
12 4
12 4
17 1
17 1
37 2
37 2
37 2
37 2

The value in the last record should be in all records for each ID. This must be easy.

Thanks for any help.

Susan
13 REPLIES 13
deleted_user
Not applicable
Hi,

please try this..

proc sql;
create table ab(drop=value1) as
select id ,value as value1,max(value) as value
from ab
group by id ;
quit;

Rgds,
skm
Doc_Duke
Rhodochrosite | Level 12
Susan,

skm's elegant solution assumes that the ONLY non-missing data for VALUE is the last one for each ID (which is what your example showed, but not quite what the question stated). If there are multiple non-missing data for VALUE within ID, then the problem is considerably messier and likely requires two DATA steps and a SORT.

Doc Muhlbaier
Duke
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Assign the _N_ variable to your_variable OBSNUM in a DATA step to get the OBS NUMBER, then you can sort the data with a BY list and include DESCENDING OBSNUM and use BY with FIRST. and/or LAST. to propagate values onto other associated observations -- you will need to use a RETAIN variable or work within a DO loop with UNTIL(EOF); to avoid needing the RETAIN variable.

Scott Barry
SBBWorks, Inc.
DanielSantos
Barite | Level 11
Being IN the input dataset (and assuming it was previously sorted by ID), you could produce OUT the output dataset with the following code:

data OUT;
set IN;
by ID;
drop _C;
if first.ID then _C=0; /* reset counter */
_C+1; /* count */
if last.ID;
do _N_=1 to _C;
output; /* multiple output with last value of the group */
end;
run;

Cheers from Portugal.

Daniel Santos @ www.cgd.t
deleted_user
Not applicable
Thank you Daniel,

Your code seems to work, but I would like to have a "then" placed like this:

if last.ID then
do _N_=1 to _C;
output; /* multiple output with last value of the group */
end;

A single text row "if last.ID;" means to me that only the last record for each ID is kept.
Is it because "if last.ID;" is directly followed by a "do", that your code works?

Susan
deleted_user
Not applicable
Daniel,

My table have more variables than ID and VALUE. I want to keep the different values for the those variables in each post, and your solution doesn't manage that.

Thanks anyway.

I hope you answer my previous question about the missing(?) "then".

Susan
DanielSantos
Barite | Level 11
Hello Susan.

the use of, if last.ID;

is know as a subsetting if.

only rows satisfying the if condition will pass beside the if (an reach the run; and so implicit output).

since after that, there is a do loop, with explicit output, this loop will be executed only at the last ID row.

Be careful that the syntax you have posted will not work, you should change it to:

if last.ID then do;
do _N_=1 to _C;
output;
end;

and this will do exactly the same as my code, because the do loop will only be executed at the last ID row, also.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Suggest the _N_ variable name not be used since it is used by SAS for counting the observation number (and is considered a reserved name variable).

Also, the SAS coding construct is totally valid and permissable for the purpose listed (as is the SAS coding technique from Daniel Santos), the objective would be to output iterative observations on the last occurrence of the BY variable identified in IF LAST.ID while maintaining the last VALUE , such as:

if last.a then do i=1 to _c;
drop i;
output;
end;


Scott Barry
SBBWorks, Inc.
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Sorry - incorrect sample code posted:

if last.ID then do i=1 to _c;
drop i _c;
/* generate same obs count as input but with last VALUE occurrence */
output;
end;



Scott Barry
SBBWorks, Inc.
DanielSantos
Barite | Level 11
I'm sorry Susan.

The syntax you posted is indeed correct, and will work as expected.

Scott, the use of the variable _N_, is a common optimization technique which I believe firstly appeared by the hand of the great Paul Dorfman.

_N_ is not used by SAS for counting rows, the row number is just passed to this automatic variable at the begin of each datastep loop. It's just meant to be used inside each loop, of course, if needed, which is not the case. So, if your not interested about its value, there's nothing wrong about changing it's value, and you get a perfectly good temporary variable, which no extra memory and that does not need to be dropped at the end of the datastep. Same goes for other automatic variable, like _ERROR_ and _IORC_.

But this would not be the subject here, so returning to Susan's question, Scott's example is right, and will do what is expected.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt
deleted_user
Not applicable
I did like this:

data temporary(keep=id value);
set old;
by id;
if last.id;
run;

data new;
merge old temporary;
by id;
run;

As I said above there are other variables in my table that I want to keep in every record.

Susan Message was edited by: SusanZusan
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
You will want to share a "complete" INPUT and OUTPUT (desired) data view for a an accurate reply, presuming you still have unsatisfactory results generated by the MERGE process you have shared? Also, you may want to investigate adding some self-diagnosis SAS commands to your second DATA step in order to clearly understand the results / benefits of your MERGE process -

PUTLOG _ALL_;

If you still have unusable output, re-post a reply with some SAS log results using the diagnosis command and all SAS-generated code for review/comment.


Scott Barry
SBBWorks, Inc.
HenrikOlsen
Calcite | Level 5
Or like this:

proc sql;
create table test2 as
select a.Id,coalesce(b.value,0) as value, a.var2, a.var3 etc....
from test a left join test b
on a.id=b.id
and b.value ne .;
quit;

coalesce < if all values is missing then 0 as default !

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2196 views
  • 0 likes
  • 5 in conversation