Re: replacing a missing value with the last known value within a group...

Anita_n · Posted 03-27-2021 09:48 AM

Hi all,

I have a table that looks like this :

id	date	var1	var2	var3
3	06.01.81	2.5	50	40
3	15.11.88		20	50
3	18.08.89		18	60
3	11.12.90	10.78	190	70
10	06.10.90		350	80
10	16.10.98		184	90
10	17.10.95	1.65	79	100
10	18.10.85		367	110
10	19.10.66		500
10	20.10.98		497
10	21.10.77		600

I want sas to obtain the last known value if the next known is missing. This should take place within the same id so that my tables looks like this at the end

id	date	var1	var2	var3
3	06.01.81	2.5	50	40
3	15.11.88	2.5	20	50
3	18.08.89	2.5	18	60
3	11.12.90	10.78	190	70
10	06.10.90	.	350	80
10	16.10.98	.	184	90
10	17.10.95	1.65	79	100
10	18.10.85	1.65	367	110
10	19.10.66	1.65	500	110
10	20.10.98	1.65	497	110
10	21.10.77	1.65	600	110

any help?

Anita_n · Posted 03-27-2021 09:45 AM

Hi all,

I have a table that looks like this :

id	date	var1	var2	var3
3	06.01.81	2.5	50	40
3	15.11.88		20	50
3	18.08.89		18	60
3	11.12.90	10.78	190	70
10	06.10.90		350	80
10	16.10.98		184	90
10	17.10.95	1.65	79	100
10	18.10.85		367	110
10	19.10.66		500
10	20.10.98		497
10	21.10.77		600

I want sas to obtain the last known value if the next known is missing. This should take place within the same id so that my tables looks like this at the end

id	date	var1	var2	var3
3	06.01.81	2.5	50	40
3	15.11.88	2.5	20	50
3	18.08.89	2.5	18	60
3	11.12.90	10.78	190	70
10	06.10.90	.	350	80
10	16.10.98	.	184	90
10	17.10.95	1.65	79	100
10	18.10.85	1.65	367	110
10	19.10.66	1.65	500	110
10	20.10.98	1.65	497	110
10	21.10.77	1.65	600	110

any help?

Tom · Posted 03-27-2021 10:45 AM

Take advantage of how the UPDATE statement handles transactions with missing values.

Use an empty version of your dataset as the original data to be updated and treat all of your data as transactions. Include an OUTPUT statement so that all observations per BY group are written instead of just the final observation in the group.

data want;
  update have(obs=0) have;
  by id;
  output;
run;

2021-03-27 12:10 - fixed typo

sbxkoenk · Posted 03-27-2021 10:55 AM

Hello,

I guess @Tom wanted to put an UPDATE statement instead of a SET statement, like:

data want;
  update have(obs=0) have;
  by id;
  output;
run;

Here's an alternative way of doing it. Using PROC EXPAND (SAS/ETS):

data in0;
input id date $ var1 var2 var3;
cards;
3	06.01.81 2.5    50  40
3	15.11.88 .      20  50
3	18.08.89 .      18  60
3	11.12.90 10.78 190  70
10	06.10.90 .     350  80
10	16.10.98 .     184  90
10	17.10.95 1.65  79   100
10	18.10.85 .     367  110
10	19.10.66 .     500  .
10	20.10.98 .     497  .
10	21.10.77 .     600  .
;
run;

data in1; set in0; by id; if first.id then seqnr=0; seqnr+1; run;

proc expand data=in1
            out=out0
            METHOD=STEP
            EXTRAPOLATE;
  by id;
  id seqnr;
  convert var1-var3 / ;
run;
/* end of program */

Cheers,

Koen

Kurt_Bremser · Posted 03-28-2021 05:26 AM

data want;
set have;
by id;
retain
  _var1
;
if first.id then _var1 = .;
if var1 ne .
then _var1 = var1;
else var1 = _var1;
drop _var1;
run;

Add code for the other variables you want to carry forward; you may want to do that in a macro %DO loop.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

himofhimself · Posted 03-28-2021 11:58 AM

Some body help me as to why this won't work, its just populating only for one record.

data in0;
input id date $ var1 var2 var3;
cards;
3	06.01.81 2.5    50  40
3	15.11.88 .      20  50
3	18.08.89 .      18  60
3	11.12.90 10.78 190  70
10	06.10.90 .     350  80
10	16.10.98 .     184  90
10	17.10.95 1.65  79   100
10	18.10.85 .     367  110
10	19.10.66 .     500  .
10	20.10.98 .     497  .
10	21.10.77 .     600  .
;
run;

proc sort  data=in0;
  by id;
run; 

data in01;
set in0;
by id;
if first.id then call missing(var4,var5);
if not missing(var1) then var4=var1;
if not missing(var3) then var5=var3;
retain var4 var5 ;
if missing(var1) then var1=var4;
if missing(var3) then var3=var5;
run;

run;

Tom · Posted 03-28-2021 12:11 PM

You are just taking the previous value, not the previous non-missing value.

Instead only set the variables when the value is not missing and use RETAIN so they keep the value across observations.

Also what out for when starting a new ID group.

data in01;
  set in0;
  by id;
  if first.id then call missing(var4,var5);
  if not missing(var1) then var4=var1;
  if not missing(var3) then var5=var3;
  retain var4 var5 ;
  if missing(var1) then var1=var4;
  if missing(var3) then var3=var5;
run;

himofhimself · Posted 03-28-2021 12:14 PM

perfect , thanks @Tom

replacing a missing value with the last known value within a group of id's

replacing missings with last known value within a group

Re: replacing a missing value with the last known value within a group of id's

Re: replacing a missing value with the last known value within a group of id's

Re: replacing a missing value with the last known value within a group of id's

Re: replacing a missing value with the last known value within a group of id's

Re: replacing a missing value with the last known value within a group of id's

Re: replacing a missing value with the last known value within a group of id's

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away