BookmarkSubscribeRSS Feed
RandyStan
Fluorite | Level 6
Dear All:

My data is as follows:

ID       VAR_A
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 

I want my data to look as follow -- where the rule is that if the last observation is missing the missing value should 
be the previous non-missing observation. ID VAR_A 1 MF 1 MF 1 MF 2 HUF 2 2 HUF 2 HUF 3 3 3 4 4 I 4 I 4 I 4 NRI 4 NRI
3 REPLIES 3
jimbarbour
Meteorite | Level 14

@RandyStan,

 

I think if you retain the most recent non-missing Var_A for each ID, you can accomplish what you're asking for fairly easily.  See sample code, below.  However, note that if all of the Var_A's are missing (as in ID 3), then the last occurrence of the ID will still be missing; there's not much we can do there.

 

Jim

DATA	Unaltered_Data;
	INFILE	Datalines	MISSOVER;
	INPUT	ID		$
			VAR_A	$
			;
DATALINES;
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 
;
RUN;

DATA	Non_Missing_Last;
	DROP	_:;
	RETAIN	_Prior_Not_Missing '       ';

	SET	Unaltered_Data;
		BY	ID	NOTSORTED;

	IF	FIRST.ID				THEN
		CALL	MISSING(_Prior_Not_Missing);

	IF	LAST.ID					AND
		MISSING(VAR_A)			THEN
		VAR_A				=	_Prior_Not_Missing;

	IF	NOT	MISSING(VAR_A)		THEN
		_Prior_Not_Missing	=	VAR_A;
RUN;

 

novinosrin
Tourmaline | Level 20

Merge fun:

 

DATA	have;
INFILE	Datalines	MISSOVER;
INPUT	ID	$	VAR_A	$;
DATALINES;
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 
;
RUN;

data want;
 merge have have(rename=(var_a=_var_a) where=(_var_a>' '));
 by id;
 if last.id and var_a=' ' then var_a=_var_a;
 drop _:;
run;

 
RichardDeVen
Barite | Level 11

You would use the LAG(VAR_A) under the conditions of missing VAR_A and last.id in a group of size > 1.

 

The logic evaluating such a condition can be encapsulated in a single functional expression.

 

Example:

data want;
  set have;
  by  id;

  var_a = coalesceC (var_a, ifc (last.id and not first.id, lag(var_a), ''));
run;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 754 views
  • 2 likes
  • 4 in conversation