BookmarkSubscribeRSS Feed
RandyStan
Fluorite | Level 6
Dear All:

My data is as follows:

ID       VAR_A
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 

I want my data to look as follow -- where the rule is that if the last observation is missing the missing value should 
be the previous non-missing observation. ID VAR_A 1 MF 1 MF 1 MF 2 HUF 2 2 HUF 2 HUF 3 3 3 4 4 I 4 I 4 I 4 NRI 4 NRI
3 REPLIES 3
jimbarbour
Meteorite | Level 14

@RandyStan,

 

I think if you retain the most recent non-missing Var_A for each ID, you can accomplish what you're asking for fairly easily.  See sample code, below.  However, note that if all of the Var_A's are missing (as in ID 3), then the last occurrence of the ID will still be missing; there's not much we can do there.

 

Jim

DATA	Unaltered_Data;
	INFILE	Datalines	MISSOVER;
	INPUT	ID		$
			VAR_A	$
			;
DATALINES;
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 
;
RUN;

DATA	Non_Missing_Last;
	DROP	_:;
	RETAIN	_Prior_Not_Missing '       ';

	SET	Unaltered_Data;
		BY	ID	NOTSORTED;

	IF	FIRST.ID				THEN
		CALL	MISSING(_Prior_Not_Missing);

	IF	LAST.ID					AND
		MISSING(VAR_A)			THEN
		VAR_A				=	_Prior_Not_Missing;

	IF	NOT	MISSING(VAR_A)		THEN
		_Prior_Not_Missing	=	VAR_A;
RUN;

 

novinosrin
Tourmaline | Level 20

Merge fun:

 

DATA	have;
INFILE	Datalines	MISSOVER;
INPUT	ID	$	VAR_A	$;
DATALINES;
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 
;
RUN;

data want;
 merge have have(rename=(var_a=_var_a) where=(_var_a>' '));
 by id;
 if last.id and var_a=' ' then var_a=_var_a;
 drop _:;
run;

 
RichardDeVen
Barite | Level 11

You would use the LAG(VAR_A) under the conditions of missing VAR_A and last.id in a group of size > 1.

 

The logic evaluating such a condition can be encapsulated in a single functional expression.

 

Example:

data want;
  set have;
  by  id;

  var_a = coalesceC (var_a, ifc (last.id and not first.id, lag(var_a), ''));
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 645 views
  • 2 likes
  • 4 in conversation