BookmarkSubscribeRSS Feed
RandyStan
Fluorite | Level 6
Dear All:

My data is as follows:

ID       VAR_A
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 

I want my data to look as follow -- where the rule is that if the last observation is missing the missing value should 
be the previous non-missing observation. ID VAR_A 1 MF 1 MF 1 MF 2 HUF 2 2 HUF 2 HUF 3 3 3 4 4 I 4 I 4 I 4 NRI 4 NRI
3 REPLIES 3
jimbarbour
Meteorite | Level 14

@RandyStan,

 

I think if you retain the most recent non-missing Var_A for each ID, you can accomplish what you're asking for fairly easily.  See sample code, below.  However, note that if all of the Var_A's are missing (as in ID 3), then the last occurrence of the ID will still be missing; there's not much we can do there.

 

Jim

DATA	Unaltered_Data;
	INFILE	Datalines	MISSOVER;
	INPUT	ID		$
			VAR_A	$
			;
DATALINES;
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 
;
RUN;

DATA	Non_Missing_Last;
	DROP	_:;
	RETAIN	_Prior_Not_Missing '       ';

	SET	Unaltered_Data;
		BY	ID	NOTSORTED;

	IF	FIRST.ID				THEN
		CALL	MISSING(_Prior_Not_Missing);

	IF	LAST.ID					AND
		MISSING(VAR_A)			THEN
		VAR_A				=	_Prior_Not_Missing;

	IF	NOT	MISSING(VAR_A)		THEN
		_Prior_Not_Missing	=	VAR_A;
RUN;

 

novinosrin
Tourmaline | Level 20

Merge fun:

 

DATA	have;
INFILE	Datalines	MISSOVER;
INPUT	ID	$	VAR_A	$;
DATALINES;
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 
;
RUN;

data want;
 merge have have(rename=(var_a=_var_a) where=(_var_a>' '));
 by id;
 if last.id and var_a=' ' then var_a=_var_a;
 drop _:;
run;

 
RichardDeVen
Barite | Level 11

You would use the LAG(VAR_A) under the conditions of missing VAR_A and last.id in a group of size > 1.

 

The logic evaluating such a condition can be encapsulated in a single functional expression.

 

Example:

data want;
  set have;
  by  id;

  var_a = coalesceC (var_a, ifc (last.id and not first.id, lag(var_a), ''));
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 971 views
  • 2 likes
  • 4 in conversation