BookmarkSubscribeRSS Feed
RandyStan
Fluorite | Level 6
Dear All:

My data is as follows:

ID       VAR_A
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 

I want my data to look as follow -- where the rule is that if the last observation is missing the missing value should 
be the previous non-missing observation. ID VAR_A 1 MF 1 MF 1 MF 2 HUF 2 2 HUF 2 HUF 3 3 3 4 4 I 4 I 4 I 4 NRI 4 NRI
3 REPLIES 3
jimbarbour
Meteorite | Level 14

@RandyStan,

 

I think if you retain the most recent non-missing Var_A for each ID, you can accomplish what you're asking for fairly easily.  See sample code, below.  However, note that if all of the Var_A's are missing (as in ID 3), then the last occurrence of the ID will still be missing; there's not much we can do there.

 

Jim

DATA	Unaltered_Data;
	INFILE	Datalines	MISSOVER;
	INPUT	ID		$
			VAR_A	$
			;
DATALINES;
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 
;
RUN;

DATA	Non_Missing_Last;
	DROP	_:;
	RETAIN	_Prior_Not_Missing '       ';

	SET	Unaltered_Data;
		BY	ID	NOTSORTED;

	IF	FIRST.ID				THEN
		CALL	MISSING(_Prior_Not_Missing);

	IF	LAST.ID					AND
		MISSING(VAR_A)			THEN
		VAR_A				=	_Prior_Not_Missing;

	IF	NOT	MISSING(VAR_A)		THEN
		_Prior_Not_Missing	=	VAR_A;
RUN;

 

novinosrin
Tourmaline | Level 20

Merge fun:

 

DATA	have;
INFILE	Datalines	MISSOVER;
INPUT	ID	$	VAR_A	$;
DATALINES;
1            MF
1            MF
1            MF
2            HUF
2           
2           HUF
2
3         
3
3
4
4           I
4           I
4           I
4           NRI
4 
;
RUN;

data want;
 merge have have(rename=(var_a=_var_a) where=(_var_a>' '));
 by id;
 if last.id and var_a=' ' then var_a=_var_a;
 drop _:;
run;

 
RichardDeVen
Barite | Level 11

You would use the LAG(VAR_A) under the conditions of missing VAR_A and last.id in a group of size > 1.

 

The logic evaluating such a condition can be encapsulated in a single functional expression.

 

Example:

data want;
  set have;
  by  id;

  var_a = coalesceC (var_a, ifc (last.id and not first.id, lag(var_a), ''));
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 546 views
  • 2 likes
  • 4 in conversation