Pyrite | Level 9

## Fill missing values with the previous values

Hello,

I have a dataset that looks like this:

 SEX COUNTRY PERIOD X1 X2 MALE A 2000 2 3 2 2 2 2 FEMALE 1 5 4 1 2 3 MALE B 1 3 4 3 3 4 FEMALE 5 3 4 1 4 1 MALE C 2 2 … … … Z 2050 6 4

I want to fill the missing value of SEX COUNTRY and PERIOD with the previous value, until it changes. So it should look like this:

 SEX COUNTRY PERIOD X1 X2 MALE A 2000 2 3 MALE A 2000 2 2 MALE A 2000 2 2 FEMALE A 2000 1 5 FEMALE A 2000 4 1 FEMALE A 2000 2 3 MALE B 2000 1 3 MALE B 2000 4 3 MALE B 2000 3 4 FEMALE B 2000 5 3 FEMALE B 2000 4 1 FEMALE B 2000 4 1 MALE C 2000 2 2 … … … Z 2050

I'm not really sure how to do this.

1 ACCEPTED SOLUTION

Accepted Solutions
Pyrite | Level 9

## Re: Fill missing values with the previous values

NVM, I found the solution

data want;
set have;
retain _period;
if not missing(period) then _period=period;
else period=_period;
drop _period;

retain _sex;
if not missing(sex) then _sex=sex;
else sex=_sex;
drop _sex;

retain _country;
if not missing(country) then _country=country;
else country=_country;
drop _country;
run;

22 REPLIES 22
Pyrite | Level 9

## Re: Fill missing values with the previous values

NVM, I found the solution

data want;
set have;
retain _period;
if not missing(period) then _period=period;
else period=_period;
drop _period;

retain _sex;
if not missing(sex) then _sex=sex;
else sex=_sex;
drop _sex;

retain _country;
if not missing(country) then _country=country;
else country=_country;
drop _country;
run;

Super User

## Re: Fill missing values with the previous values

You see that you write a piece of code repeatedly. This is a case for using a macro to reduce your work:

``````%macro process(variable);
retain help_&variable;
if not missing(&variable)
then _&variable = &variable;
else &variable = _&variable;
drop _&variable;
%mend;

data want;
set have;
%process(sex)
%process(country)
%process(period)
run;``````
Diamond | Level 26

## Re: Fill missing values with the previous values

Or use an array and loop over the array rather than bit by bit.

Lapis Lazuli | Level 10

## Re: Fill missing values with the previous values

Hello Kurt,

I succeeded to replace missing values by the previous values.

but only on the column "year_2017" .

my dataset has many columns ... year_2013, year_2014,year_2015,.......year_2018.

how could I apply this code below for each column year_xxxx

retain _an_2017 ;

if not missing(an_2017) then _an_2017 = an_2017 ; else an_2017 = _an_2017 ;

drop _an_2017 ;

regards

Nasser

Super User

## Re: Fill missing values with the previous values

@Nasser_DRMCP wrote:

Hello Kurt,

I succeeded to replace missing values by the previous values.

but only on the column "year_2017" .

my dataset has many columns ... year_2013, year_2014,year_2015,.......year_2018.

how could I apply this code below for each column year_xxxx

retain _an_2017 ;

if not missing(an_2017) then _an_2017 = an_2017 ; else an_2017 = _an_2017 ;

drop _an_2017 ;

regards

Nasser

Erroneous posting in the wrong topic?

Quartz | Level 8

## Re: Fill missing values with the previous values

Hello,

What changes would I make to the DATA step below to get  the output  appended below?

Thanks,

```data test;
infile datalines DSD  missover;
input name:\$13. School: \$30.;
datalines;
,,Pinecrest Elementary School
Craig Smith,
Paula Johnson,
Chung Wang,
;
proc print; run;

data test1;
set test;
retain _school;
if not missing(school) then _school=school;
else school = _school;
drop _school;
run;
proc print data=test1;
run;```

(Desired output, not generated by the above code blocks)

Obs name School

1 Craig Smith Pinecrest Elementary School
2 Paula Johnson Pinecrest Elementary School
3 Chung Wang Pinecrest Elementary School

Super User

## Re: Fill missing values with the previous values

If you want to fix the first step to generate that output without the need for the second step then this is classic example of reading in a report where some "header" information is not repeated.  You just need to read the header information once and remember it. Much like the second step is doing with data coming from a dataset instead of a text file.

``````data test;
infile datalines dsd truncover ;
length name \$13 _school school \$30 ;
retain school;
input name _school;
if not missing(_school) then do;
school=_school;
delete;
end;
drop _school ;
datalines;
,Pinecrest Elementary School
Craig Smith,
Paula Johnson,
Chung Wang,
;``````

PS I fixed the first line of data so that it has only TWO values and not the THREE it had it what you posted by removing the extra comma.

Quartz | Level 8

## Re: Fill missing values with the previous values

Thanks, Tom. Your code has worked.  However, I am trying to understand the usage of the DELETE statement within the IF-THEN-DO-END block? Is this statement telling SAS to delete the observations with missing values?

Super User

## Re: Fill missing values with the previous values

In you example data you did not want to write out that observation.  The DELETE statement will end the iteration of the data step right there.  Since there is no explicit OUTPUT statement there is an implicit OUTPUT done when the step reaches the end of an iteration, so with the DELETE statement that implicit OUTPUT never happens.

If your data was different such that you did want to write out that observation then just don't include the DELETE statement.

A common example would be a report were the first column is used to group the data so it is only present on the first observation in a group.  You wouldn't want to delete that first observation.  But if you had a report where the group name is on its own line, like your example , then you don't want to include that line.

Quartz | Level 8

## Re: Fill missing values with the previous values

Hi Tom. Thanks for the clarification.

Obsidian | Level 7

## Re: Fill missing values with the previous values

Love this!
Diamond | Level 26

## Re: Fill missing values with the previous values

A more important question would be why the "data" is like that in the first place.  It looks a bit like your reading in a produced report - not a recommended approach for multiple reasons (populations, calculations, assumptions etc.).  Get the real "data" and use that.

Pyrite | Level 9

## Re: Fill missing values with the previous values

I copied/pasted in Excel the output of a proc tabulate and then imported it in SAS. When I tried the out statement, it gaves me only one column for variables X1 X2. I suppose this was probably not the most efficient way.

Diamond | Level 26

## Re: Fill missing values with the previous values

No, its really not.  Use out= from tabulate, then manipulate the output dataset to get what you want.  Alternative, use one of the other procedures (means/summary/SQL aggregates) or one of the other hundreds of ways to get output you want.  For example, your output doesn't make any sense to me:

 SEX COUNTRY PERIOD X1 X2 MALE A 2000 2 3 2 2 2 2

Why are there three rows, there is no identifer on each row to say what the rows reflect other then the first one?

Discussion stats
• 22 replies
• 82948 views
• 16 likes
• 9 in conversation