Solved: Re: replacing missing value by the next one

draeko · Posted 04-30-2017 05:36 PM

I have a dataset ordered by decile bin. Some of the decile bin values are missing. I know how to replace a missing value with the previous one, but I do not know how to replace a missing value with the next non-missing value within the same group.

Could anyone give me a suggestion?

Thank you!

mkeintz · Posted 04-30-2017 07:14 PM

Please show a sample of the data you have, and the data you want.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

draeko · Posted 04-30-2017 07:38 PM

Thank you,

This is my input:

ID  Score  bin
1   1      .
2   1      0
3   1      0
4   1      0
5   1      0
6   1      0
7   1      .
8   2      .
9   2      1
10  2      1
11  2      1
12  2      .

I want my output to be:

ID  Score  bin
1   1      0
2   1      0
3   1      0
4   1      0
5   1      0
6   1      0
7   1      0
8   2      1
9   2      1
10  2      1
11  2      1
12  2      1

draeko · Posted 04-30-2017 07:43 PM

I have already sorted this dataset.

Shmuel · Posted 04-30-2017 08:17 PM

You said: "I know how to replace a missing value by the previous one".

Sort your file by descending ID.

Replace missing value by "previous" one.

Sort back by ascending ID.
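
The three steps above could be sketched roughly as follows. This is only an illustration, assuming a dataset HAVE with variables ID, SCORE and BIN as in the sample data posted in this thread; the names REV, FILLED and _LAST are made up for the example, and the BY SCORE grouping is an added assumption to keep the fill "within the same group".

```sas
/* Sketch only: assumes dataset HAVE with variables ID, SCORE, BIN.    */
/* REV, FILLED and _LAST are hypothetical names used for illustration. */

/* Step 1: reverse the order so the "next" row becomes the "previous" row */
proc sort data=have out=rev;
  by descending id;
run;

/* Step 2: carry the last non-missing BIN forward within each SCORE group */
data filled (drop=_last);
  set rev;
  by score notsorted;              /* groups are contiguous but now in descending order */
  retain _last;
  if first.score then _last = .;   /* do not borrow values across groups */
  if missing(bin) then bin = _last;
  else _last = bin;
run;

/* Step 3: restore the original ascending order */
proc sort data=filled out=want;
  by id;
run;
```

On the sample data this fills IDs 1 and 8 from the next valid value, but IDs 7 and 12 stay missing because no later value exists in their group. Filling those would need an additional ordinary carry-forward pass.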

draeko · Posted 04-30-2017 08:30 PM

Thank you for your reply!

But I do not understand what you mean by descending ID. It is already in descending ID.

mkeintz · Posted 04-30-2017 10:23 PM

Your description of the problem does not match the sample data and output you provide. Apparently, for each SCORE group, you want to replace missing BIN values with:

1. The next valid BIN value within the SCORE group, but also
2. The last valid BIN value carried forward, for missing BIN values at the end of a SCORE group.

data have;
  input ID Score bin;
datalines;
1  1 .
2  1 0
3  1 0
4  1 0
5  1 0
6  1 0
7  1 .
8  2 .
9  2 1
10 2 1
11 2 1
12 2 .
;
run;

data want (drop=n i validbin);
  /* Set N to # of records until a non-missing BIN or end of score group */
  do N=1 by 1 until (last.score or bin^=.);
    set have ;
    by score;
    retain validbin;
    if first.score then validbin=bin;
    else validbin=coalesce(bin,validbin);  /*Keep the latest non-missing BIN */
  end;

  /* Re-read the same N records, assign validbin, and write each one out */
  do I=1 to N;
    set have;
    bin=validbin;
    output;  /* explicit OUTPUT: otherwise only the last re-read record is written */
  end;
run;

Ksharp · Posted 04-30-2017 11:49 PM

What if there is a series of consecutive missing values? What would you do then?

data have;
  input ID Score bin;
datalines;
1  1 .
2  1 0
3  1 0
4  1 0
5  1 0
6  1 0
7  1 .
8  2 .
9  2 1
10 2 1
11 2 1
12 2 .
;
run;

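/* Pass 1: carry the last non-missing BIN forward within each SCORE group */
data temp;
 set have;
 by score notsorted;
 retain _bin;
 if first.score then call missing(_bin);
 if not missing(bin) then _bin=bin;
 drop bin;
run;

/* Reverse the order so that the next row becomes the previous row */
proc sort data=temp;by descending id;run;

/* Pass 2: carry forward again in reverse order to fill leading missing values */
data want;
 set temp;
 by score notsorted;
 retain bin;
 if first.score then call missing(bin);
 if not missing(_bin) then bin=_bin;
 drop _bin;
run;

/* Restore the original order */
proc sort data=want;by id;run;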
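
To try the question about a run of consecutive missing values, a small test dataset along these lines could be fed to the program. The dataset name HAVE2 and its values are invented for illustration and do not come from the original poster's data.

```sas
/* Hypothetical test data: runs of consecutive missing BIN values        */
/* at the start of a group, at the end of a group, and across a group.   */
data have2;
  input ID Score bin;
datalines;
1 1 .
2 1 .
3 1 0
4 1 .
5 1 .
6 2 .
7 2 .
8 2 .
;
run;

/* Inspect the test data before running a fill program on it */
proc print data=have2 noobs;
run;
```

To use it, change `set have` to `set have2` in the program being tested. Under the two rules described earlier in the thread, IDs 1 and 2 would receive 0 (the next valid value), IDs 4 and 5 would also receive 0 (carried forward), and the SCORE=2 group would stay missing because it has no valid value to borrow.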

draeko · Posted 05-22-2017 11:31 AM

Thank you for your answer!
