draeko
Fluorite | Level 6

I have a dataset ordered by decile bin. Some of the decile bin values are missing. I know how to replace a missing value with the previous one, but I do not know how to replace a missing value with the next non-missing value within the same group.

 

Could anyone give me a suggestion?

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

You said: "I know how to replace a missing value by the previous one".

 

Sort your file by descending ID.

Replace missing value by "previous" one.

Sort back by ascending ID.


8 REPLIES 8
mkeintz
PROC Star

Please show a sample of the data you have, and the data you want.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
draeko
Fluorite | Level 6

Thank you,

 

This is my input:

 

ID    Score    bin
1     1        .
2     1        0
3     1        0
4     1        0
5     1        0
6     1        0
7     1        .
8     2        .
9     2        1
10    2        1
11    2        1
12    2        .

 

I want my output to be:

 

ID    Score    bin
1     1        0
2     1        0
3     1        0
4     1        0
5     1        0
6     1        0
7     1        0
8     2        1
9     2        1
10    2        1
11    2        1
12    2        1

draeko
Fluorite | Level 6
I have already sorted this dataset.
Shmuel
Garnet | Level 18

You said: "I know how to replace a missing value by the previous one".

 

Sort your file by descending ID.

Replace missing value by "previous" one.

Sort back by ascending ID.
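
The three steps above could be sketched roughly as follows. This is only an illustration, assuming a data set named HAVE with the variables ID, Score and bin from the sample posted earlier in the thread; the names REV, FILLED, WANT_NEXT and LAST_BIN are made up for this sketch and do not come from the original posts.

```sas
/* Step 1: reverse the order so that the "next" value becomes the "previous" one. */
proc sort data=have out=rev;
  by descending ID;
run;

/* Step 2: carry the last non-missing bin forward within each Score group.        */
/* NOTSORTED is used because Score is grouped but no longer in ascending order.   */
data filled;
  set rev;
  by Score notsorted;
  retain last_bin;                          /* hypothetical helper variable */
  if first.Score then last_bin = .;         /* do not carry values across groups */
  if not missing(bin) then last_bin = bin;  /* remember the latest valid bin */
  else bin = last_bin;                      /* fill the gap from the reversed "previous" row */
  drop last_bin;
run;

/* Step 3: restore the original ascending order. */
proc sort data=filled out=want_next;
  by ID;
run;
```

Note that this sketch only fills a missing bin from a later row in the same Score group. A missing value at the very end of a group (IDs 7 and 12 in the sample) has no later row to borrow from and would stay missing, so the requested output also needs a carry-forward pass in the original order.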

draeko
Fluorite | Level 6

Thank you for your reply!

But I do not understand what you mean by descending ID. It is already in descending ID.

 

 

mkeintz
PROC Star

Your description of the problem does not match the sample data and output you provide. Apparently, for each SCORE group, you want to replace missing BIN values with:

   1. The next valid BIN value within the SCORE group, and
   2. The last valid BIN value carried forward, for missing BIN values at the end of a SCORE group.

 

 

data have;
  input ID Score bin;
datalines;
1   1   .
2   1   0
3   1   0
4   1   0
5   1   0
6   1   0
7   1   .
8   2   .
9   2   1
10  2   1
11  2   1
12  2   .
;
run;

data want (drop=n i validbin);
  /* First pass: read records until a non-missing BIN or the end of the SCORE group. */
  /* N counts the records read in this pass. */
  do N=1 by 1 until (last.score or bin^=.);
    set have;
    by score;
    retain validbin;
    if first.score then validbin=bin;
    else validbin=coalesce(bin,validbin);  /* Keep the latest non-missing BIN */
  end;

  /* Second pass: re-read the same N records, assign VALIDBIN, and write each one */
  do I=1 to N;
    set have;
    bin=validbin;
    output;  /* Without an explicit OUTPUT, only the last record of each pass is written */
  end;
run;

 

 

Ksharp
Super User

What if there is a series of consecutive missing values? What would you do then?

 

 

data have;
  input ID Score bin;
datalines;
1   1   .
2   1   0
3   1   0
4   1   0
5   1   0
6   1   0
7   1   .
8   2   .
9   2   1
10  2   1
11  2   1
12  2   .
;
run;

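/* Step 1: carry the last non-missing BIN forward within each SCORE group */
data temp;
  set have;
  by score notsorted;
  retain _bin;
  if first.score then call missing(_bin);
  if not missing(bin) then _bin=bin;
  drop bin;
run;

/* Step 2: reverse the order so that "next" values become "previous" values */
proc sort data=temp;
  by descending id;
run;

/* Step 3: carry forward again on the reversed data to fill the leading gaps */
data want;
  set temp;
  by score notsorted;
  retain bin;
  if first.score then call missing(bin);
  if not missing(_bin) then bin=_bin;
  drop _bin;
run;

/* Step 4: restore the original order */
proc sort data=want;
  by id;
run;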
draeko
Fluorite | Level 6
Thank you for your answer!
