BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ML86
Obsidian | Level 7

Dear community,

 

I have a problem and I hope you can help me. Image I have the following table

 

id   Var1   Var2   Var3   Var4   Var5   Var6

1   10       .         .           0.25   .         .     

1   10      .          .           0.25  .         .

1   .       11         .           .        0.3     .

1   .       11        .           .         0.3    .

1   .      .           41        .        .        0.03

1   .     .           41        .        .        0.03

1  .      .          41        .        .         0.03

2   46  .         .          0.02     .       .

3   28 .        .           0.03     .        .

3   .              29        .          .        0.02

 

I want to replace the missing values (.) with the previous or next values by considering the id-variable. That means I want to have the following table

 

id   Var1   Var2   Var3   Var4   Var5   Var6

1   10      11        41         0.25    0.3       0.03 

1   10      11       41         0.25     0.3      0.03

1   10      11       41         0.25    0.3       0.03

1   10      11       41         0.25    0.3       0.03

1   10      11       41        0.25    0.3        0.03

1   10      11       41        0.25    0.3        0.03

1   10     11       41        0.25     0.3        0.03

2   46     11       41        0.02      .           .

3   28        .      29        0.03      .          0.02

3   28        .     29        0.03        .        0.02

 

How I can I do this?

 

Thank you very much and kind regards

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

or use an UPDATE by ID to fill, and loop the filled for the count of occurrences of ID-


data have;
input id   Var1   Var2   Var3   Var4   Var5   Var6;
cards;
1   10       .         .           0.25   .         .     

1   10      .          .           0.25  .         .

1   .       11         .           .        0.3     .

1   .       11        .           .         0.3    .

1   .      .           41        .        .        0.03

1   .     .           41        .        .        0.03

1  .      .          41        .        .         0.03

2   46  .         .          0.02     .       .

3   28 .        .           0.03     .        .

3   .     .         29        .          .        0.02
;

data want ;
  do _n_ = 1 by 1 until (last.id ) ;
    update have(obs=0) have ;
    by id ;
  end ;
  do _n_ = 1 to _n_ ;
    output ;
  end ;
run ;

 

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

From the data you show, a solution which uses the mean value in place of the missings ought to work, and is very simple to do using PROC STDIZE. Would that work for you?

--
Paige Miller
novinosrin
Tourmaline | Level 20

or use an UPDATE by ID to fill, and loop the filled for the count of occurrences of ID-


data have;
input id   Var1   Var2   Var3   Var4   Var5   Var6;
cards;
1   10       .         .           0.25   .         .     

1   10      .          .           0.25  .         .

1   .       11         .           .        0.3     .

1   .       11        .           .         0.3    .

1   .      .           41        .        .        0.03

1   .     .           41        .        .        0.03

1  .      .          41        .        .         0.03

2   46  .         .          0.02     .       .

3   28 .        .           0.03     .        .

3   .     .         29        .          .        0.02
;

data want ;
  do _n_ = 1 by 1 until (last.id ) ;
    update have(obs=0) have ;
    by id ;
  end ;
  do _n_ = 1 to _n_ ;
    output ;
  end ;
run ;

 

mkeintz
PROC Star

@novinosrin 

 

I like the concept of using UPDATE, followed by a do loop of SET/OUTPUT statements.  But it can be cleaned a bit:

 

data want ;
  update have(obs=0) have ;
  by id ;

  if last.id then do until (last.id);
    set have (keep=id);
    by id;
    output;
  end ;
run ;

 

I should add that this provides a way to specify a variable list in the "keep=" parameter that allows original values to prevail.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

You example seems to result in all observations in a group being exactly the same.

What do you want to happen when the non-missing values are not all the same?

 

Is there another variable that can be used to insure that the observations are in the right order?  A sequence number or date perhaps?

Ksharp
Super User

If your these variables have the same value within the same group ID .

 

proc sql;
create table want(drop=dummy) as
select id,var1 as dummy,
max(var1) as var1 ,
max(var2) as var2 ,
max(var3) as var3 ,
max(var4) as var4 ,
max(var5) as var5 
 from have 
  group by id;
quit;
ML86
Obsidian | Level 7

Thank you very much for the suggested solutions.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2625 views
  • 4 likes
  • 6 in conversation