BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ML86
Obsidian | Level 7

Dear community,

 

I have a problem and I hope you can help me. Image I have the following table

 

id   Var1   Var2   Var3   Var4   Var5   Var6

1   10       .         .           0.25   .         .     

1   10      .          .           0.25  .         .

1   .       11         .           .        0.3     .

1   .       11        .           .         0.3    .

1   .      .           41        .        .        0.03

1   .     .           41        .        .        0.03

1  .      .          41        .        .         0.03

2   46  .         .          0.02     .       .

3   28 .        .           0.03     .        .

3   .              29        .          .        0.02

 

I want to replace the missing values (.) with the previous or next values by considering the id-variable. That means I want to have the following table

 

id   Var1   Var2   Var3   Var4   Var5   Var6

1   10      11        41         0.25    0.3       0.03 

1   10      11       41         0.25     0.3      0.03

1   10      11       41         0.25    0.3       0.03

1   10      11       41         0.25    0.3       0.03

1   10      11       41        0.25    0.3        0.03

1   10      11       41        0.25    0.3        0.03

1   10     11       41        0.25     0.3        0.03

2   46     11       41        0.02      .           .

3   28        .      29        0.03      .          0.02

3   28        .     29        0.03        .        0.02

 

How I can I do this?

 

Thank you very much and kind regards

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

or use an UPDATE by ID to fill, and loop the filled for the count of occurrences of ID-


data have;
input id   Var1   Var2   Var3   Var4   Var5   Var6;
cards;
1   10       .         .           0.25   .         .     

1   10      .          .           0.25  .         .

1   .       11         .           .        0.3     .

1   .       11        .           .         0.3    .

1   .      .           41        .        .        0.03

1   .     .           41        .        .        0.03

1  .      .          41        .        .         0.03

2   46  .         .          0.02     .       .

3   28 .        .           0.03     .        .

3   .     .         29        .          .        0.02
;

data want ;
  do _n_ = 1 by 1 until (last.id ) ;
    update have(obs=0) have ;
    by id ;
  end ;
  do _n_ = 1 to _n_ ;
    output ;
  end ;
run ;

 

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

From the data you show, a solution which uses the mean value in place of the missings ought to work, and is very simple to do using PROC STDIZE. Would that work for you?

--
Paige Miller
novinosrin
Tourmaline | Level 20

or use an UPDATE by ID to fill, and loop the filled for the count of occurrences of ID-


data have;
input id   Var1   Var2   Var3   Var4   Var5   Var6;
cards;
1   10       .         .           0.25   .         .     

1   10      .          .           0.25  .         .

1   .       11         .           .        0.3     .

1   .       11        .           .         0.3    .

1   .      .           41        .        .        0.03

1   .     .           41        .        .        0.03

1  .      .          41        .        .         0.03

2   46  .         .          0.02     .       .

3   28 .        .           0.03     .        .

3   .     .         29        .          .        0.02
;

data want ;
  do _n_ = 1 by 1 until (last.id ) ;
    update have(obs=0) have ;
    by id ;
  end ;
  do _n_ = 1 to _n_ ;
    output ;
  end ;
run ;

 

mkeintz
PROC Star

@novinosrin 

 

I like the concept of using UPDATE, followed by a do loop of SET/OUTPUT statements.  But it can be cleaned a bit:

 

data want ;
  update have(obs=0) have ;
  by id ;

  if last.id then do until (last.id);
    set have (keep=id);
    by id;
    output;
  end ;
run ;

 

I should add that this provides a way to specify a variable list in the "keep=" parameter that allows original values to prevail.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

You example seems to result in all observations in a group being exactly the same.

What do you want to happen when the non-missing values are not all the same?

 

Is there another variable that can be used to insure that the observations are in the right order?  A sequence number or date perhaps?

Ksharp
Super User

If your these variables have the same value within the same group ID .

 

proc sql;
create table want(drop=dummy) as
select id,var1 as dummy,
max(var1) as var1 ,
max(var2) as var2 ,
max(var3) as var3 ,
max(var4) as var4 ,
max(var5) as var5 
 from have 
  group by id;
quit;
ML86
Obsidian | Level 7

Thank you very much for the suggested solutions.

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2000 views
  • 4 likes
  • 6 in conversation