BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ML86
Obsidian | Level 7

Dear community,

 

I have a problem and I hope you can help me. Image I have the following table

 

id   Var1   Var2   Var3   Var4   Var5   Var6

1   10       .         .           0.25   .         .     

1   10      .          .           0.25  .         .

1   .       11         .           .        0.3     .

1   .       11        .           .         0.3    .

1   .      .           41        .        .        0.03

1   .     .           41        .        .        0.03

1  .      .          41        .        .         0.03

2   46  .         .          0.02     .       .

3   28 .        .           0.03     .        .

3   .              29        .          .        0.02

 

I want to replace the missing values (.) with the previous or next values by considering the id-variable. That means I want to have the following table

 

id   Var1   Var2   Var3   Var4   Var5   Var6

1   10      11        41         0.25    0.3       0.03 

1   10      11       41         0.25     0.3      0.03

1   10      11       41         0.25    0.3       0.03

1   10      11       41         0.25    0.3       0.03

1   10      11       41        0.25    0.3        0.03

1   10      11       41        0.25    0.3        0.03

1   10     11       41        0.25     0.3        0.03

2   46     11       41        0.02      .           .

3   28        .      29        0.03      .          0.02

3   28        .     29        0.03        .        0.02

 

How I can I do this?

 

Thank you very much and kind regards

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

or use an UPDATE by ID to fill, and loop the filled for the count of occurrences of ID-


data have;
input id   Var1   Var2   Var3   Var4   Var5   Var6;
cards;
1   10       .         .           0.25   .         .     

1   10      .          .           0.25  .         .

1   .       11         .           .        0.3     .

1   .       11        .           .         0.3    .

1   .      .           41        .        .        0.03

1   .     .           41        .        .        0.03

1  .      .          41        .        .         0.03

2   46  .         .          0.02     .       .

3   28 .        .           0.03     .        .

3   .     .         29        .          .        0.02
;

data want ;
  do _n_ = 1 by 1 until (last.id ) ;
    update have(obs=0) have ;
    by id ;
  end ;
  do _n_ = 1 to _n_ ;
    output ;
  end ;
run ;

 

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

From the data you show, a solution which uses the mean value in place of the missings ought to work, and is very simple to do using PROC STDIZE. Would that work for you?

--
Paige Miller
novinosrin
Tourmaline | Level 20

or use an UPDATE by ID to fill, and loop the filled for the count of occurrences of ID-


data have;
input id   Var1   Var2   Var3   Var4   Var5   Var6;
cards;
1   10       .         .           0.25   .         .     

1   10      .          .           0.25  .         .

1   .       11         .           .        0.3     .

1   .       11        .           .         0.3    .

1   .      .           41        .        .        0.03

1   .     .           41        .        .        0.03

1  .      .          41        .        .         0.03

2   46  .         .          0.02     .       .

3   28 .        .           0.03     .        .

3   .     .         29        .          .        0.02
;

data want ;
  do _n_ = 1 by 1 until (last.id ) ;
    update have(obs=0) have ;
    by id ;
  end ;
  do _n_ = 1 to _n_ ;
    output ;
  end ;
run ;

 

mkeintz
PROC Star

@novinosrin 

 

I like the concept of using UPDATE, followed by a do loop of SET/OUTPUT statements.  But it can be cleaned a bit:

 

data want ;
  update have(obs=0) have ;
  by id ;

  if last.id then do until (last.id);
    set have (keep=id);
    by id;
    output;
  end ;
run ;

 

I should add that this provides a way to specify a variable list in the "keep=" parameter that allows original values to prevail.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

You example seems to result in all observations in a group being exactly the same.

What do you want to happen when the non-missing values are not all the same?

 

Is there another variable that can be used to insure that the observations are in the right order?  A sequence number or date perhaps?

Ksharp
Super User

If your these variables have the same value within the same group ID .

 

proc sql;
create table want(drop=dummy) as
select id,var1 as dummy,
max(var1) as var1 ,
max(var2) as var2 ,
max(var3) as var3 ,
max(var4) as var4 ,
max(var5) as var5 
 from have 
  group by id;
quit;
ML86
Obsidian | Level 7

Thank you very much for the suggested solutions.

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2132 views
  • 4 likes
  • 6 in conversation