Filling empty cells with valued from cell above

Accepted Solution Solved
Reply
Regular Contributor
Posts: 182
Accepted Solution

Filling empty cells with valued from cell above

Hi everyone

I recently posted a question on filling cells with values from cells above 

https://communities.sas.com/t5/General-SAS-Programming/Filling-a-cell-with-the-value-above-by-group/...

this was kindly answered by both @Tom and @Jagadishkatam

 

I am reposting the question with a change...

 

I have a dataset with a column containing missing values.

I am trying to fill these missing values from the cells above provided that the two rows belong to the same group.

 

Here is the dataset:

 

data a;
input N	Group	Var;
datalines;

1	1	3
2	1	6
3	1	2
4	1	.
5	2	.
6	2	6
7	3	3
8	3	.
9	3	5
10	4	. 
11     3      . 
;
run;

The solusion suggested is 

data want ;
  set a ;
  by group ;
  if first.group then want=var;
  else want=coalesce(var,want);
  retain want;
run;

The problem is that this solution assumes that you can sort the database by group so that you can use the Data...Set...By and then first.group.

i can not sort the data by group because it will mess the order of the rows, and the whole idea of filling the cells with valuse above is based on the current order.

So, is there any other way of achieving the same thing, ie filling an empty cell with the value above it provided that they both belong to the same group

Thanks


Accepted Solutions
Solution
‎09-15-2017 11:35 PM
Super User
Posts: 6,934

Re: Filling empty cells with valued from cell above

Since your GROUP values contain integers, it should be possible to set up a temporary array holding the most recent VAR value for each group:

 

data want;

array groups {100} _temporary_;

set have;

if var=. then var = groups{group};

else groups{group}=var;

run;

 

This solution does assume that you have no more than 100 groups (that part is easily adjustable) and that GROUP is always populated.  That last part can be worked around pretty easily as well.

View solution in original post


All Replies
Super User
Posts: 6,934

Re: Filling empty cells with valued from cell above

Why not sort by GROUP, use the logic that you know works, then sort again by N to put the observations back in their original order?

 

Sorting by GROUP does not change the order of the observations within each GROUP.

Super User
Super User
Posts: 8,279

Re: Filling empty cells with valued from cell above

[ Edited ]

If the data is not sorted by GROUP then why are you using the BY statement?

If you want to treat the data as if it was sorted by GROUP, without actually sorting it by GROUP, then you could use the NOTSORTED option on the BY statement.  Then SAS won't complain if the values of GROUP are not monotonically increasing but it will still set the FIRST.GROUP flag varaible whenever GROUP changes.

 

Here is some data that shows the difference better than your example.

data a;
 N+1;
 input Group Var Expected;
datalines;
1 3 3
1 . 3
1 2 2
1 . 2
2 . .
2 6 6
2 . 6
1 . .
1 5 5
;

data want ;
  set a ;
  by group notsorted;
  if first.group then want=var;
  else want=coalesce(var,want);
  retain want;
run;

image.png

 

 

Note that this is not going to generate the same result as you would get if you sorted by GROUP and N , calculated the new varaible, and then sorted again by N to put the data back into the original order.  

 

For example in the last group of data above the first observation has a missing value for VAR so the new WANT variable is missing also since this is the first for the block.  But if you first sorted by GROUP then this record would get 2 instead since the earlier block of data with GROUP=1 had VAR=2 as the last non-missing value.

Solution
‎09-15-2017 11:35 PM
Super User
Posts: 6,934

Re: Filling empty cells with valued from cell above

Since your GROUP values contain integers, it should be possible to set up a temporary array holding the most recent VAR value for each group:

 

data want;

array groups {100} _temporary_;

set have;

if var=. then var = groups{group};

else groups{group}=var;

run;

 

This solution does assume that you have no more than 100 groups (that part is easily adjustable) and that GROUP is always populated.  That last part can be worked around pretty easily as well.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 203 views
  • 0 likes
  • 3 in conversation