Solved: Re: Finding the next value by group processing

mansour_ib_sas · Posted 05-29-2019 09:57 AM

Hello,

I have this data:

data test;
input id $ val;
cards;
A .
A .
A .
A 5
A .
A 6
B .
B .
B .
B 4
B .
B .
B 7
; run;

I want this result:

data test;
input id $ val;
cards;
A 5
A 5
A 5
A 5
A 6
A 6
A 6
B 4
B 4
B 4
B 4
B 7
B 7
B 7
; run;

How can I get this result?

Thank you for your help

33pedro · Posted 05-29-2019 10:04 AM

I am not quite sure I understand. There are more observations in the second list than in the first one.

If you want to carry a value of val across the missing entries until a new value appears, and then carry that new value onward, you could use the RETAIN statement, but it would require careful ordering of the data.

Without another variable besides id to distinguish the observations, it will be tricky to assign different values to rows with the same id.
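To make the RETAIN idea concrete, here is a minimal sketch, not a tested solution. It assumes the rows are already in the intended order, and it adds a sequence variable so the data can be reversed within each id, filled, and sorted back. The names test_seq, filled, seq, and next_val are made up for illustration.

```sas
/* Small sample in the same shape as the question's data */
data test;
   input id $ val;
   datalines;
A .
A .
A 5
A .
A 6
B .
B 4
;
run;

/* Record the original row order (seq is a hypothetical helper) */
data test_seq;
   set test;
   seq = _n_;
run;

/* Reverse the rows within each id so the "next" value is read first */
proc sort data=test_seq;
   by id descending seq;
run;

data filled;
   set test_seq;
   by id;
   retain next_val;
   if first.id then next_val = .;            /* never carry a value across ids */
   if not missing(val) then next_val = val;  /* remember the latest value seen */
   val = next_val;                           /* fill the current row */
   drop next_val;
run;

/* Restore the original order; seq stays in the data set */
proc sort data=filled;
   by id seq;
run;
```

The extra sorts are the cost of this route, which is why the ordering matters so much here.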

Astounding · Posted 05-29-2019 10:08 AM

Here's one approach (assuming you fix the number of observations in your example):

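data want;
/* Pass 1: read ahead to the next nonmissing VAL, or to the end of the ID group. */
do until (last.id or val > .) ;
   set have;   /* HAVE is the input data set (TEST in the question) */
   by id;
end;
/* Save the value found; it stays missing if the group ends first. */
replacement_val = val;
/* Pass 2: re-read the same observations and output each one. */
do until (last.id or val > .) ;
   set have;
   by id;
   output;
end;
/* The output keeps the saved value under the name VAL. */
drop val;
rename replacement_val = val;
run;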

The bottom loop reads the same observations as the top loop, but has an additional variable to work with as it outputs observations.

mansour_ib_sas · Posted 05-29-2019 10:40 AM

Thank you for your reply.
I can't visualize how the program processes the data, in particular how it changes the value of replacement_val at the right time.
I know that the condition (or val > .) plays a role.
Thanks again

Astounding · Posted 05-29-2019 01:35 PM

Here are some of the initial steps, to get you thinking about the process correctly.

The top loop starts reading in observations. It finds VAL is 5 on the fourth observation, so that's where the loop stops.
SAS copies the value of 5 into REPLACEMENT_VAL.
The bottom loop starts reading the same observations. Each SET statement acts independently of other SET statements, so the bottom loop begins with the first observation, and finishes with the fourth observation (VAL=5).
The bottom loop outputs those observations. Because of the later DROP and RENAME statements, VAL is 5 on all those observations.
The top loop begins again, reading the next set of observations. The SET statement tracks which observations it has already read, and begins with the fifth observation until it finds another one with a nonmissing value for VAL.

This version of the program never copies VAL from the previous ID. It always begins again when it finds a new ID. I assumed that's what you wanted, but could be wrong about that.
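One way to watch the steps above happen is to add PUT statements to both loops and read the log. This is only a sketch: the sample data is shortened, and the 'Top loop' and 'Bottom loop' labels are arbitrary text added for tracing.

```sas
/* Shortened sample data, named HAVE to match the program */
data have;
   input id $ val;
   datalines;
A .
A .
A .
A 5
A .
A 6
B .
B 4
;
run;

data want;
   do until (last.id or val > .);
      set have;
      by id;
      put 'Top loop:    ' id= val=;                   /* rows read while looking ahead */
   end;
   replacement_val = val;
   do until (last.id or val > .);
      set have;
      by id;
      put 'Bottom loop: ' id= val= replacement_val=;  /* same rows, read again */
      output;
   end;
   drop val;
   rename replacement_val = val;
run;
```

In the log, each run of "Top loop" lines should be followed by a run of "Bottom loop" lines for the same observations, with replacement_val already set to the value the top loop stopped on.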

mansour_ib_sas · Posted 05-29-2019 05:48 PM

Good.

Thank you

Amir · Posted 05-29-2019 10:56 AM

Based on the presented data, it looks like BY-group processing is not required, so your code could be reduced to:

data want;
   do until (val > .) ;
      set have;
   end;

   replacement_val = val;

   do until (val > .) ;
      set have;
      output;
   end;

   drop val;
   rename replacement_val = val;
run;

Amir.
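The two versions should agree on the posted data, where every id ends on a nonmissing value. As a hedged illustration of where they can differ, the sketch below uses a made-up data set (have2) in which id A ends with a missing value; the output names no_by and with_by are also invented for the comparison.

```sas
/* Hypothetical data: the last A row has no later value within A */
data have2;
   input id $ val;
   datalines;
A .
A 5
A .
B .
B 4
;
run;

/* Without BY processing: the look-ahead continues into the next id */
data no_by;
   do until (val > .);
      set have2;
   end;
   replacement_val = val;
   do until (val > .);
      set have2;
      output;
   end;
   drop val;
   rename replacement_val = val;
run;

/* With BY processing: the look-ahead stops at the end of each id */
data with_by;
   do until (last.id or val > .);
      set have2;
      by id;
   end;
   replacement_val = val;
   do until (last.id or val > .);
      set have2;
      by id;
      output;
   end;
   drop val;
   rename replacement_val = val;
run;
```

In no_by the third A row should pick up 4 from id B, while in with_by it should stay missing. Note also that without the last.id check, any missing rows at the very end of the input are never written, because the first loop reaches the end of the file before the second loop can output them.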

AMSAS · Posted 05-29-2019 10:36 AM

Here's a different approach that only reads the input data once:

data test;
input id $ val;
cards;
A .
A .
A .
A 5
A .
A 6
B .
B .
B .
B 4
B .
B .
B 7
; run;

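data want ;
	retain cntr 0 ;   /* rows read since the last nonmissing val */
	set test ;
	cntr+1 ;
	if val ne . then do ;
		/* write the current row once for each counted row */
		do i=1 to cntr ;
			output want ;
		end ;
		cntr=0 ;
	end ;
	drop cntr i ;   /* keep the helper variables out of the output */
run ;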

It counts the observations read since the last nonmissing value, including the current one. When it reaches an observation that has a value, it writes that observation to the output data set once for each counted observation, and then resets the counter (cntr).
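If the counter should not run across id boundaries, a BY statement can be added to the same single-pass idea. The following is a sketch under that assumption, not part of the original post: it flushes the counted rows at the end of each id even when no value was found, so trailing missing rows are written with a missing val instead of being lost or picking up the next id's value.

```sas
/* Small sample; id A ends with a missing value on purpose */
data test;
   input id $ val;
   datalines;
A .
A 5
A .
B .
B 4
;
run;

data want;
   set test;
   by id;
   cntr + 1;                          /* sum statement: cntr starts at 0 and is retained */
   if not missing(val) or last.id then do;
      do i = 1 to cntr;               /* one output row per counted input row */
         output;
      end;
      cntr = 0;                       /* start counting again */
   end;
   drop cntr i;
run;
```

The input must be sorted or grouped by id for the BY statement to work, which holds for the data as posted.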
