Re: Fill sequence gaps

brophymj · Posted 11-24-2014 11:18 AM

I have a table of the form:

Field1	Field2	Field3	Sequence
...	...	...	4
...	...	...	13
...	...	...	27

I would like to adjust the table so that I fill all the sequence gaps with identical information to the last sequence so for the gap between 4 and 13 I would create 8 rows with identical field entries to those in sequence 4 (not 13). Similarly for the gap between 13 and 27 I would create 13 new rows with identical field values to thos in 13. And so on.

Any help would be much appreciated.

CTorres · Posted 11-24-2014 02:52 PM

Try this:

data have;

input field1 field2 field3 sequence;

cards;

1 2 3 4

4 5 6 13

7 8 9 27

;

run;

data want(drop=seq1 seq2 i);

if 0 then set have nobs=n;

retain rec seq1 seq2;

rec+1;

if rec=n+1 then stop;

set have point=rec;

output;

seq1=sequence;

rec+1;

if rec=n+1 then stop;

set have point=rec;

seq2=sequence;

rec+(-1);

set have point=rec;

do i=seq1+1 to seq2-1;

sequence=i;

output;

end;

run;

CTorres

brophymj · Posted 11-25-2014 06:13 AM

Many thanks Torres, Please see my response below. I had left out some detail that i thought would not be a problem but looking at your solution I think the code needs to be adjusted to allow for the additional detail.

slchen · Posted 11-24-2014 04:49 PM

data want(drop=_sequence);

merge have have(firstobs=2 keep=sequence rename=(sequence=_sequence)) end=last;

if last then output;

else do sequence=sequence to _sequence-1;

output;

end;

run;

brophymj · Posted 11-25-2014 06:12 AM

Hi Slchen

Thanks for your help - please see my query below. It provides a bit more detail I originally had left out. It complicates things a bit more so i'm not sure how to adapt your approach to this problem.

Astounding · Posted 11-24-2014 04:52 PM

Here's one way:

data want;

set have end=done;

if not done then do;

set have (firstobs=2 keep=sequence rename=(sequence=next_sequence));

do sequence = sequence to next_sequence-1;

output;

end;

else output;

drop next_sequence;

run;

Good luck.

brophymj · Posted 11-25-2014 06:09 AM

Many thank for that I just have one further query. The above works well for the sample dataset I gave but I'm not sure how it would apply to the actual dataset I'm using as it is a little more complex.

Basically, I have a very large dataset containing insurance policy transaction information for commercial property risks (so all the historic renewals etc). Each policy has a unique policy number and within each policy there are multiple locations. Within each of these locations there is a historic set of transactions each with a unique sequence number. When you go into one of these sequences you get a list of all the poperties in a scroll screen.The property details are stored in a seperate file but I the common link between the two is policy_no||policy_location||policy_seq. Now the main data set will have all the sequences with no gaps so the sequence numbers will range from 1 to, say 150, and the latest one is always 99999. So the sequence is 1,2,3,...,149,150,99999. The scroll screen with the property details is only update each time a building is added or deleted so the sequence has gaps. So, using the example above, the sequence in the scroll screen might be 1,5,13,24,144. The scroll screen with sequence 1 will be attached to the main dataset with sequence 1-4; the scroll screen with sequence 5 will be attached to the main dataset with sequence 5 to 12, and so on... The scroll screen with sequence 144 will be attached to the main dataset with sequence 144 to 99999.

The main dataset contains 100s of policies with 1000s of locations and 10000s of sequence numbers. So my question is if the above code can be adjusted to allow for this i.e. repeat the process above for every location for every policy?

Many thanks

Astounding · Posted 11-25-2014 11:37 AM

Well, the code as is might be adaptable but I'm wary of doing that. You already described a data set with billions of observations and now you are asking about how to expand that. Perhaps you would be better off describing your ultimate goal and seeing if there might be another approach to achieving it.

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!