Sorry for the poor description.
Here's what I am trying to achieve:
Basically, I have a table which contains unique 5 digit number and total record counts associated with it. Now the 5 digit numbers are not in sequence, however they could have either the first four digits or first three or first two digits in common. (01521, 01529). Now using this numbers, I want to roll up the date based on each number. So let's say if number 01521 have more than 10 records, however 01529 doesn't have 10 records, I would like to create a new column which will tell me which numbers where grouped together (for ex: 01521-01529). Here's are some sample records:
Column A # of Records
01111 111
01119 6
Now since the first four digits are common for above numbers, I would like to create a new column with value for each row as '01111-01119'.
Similarly, here is the second scenario where I have exhaused the first four combinations. So I would go for matching the first three digits :
Column A # of Records
01255 99
01299 6
Again I only need to do a grouping if the number of records are less than 10.
Here is the code that I have which definitely not the best piece of code:
data want; merge test1 test1 (firstobs=2 rename=(columna=_columna1 count=_count1 flag=_flag1 )) test1 (firstobs=3 rename=(columna=_columna2 count=_count2 flag=_flag2 )) ;
if count (# of records) < 10 then do;
if substr(columna,1,4) = substr(_columna1,1,4) then do;
if count + _count1 >= 10 then derived_column = strip(column) || '-' || strip(_columna1);
else if substr(column,1,4) = substr(_columna2,1,4) then do; if count + _count1 + _count2 >= 10 then do; derived_column = strip(column) || '-' || strip(_columna2); end;
end;
The problem with above program is I am not able to derive the new column for records that have value of > 10.
HTH. Thanks!
... View more