You can use PRXMATCH to detect the start of a new group of records.
Use a retained character variable to accumulate the concatenation of the rows in each group.
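To see what the pattern treats as a group start, here is a minimal sketch that applies it to three hand-written values. The sample strings and the variable name is_start are made up for illustration; the pattern is the same one used in the steps that follow.

```sas
data _null_;
  length mfrecord $25;
  /* illustrative values only: a header, a continuation, and a digit-leading record */
  do mfrecord = 'K042 AAAAAAAAA', 'BBBBBBB', '7123 AAAAAAAAA';
    /* PRXMATCH returns the match position, or 0 when there is no match */
    is_start = prxmatch('/^[A-Z]\d+ /', mfrecord) > 0;
    put mfrecord= is_start=;
  end;
run;
```

Only the first value should be flagged: it begins with one uppercase letter, then one or more digits, then a blank. Note that a fixed-length character variable is blank padded, so a short value such as 'A1' would also satisfy the trailing blank in the pattern.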
Example (generate some sample data)
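data have;
  call streaminit(123);
  do rownum = 1 to 1000;
    length mfrecord $25; /* mainframe record */
    /* start a new group about 25% of the time, and always at rows 1 and 101-103 */
    if rand('uniform') < 0.25 or rownum in (1, 101, 102, 103) then do;
      /* header record: one letter, three digits, a blank, then filler */
      mfrecord =
        byte(64+rand('integer',26)) ||
        put(rand('integer',999),z3.) ||
        ' AAAAAAAAA'
      ;
      seq = 1;
    end;
    else do;
      /* continuation record: 1 to 25 copies of a single character */
      seq + 1;
      mfrecord = repeat(byte(64+seq), rand('integer',0,24));
    end;
    OUTPUT;
  end;
  keep rownum mfrecord;
run;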
Example (compute the result as a group aggregate that concatenates the rows of each group)
Version 1. One result row for each group
data want;
  set have end=done;
  length result $1000; retain result;
  /* a header record closes the previous group: write that group out, then start over */
  if prxmatch('/^[A-Z]\d+ /', mfrecord) then do;
    if _n_ > 1 then OUTPUT;
    call missing(result);
  end;
  result = catx(' ', result, mfrecord);
  if done then OUTPUT; /* flush the final group */
  keep result;
run;
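As a quick sanity check on Version 1, the sketch below counts the rows in HAVE that match the group-start pattern. It assumes the first row of HAVE is a group start, which holds for the sample data above; the counter name n_starts is made up for illustration.

```sas
data _null_;
  set have end=done;
  /* sum statement: adds 1 for each row that matches the group-start pattern */
  n_starts + (prxmatch('/^[A-Z]\d+ /', mfrecord) > 0);
  if done then put 'Group-start rows in HAVE: ' n_starts;
run;
```

The count written to the log should equal the number of observations the log reports for WANT after running Version 1.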
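Version 2. Concatenation result on the last record in each group
This requires lead (look-ahead) processing, because the groups are synthetic: a record is known to be the last in its group only when the next record starts a new one.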
data want;
  /* MERGE with no BY statement is a 1:1 merge. The second copy of HAVE, offset by */
  /* one row, provides the LEAD variable 'nextrecord'. KEEP= prevents the lead copy */
  /* from overwriting rownum with the next row's value. */
  merge
    have
    have(firstobs=2 keep=mfrecord rename=(mfrecord=nextrecord))
    end=done
  ;
  donex = done;
  length accum $1000; retain accum;
  length result $1000;
  if prxmatch('/^[A-Z]\d+ /', nextrecord) then do;
    /* the next record starts a new group, so this is the last record of the current group */
    result = catx(' ', accum, mfrecord);
    OUTPUT;
    group_id + 1;
    call missing(accum, result);
  end;
  else do;
    accum = catx(' ', accum, mfrecord);
    if done then result = accum;
    OUTPUT;
  end;
  keep rownum mfrecord result group_id donex nextrecord;
run;
Aggregating can be simplified if you code a grouping view first
data groups / view=groups;
  set have;
  /* each header record starts the next group number */
  if prxmatch('/^[A-Z]\d+ /', mfrecord) then groupno+1;
run;
data want;
  do until (last.groupno);
    set groups;
    by groupno;
    length accum result $1000;
    accum = catx(' ', accum, mfrecord);
    if last.groupno then result = accum;
    output;
  end;
  drop accum;
run;
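If you only need one row per group (as in Version 1), the same grouping step allows a shorter variation. This is a sketch that assumes GROUPS exists as created above; the output name want_one_per_group is made up for illustration.

```sas
data want_one_per_group;
  length result $1000;
  /* result is not retained, so it starts out missing for every group */
  do until (last.groupno);
    set groups;
    by groupno;
    result = catx(' ', result, mfrecord);
  end;
  /* the implicit OUTPUT at the end of the step writes one row per group */
  keep groupno result;
run;
```

Because the loop reads a whole group within a single iteration of the DATA step, no RETAIN or explicit reset is needed.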