Yes, he would need to add whatever extra rules are required on top. For the data you gave, this is enough:
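data want;
  set have;
  length firstnumber $20;
  /* Keep only digits, blanks and commas, then take the first word */
  firstnumber=scan(compress(string," ,","kd"),1," ,");
  /* Blank it out if it has fewer than four digits */
  if lengthn(firstnumber) < 4 then firstnumber="";
run;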
That would fix it. However, if you need to take the date as well, or apply further conditions, then looping over each delimited word of the compressed string may be the way to go:
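data want (drop=temp i);
  set have;
  length firstnumber temp $200;
  /* Keep only digits, blanks and commas */
  temp=compress(string," ,","kd");
  /* Check each word in turn and keep the first one with at least four digits */
  do i=1 to countw(temp," ,");
    if lengthn(scan(temp,i," ,")) >= 4 and firstnumber="" then firstnumber=scan(temp,i," ,");
  end;
run;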