Hi:
As elegant as the PRX-only solution is, I agree with Scott that sometimes a more verbose or step-wise solution might be easier to maintain. For example, compare the 7 statements of the PRXCHANGE/REVERSE/SCAN solution with the 5 statements in the PRX-only solution.
If I am a PRX newbie and my ever-inventive data entry folks throw a few curve balls, like "Dr. Casey" (no first name), "Dr. First Last, Sr. MD", or "Dr. First Last, III MD", I have a better chance of -successfully- adjusting my code using the hybrid approach. If I'm not a PRX newbie, then adjusting the regex will probably be easy.
cynthia
[pre]
data drname;
length name $30;
infile datalines dsd dlm=',';
input idnum name $;
return;
datalines;
1,"Dr. Smith T. Bauer MD"
2,"Samuel I Rodriguez M.D."
3,"Will Glader MD"
4,"Dr. Greg House"
5,"Dr Drake Morgan"
6,"DR Donnie Darko, Sr. MD"
7,"Dr. Casey"
;
run;
data parsename;
length first last z revname zz $30;
set drname;
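** get rid of a leading Dr or Dr., in any case, if present;
** the period is escaped and the pattern is anchored so a name like Dru is left alone;
z = left(prxchange('s/^\s*Dr\.?\s+//i', 1, name));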
** get rid of periods and commas;
z = compress(z,',.');
** reverse the string so that MD, Jr, Sr is first, if present;
revname = reverse(z);
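** remove MD, Jr, Sr (in reverse) from the front of the reversed string;
** anchoring keeps a reversed name that happens to end in rS or rJ intact;
zz = left(prxchange('s/^\s*((DM|rJ|rS)\s+)+//', 1, revname));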
** now first name is always the first chunk of the string;
first = scan(z,1,' ');
** and last name is always the first chunk of the reversed string;
** but the scanned string has to be reversed again to be correct;
last = reverse(scan(zz,1,' '));
** If first name and last name are the same (Dr. Casey), then set ;
** spaces/missing for first name;
if first = last then first = ' ';
** PRX-only solution, for comparison;
** note that this pattern changes nothing unless it finds MD or M.D. in the name;
* prepare reg. expression for text parsing;
_EXPR='s/(Dr.)?(\s*\S*)\s*?\S*?(\s+)(\S*)\s*(MD|M.D.)\s*/$2 $4/i';
_REGX=prxparse(_EXPR);
_PRX=prxchange(_REGX,1,name);
* split parsed text into desired variables;
alt_FIRST=scan(_PRX,1);
alt_LAST=scan(_PRX,2);
run;
ods listing;
proc print data=parsename;
var first last name z revname zz alt_first alt_last;
run;
[/pre]
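To make the "curve ball" point concrete, here is a minimal sketch of how the hybrid approach might be adjusted for the III case. The data set names (curveballs, parse_iii) and the two sample rows are made up for illustration, and the anchored patterns are just one possible way to write the two PRXCHANGE steps. The only real change is one extra alternative in the suffix pattern; because III reads the same forwards and backwards, it does not even need to be reversed.
[pre]
data curveballs;
  length name $30;
  infile datalines dsd dlm=',';
  input idnum name $;
datalines;
8,"Dr. First Last, III MD"
9,"Dr. First Last, Jr. MD"
;
run;

data parse_iii;
  length first last z revname zz $30;
  set curveballs;
  ** strip a leading Dr or Dr., in any case;
  z = left(prxchange('s/^\s*Dr\.?\s+//i', 1, name));
  ** drop periods and commas;
  z = compress(z, ',.');
  ** reverse so that any suffixes come first;
  revname = reverse(z);
  ** assumption: III is the only new suffix, added as one more alternative;
  zz = left(prxchange('s/^\s*((DM|rJ|rS|III)\s+)+//', 1, revname));
  ** first name is the first chunk of the cleaned string;
  first = scan(z, 1, ' ');
  ** last name is the first chunk of the reversed string, reversed back;
  last = reverse(scan(zz, 1, ' '));
  ** a single-name entry would otherwise show up as both first and last;
  if first = last then first = ' ';
run;

proc print data=parse_iii;
  var name first last z revname zz;
run;
[/pre]
Other suffixes (II, IV, and so on) would each need their own reversed alternative, so it is worth checking the PROC PRINT output against the real data before trusting it.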