I have a variable that contains company names. However, if a company name contains an abbreviation, the letters of the abbreviation are separated by spaces. For example, instead of "ACCO Brand Corp", the value is " A C C O BRAND CORP". Is there a way to group the letters of "ACCO" back together? Thanks
With "conventional" data step means:
data test;
  mystr = ' A C C O BRAND CORP';
  i = 2;
  /* Scan for single characters that have a blank on both sides */
  do until (i >= length(mystr));
    if substr(mystr,i,1) ne ' ' and substr(mystr,i-1,1) = ' ' and substr(mystr,i+1,1) = ' '
    then do;
      /* Remove the blank before the character; i is not advanced, */
      /* so the shifted string is checked again at the same position */
      if i > 2
      then mystr = substr(mystr,1,i-2) !! substr(mystr,i);
      else mystr = substr(mystr,2);
    end;
    else i + 1;
  end;
run;
Please share the code you are referring to when you say "it creates space between the letters".
1) Is the original data in a SAS data set?
2) Is the output data in a SAS data set?
3) Please share the log with any messages.
4) Please share some code that can be run as is to demonstrate the problem, e.g., using datalines in a data step.
Thanks,
Amir.
Hi @somebody,
Try this:
data have;
company=" A C C O BRAND CORP";
run;
data want;
set have;
company=left(prxchange('s/(?<=(\b[A-Z])) (?=([A-Z]\b))//o',-1,propcase(company)));
run;
The Perl regular expression deletes single blanks which are preceded by a single uppercase letter (which is separated from preceding text, if any, by a word boundary) and followed by another single uppercase letter (which is separated from subsequent text, if any, by a word boundary).
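To see which blanks the pattern removes, here is a minimal sketch that runs the same expression over a few names. The data set name `check`, the variable `cleaned`, and the second and third sample values are assumptions made up for illustration; they are not from the original data.

```sas
data check;
  length company cleaned $40;
  /* Hypothetical sample values; only the first one comes from the question */
  do company = ' A C C O BRAND CORP', 'I B M CORP', 'GENERAL MOTORS';
    /* PROPCASE leaves only isolated letters as single-capital "words",  */
    /* so the lookbehind/lookahead pair matches only the blanks between  */
    /* two such letters. LEFT then removes any leading blank.            */
    cleaned = left(prxchange('s/(?<=(\b[A-Z])) (?=([A-Z]\b))//o', -1, propcase(company)));
    output;
  end;
run;

proc print data=check;
run;
```

If the pattern behaves as described, `cleaned` should hold ACCO Brand Corp, IBM Corp and General Motors. Note that PROPCASE changes the case of the whole name; wrapping the result in UPCASE() would restore upper case if that is preferred.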
data test;
mystr = 'A C C O BRAND CORP';
temp=prxchange('s/(\w{2,})/ $1 /',-1,mystr);
want=compbl(prxchange('s/ (\w) /$1/',-1,temp));
run;
Thanks, your method works too. One minor comment: the result has a space at the start of the string.
Wrap one more function, LEFT(), around it.
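Applied to the two-step PRXCHANGE code, that suggestion might look like the sketch below. The sample value is an assumption chosen for illustration: because the first PRXCHANGE call puts a blank before every word of two or more characters, a name that starts with such a word ends up with a leading blank.

```sas
data test;
  /* Hypothetical sample value: a multi-letter first word gets a blank prepended in step 1 */
  mystr = 'BRAND A C C O CORP';
  /* Step 1: surround every word of two or more characters with blanks */
  temp = prxchange('s/(\w{2,})/ $1 /', -1, mystr);
  /* Step 2: remove the blanks around single characters, collapse */
  /* repeated blanks with COMPBL, then left-align with LEFT       */
  want = left(compbl(prxchange('s/ (\w) /$1/', -1, temp)));
run;
```

LEFT moves leading blanks to the end of the value rather than changing the variable's length, which is enough here because trailing blanks in a SAS character variable are padding anyway.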