data have;
infile datalines;
input ln_no $ Branch $50. ;
return;
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;run;
/*I want to accomplish the following:
1. if the Branch is separated by - only show the string after the dash (ie DEEP CREEK, ALBERMARLE)
2. if the Branch is not separated by - just show the string inside the "( )" only (ie closed)
FOOTNOTE
for #2 also how would I do this if I just want to show the characters before the ( ) or in this case MARSHALL SQUARE PUBLIX
SCAN() can probably do what you want.
data have;
input ln_no $ Branch $50. ;
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;
data want;
set have;
length want1 want2 $50;
want1 = strip(coalescec(scan(branch,2,'-'),scan(branch,-2,'()')));
want2 = strip(coalescec(scan(branch,2,'-'),scan(branch,1,'()')));
run;
proc print;
run;
Obs ln_no Branch want1 want2 1 33 CHARLOTTE COUNTY - DEEP CREEK DEEP CREEK DEEP CREEK 2 22 CHASTAIN SQUARE PUBLIX (closed) closed CHASTAIN SQUARE PUBLIX 3 62 CHARLOTTESVILLE AREA - ALBEMARE ALBEMARE ALBEMARE 4 64 MARSHALL SQUARE PUBLIX (closed) closed MARSHALL SQUARE PUBLIX
Below should do the job.
data have(drop=_:);
infile datalines;
input ln_no $ _str $50.;
length branch $20 branch2 $40;
branch=scan(_str,2,'-');
if missing(branch) then branch=scan(_str,-2,'()');
branch2=scan(_str,1,'()-');
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;
run;
proc print;
run;
Hi @Q1983
You can also use pearl functions to achieve that:
data have;
infile datalines;
input ln_no $ Branch $50.;
if prxmatch('/(.*-)(.*)/', Branch) then Branch=prxchange('s/(.*-)(.*)/$2/', -1, Branch);
else Branch=prxchange('s/(.*)(\(.*\))/$2/', -1, Branch); /* put $1 instead of $2 if you want to display the characters before the ( ) */
return;
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;
run;
proc print;
Explanation :
- prxmatch('/(.*-)(.*)/',branch) looks for the pattern (.*-)(.*), meaning any character (.) zero, one or more times (*), an hyphen (-), followed by any character (.) zero, one or more times (*). If the condition is true, then the prxchange function retrieves the second argument in parenthesis ($2), so the string after the hyphen (prxchange('s/(.*-)(.*)/$2/', -1, Branch).
- It is quite the same logic for the second use case: the prxchange function looks for the pattern (.*)(\(.*\)), meaning any character (.) zero, one or more times (*), followed by any character (.) zero, one or more times (*) in parenthesis \( and \)
If you want to display the characters before the parenthesis, you can use $1, meaning the first group in parenthesis.
Appreciate the explanation
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.