data have;
infile datalines;
input ln_no $ Branch $50. ;
return;
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;run;
/*I want to accomplish the following:
1. if the Branch is separated by - only show the string after the dash (ie DEEP CREEK, ALBERMARLE)
2. if the Branch is not separated by - just show the string inside the "( )" only (ie closed)
FOOTNOTE
for #2 also how would I do this if I just want to show the characters before the ( ) or in this case MARSHALL SQUARE PUBLIX
SCAN() can probably do what you want.
data have;
input ln_no $ Branch $50. ;
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;
data want;
set have;
length want1 want2 $50;
want1 = strip(coalescec(scan(branch,2,'-'),scan(branch,-2,'()')));
want2 = strip(coalescec(scan(branch,2,'-'),scan(branch,1,'()')));
run;
proc print;
run;
Obs ln_no Branch want1 want2 1 33 CHARLOTTE COUNTY - DEEP CREEK DEEP CREEK DEEP CREEK 2 22 CHASTAIN SQUARE PUBLIX (closed) closed CHASTAIN SQUARE PUBLIX 3 62 CHARLOTTESVILLE AREA - ALBEMARE ALBEMARE ALBEMARE 4 64 MARSHALL SQUARE PUBLIX (closed) closed MARSHALL SQUARE PUBLIX
Below should do the job.
data have(drop=_:);
infile datalines;
input ln_no $ _str $50.;
length branch $20 branch2 $40;
branch=scan(_str,2,'-');
if missing(branch) then branch=scan(_str,-2,'()');
branch2=scan(_str,1,'()-');
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;
run;
proc print;
run;
Hi @Q1983
You can also use pearl functions to achieve that:
data have;
infile datalines;
input ln_no $ Branch $50.;
if prxmatch('/(.*-)(.*)/', Branch) then Branch=prxchange('s/(.*-)(.*)/$2/', -1, Branch);
else Branch=prxchange('s/(.*)(\(.*\))/$2/', -1, Branch); /* put $1 instead of $2 if you want to display the characters before the ( ) */
return;
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;
run;
proc print;
Explanation :
- prxmatch('/(.*-)(.*)/',branch) looks for the pattern (.*-)(.*), meaning any character (.) zero, one or more times (*), an hyphen (-), followed by any character (.) zero, one or more times (*). If the condition is true, then the prxchange function retrieves the second argument in parenthesis ($2), so the string after the hyphen (prxchange('s/(.*-)(.*)/$2/', -1, Branch).
- It is quite the same logic for the second use case: the prxchange function looks for the pattern (.*)(\(.*\)), meaning any character (.) zero, one or more times (*), followed by any character (.) zero, one or more times (*) in parenthesis \( and \)
If you want to display the characters before the parenthesis, you can use $1, meaning the first group in parenthesis.
Appreciate the explanation
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.