data have;
infile datalines;
input ln_no $ Branch $50. ;
return;
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;run;
/*I want to accomplish the following:
1. if the Branch is separated by - only show the string after the dash (ie DEEP CREEK, ALBERMARLE)
2. if the Branch is not separated by - just show the string inside the "( )" only (ie closed)
FOOTNOTE
for #2 also how would I do this if I just want to show the characters before the ( ) or in this case MARSHALL SQUARE PUBLIX
SCAN() can probably do what you want.
data have;
input ln_no $ Branch $50. ;
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;
data want;
set have;
length want1 want2 $50;
want1 = strip(coalescec(scan(branch,2,'-'),scan(branch,-2,'()')));
want2 = strip(coalescec(scan(branch,2,'-'),scan(branch,1,'()')));
run;
proc print;
run;
Obs ln_no Branch want1 want2 1 33 CHARLOTTE COUNTY - DEEP CREEK DEEP CREEK DEEP CREEK 2 22 CHASTAIN SQUARE PUBLIX (closed) closed CHASTAIN SQUARE PUBLIX 3 62 CHARLOTTESVILLE AREA - ALBEMARE ALBEMARE ALBEMARE 4 64 MARSHALL SQUARE PUBLIX (closed) closed MARSHALL SQUARE PUBLIX
Below should do the job.
data have(drop=_:);
infile datalines;
input ln_no $ _str $50.;
length branch $20 branch2 $40;
branch=scan(_str,2,'-');
if missing(branch) then branch=scan(_str,-2,'()');
branch2=scan(_str,1,'()-');
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;
run;
proc print;
run;
Hi @Q1983
You can also use pearl functions to achieve that:
data have;
infile datalines;
input ln_no $ Branch $50.;
if prxmatch('/(.*-)(.*)/', Branch) then Branch=prxchange('s/(.*-)(.*)/$2/', -1, Branch);
else Branch=prxchange('s/(.*)(\(.*\))/$2/', -1, Branch); /* put $1 instead of $2 if you want to display the characters before the ( ) */
return;
datalines;
33 CHARLOTTE COUNTY - DEEP CREEK
22 CHASTAIN SQUARE PUBLIX (closed)
62 CHARLOTTESVILLE AREA - ALBEMARE
64 MARSHALL SQUARE PUBLIX (closed)
;
run;
proc print;
Explanation :
- prxmatch('/(.*-)(.*)/',branch) looks for the pattern (.*-)(.*), meaning any character (.) zero, one or more times (*), an hyphen (-), followed by any character (.) zero, one or more times (*). If the condition is true, then the prxchange function retrieves the second argument in parenthesis ($2), so the string after the hyphen (prxchange('s/(.*-)(.*)/$2/', -1, Branch).
- It is quite the same logic for the second use case: the prxchange function looks for the pattern (.*)(\(.*\)), meaning any character (.) zero, one or more times (*), followed by any character (.) zero, one or more times (*) in parenthesis \( and \)
If you want to display the characters before the parenthesis, you can use $1, meaning the first group in parenthesis.
Appreciate the explanation
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: