Hi, I need some help. I have a variable called Site with values like these:
Site
CT-BENGALURU-J.P.NAGAR
CT-BENGALURU-ORION MALL
CT-BENGALURU-SOUL SPACE SPIRIT
CT-MUMBAI-GOREGAON-OBEROI MALL
CT-PUNE-ASCENT MALL
CT-PUNE-MSM PARANJAPE MALL-KARVE ROAD
I want output like this:
J.P.NAGAR
ORION MALL
SOUL SPACE SPIRIT
OBEROI MALL
ASCENT MALL
PARANJAPE MALL
I can't see a consistent rule here.
Usually it's the last "word" of those separated by hyphens, but in the last line it is the second-to-last (and not even the whole of it). Without a consistent rule, no algorithm can be built.
Actually, there is some inconsistency in the data.
Looks like addresses, right?
Like @Kurt_Bremser said, there does not seem to be a clear rule.
That leaves us with data standardization.
And I suspect that addresses are available in the SAS Data Quality Base, India edition, contained in the DataFlux product(s).
Here are some statements that approximate what you are trying to do:
new_site = scan(site, 3, '-');
new_site = scan(site, -1, '-');
I say "approximate" because you have rules in your head that are not part of the program. For example, consider:
CT-PUNE-MSM PARANJAPE MALL-KARVE ROAD
Why should the result be PARANJAPE MALL instead of MSM PARANJAPE MALL? You have some rules about that, but all of your rules have to be made known in order to incorporate them into the program.
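To see where the two SCAN statements above agree and where they don't, here is a small sketch that runs both on three of the posted values. The data set name compare and the variables seg3, seg_last and differ are just names I made up for illustration.

```sas
data compare;
  infile datalines truncover;
  input Site $80.;
  length seg3 seg_last $ 80;
  seg3     = scan(site, 3, '-');    /* third hyphen-delimited segment */
  seg_last = scan(site, -1, '-');   /* last hyphen-delimited segment  */
  differ   = (seg3 ne seg_last);    /* 1 when the two rules disagree  */
  datalines;
CT-BENGALURU-J.P.NAGAR
CT-MUMBAI-GOREGAON-OBEROI MALL
CT-PUNE-MSM PARANJAPE MALL-KARVE ROAD
;
run;

proc print data=compare noobs;
run;
```

For the first value both give J.P.NAGAR. For the second, seg3 is GOREGAON and seg_last is OBEROI MALL. For the third, seg3 is MSM PARANJAPE MALL and seg_last is KARVE ROAD, so neither statement alone produces PARANJAPE MALL.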
Good luck.
data have;
  input Site $80.;
  length want $ 80;
  /* One word, whitespace, then MALL (case-insensitive) */
  pid = prxparse('/\w+\s+MALL/i');
  if prxmatch(pid, site) then do;
    /* p = start position of the match, l = its length */
    call prxsubstr(pid, site, p, l);
    want = substr(site, p, l);
  end;
  /* No MALL found: take the last hyphen-delimited segment */
  else want = scan(site, -1, '-');
  drop pid p l;
cards;
CT-BENGALURU-J.P.NAGAR
CT-BENGALURU-ORION MALL
CT-BENGALURU-SOUL SPACE SPIRIT
CT-MUMBAI-GOREGAON-OBEROI MALL
CT-PUNE-ASCENT MALL
CT-PUNE-MSM PARANJAPE MALL-KARVE ROAD
;
run;
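The rule this encodes is "one word plus MALL if present, otherwise the last segment", so it may be worth checking against values that are not in the sample. Below is a rough sketch with two site values that I invented for illustration (they are not from the original post). The \b variant in pid_b is my own assumption about a possible tightening, not part of the posted solution.

```sas
data check;
  infile datalines truncover;
  input Site $80.;
  length want want_b $ 80;
  retain pid pid_b;
  if _n_ = 1 then do;
    pid   = prxparse('/\w+\s+MALL/i');    /* pattern as posted */
    pid_b = prxparse('/\w+\s+MALL\b/i');  /* assumed variant: MALL must be a whole word */
  end;

  /* Posted rule: matched text if any, else last hyphen-delimited segment */
  call prxsubstr(pid, site, p, l);
  if p > 0 then want = substr(site, p, l);
  else want = scan(site, -1, '-');

  /* Same rule with the word-boundary variant */
  call prxsubstr(pid_b, site, p, l);
  if p > 0 then want_b = substr(site, p, l);
  else want_b = scan(site, -1, '-');

  drop pid pid_b p l;
  datalines;
CT-CITY-BIG GREEN MALL
CT-BENGALURU-NEW MALLESWARAM
;
run;

proc print data=check noobs;
run;
```

For the first made-up value both patterns return GREEN MALL, because only one word before MALL is captured. For the second, the posted pattern returns NEW MALL (it matches the start of MALLESWARAM), while the \b variant finds no match and falls back to NEW MALLESWARAM. Whether either behaviour is what you want depends on rules that only you know.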
Thanks a lot.