I have some dataset like this:
KAWASAN PERINDTRIAN MASJID TANAH,78300 MASJID TANAH, MELAKA,MALAYSIA
MUKIM SERKAM 77300 MERLIMAU, MELAKA
PAYA MENGKUANG 78300 MASJID TANAH MELAKA
I want to remove the 5 digit postcode and string after postcode.
The output that I want is like this:
KAWASAN PERINDTRIAN MASJID TANAH
MUKIM SERKAM
PAYA MENGKUANG
Thank you.
Is it 100% that only one five-digit-number is in the data? If yes, then try:
data want;
set have;
length new $ 100;
new = prxchange('s/(.*)\W+\d{5}.*/$1/', 1, string);
run;
Is it 100% that only one five-digit-number is in the data? If yes, then try:
data want;
set have;
length new $ 100;
new = prxchange('s/(.*)\W+\d{5}.*/$1/', 1, string);
run;
Alternative code:
newvar = substr(string,1,indexc(string,'0123456789')-1);
This code searches for the first digit in string, even if it is not a postcode of 5 digits.
data have;
input x $80.;
cards;
KAWASAN PERINDTRIAN MASJID TANAH,78300 MASJID TANAH, MELAKA,MALAYSIA
MUKIM SERKAM 77300 MERLIMAU, MELAKA
PAYA MENGKUANG 78300 MASJID TANAH MELAKA
;
data want;
set have;
want=substr(x,1,prxmatch('/\d{5}/',x)-1);
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.