BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
monikka1991
Obsidian | Level 7

have to look from the last by space delimiter and remove the last three strings  WHICH IS COUNTRY , ZIPCODE,STATE CODE

 

I HAVE

 

2001 P LARRY MN MINNESOTA MN 606123714 US

345 R TEENA TR FRANCE  ST TN 60612345 UK

 

WHAT I WANT:

2001 P LARRY MN MINNESOTA

345 R TEENA TR FRANCE  ST

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Brute force attack:

do i = 1 to countw(charvar) - 3;
  tempvar = catx(' ',tempvar,scan(charvar,i));
end;
charvar = tempvar;
drop i tempvar;

For tested code, provide example data in a data step with datalines.

View solution in original post

7 REPLIES 7
Kurt_Bremser
Super User

Brute force attack:

do i = 1 to countw(charvar) - 3;
  tempvar = catx(' ',tempvar,scan(charvar,i));
end;
charvar = tempvar;
drop i tempvar;

For tested code, provide example data in a data step with datalines.

monikka1991
Obsidian | Level 7

 

 

hi,

 

I tried your approach, but for countw function it gives me warning .

 

 

data A;
INPUT ADDRESS $50.;
DATALINES;
2001 P LARRY MN MINNESOTA MN 606123714 US
345 R TEENA TR FRANCE ST TN 60612345 UK
RUN;

 

DATA B;
SET A;
do i = 1 to countw(ADDRESS) - 3;
tempvar = catx(' ',tempvar,scan(ADDRESS,i));
end;
charvar = tempvar;
drop i tempvar;
RUN;

Kurt_Bremser
Super User

Add a statement that forces tempvar to be created as character:

data b;
set a;
length tempvar $50;
do i = 1 to countw(address) - 3;
  tempvar = catx(' ',tempvar,scan(address,i));
end;
charvar = tempvar;
drop i tempvar;
run;
monikka1991
Obsidian | Level 7

I have attached input excel

 PROC IMPORT OUT= HAVE
DATAFILE= "\LOCATION\INPUT.XLSX"
DBMS=EXCEL REPLACE;
GETNAMES=YES;
MIXED=NO;
SCANTEXT=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;

Need to remove city state , zip from address field . If I go by position of space placed , there would be a problem when city is having space in between. (example : new York)

 So I have tried below method to derive only address

data want(DROP=I ADDRESS COMB);
set have ;
length hcp_address_1 $50.;
comb = cat(strip(City)," ",strip(STATE)," ",strip(ZIP));
do i = 1 to countw(address) - countw(comb);
hcp_address_1 = catx(' ', address_1,scan(address,i-1));
end;
run;

 

learsaas
Quartz | Level 8
	POSION=prxmatch('m/(\S+\s+\S+\s+\S+)$/',trim(text));
	if POSION>1 then result=substr(text,1,POSION-1);

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

That second example row does not look right.  We do not have "zipcode's" in the UK, nor "states".  What I would suggest, as its obviously US data you are using, is to find the first numeric, then substr to that -3, something like:

data want;
  set have;
  want=substr(have,1,findc(have,"","d")-3);
run;
Ksharp
Super User
data have;
input x $60.;
call scan(x,-3,p,l,' ');
want=substr(x,1,p-1);
drop p l;
cards;
2001 P LARRY MN MINNESOTA MN 606123714 US
345 R TEENA TR FRANCE  ST TN 60612345 UK
;
run;

proc print;run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2055 views
  • 0 likes
  • 5 in conversation