BookmarkSubscribeRSS Feed
SASnewbie2
Fluorite | Level 6

Hi Mate, 

 

Appreciate your advice how can I remove the text after a specific character/words

 

For instance, 

 

House and Office > Lighting > Floor Lamps
Building and  Office > Lighting > Table Lamps
Factory and Office > Lighting > Wall Lamps & Sconces
Home and Office > Lighting > Hanging & Pendant Lights

 

I would like to search and remove all the words after "Office". Usually I would use the Excel Function "Find and Replace" to find "Office*~ " and replace . However, I can't find a similar function in SAS advanced expression. Appreciate your advice.

 

Ultimate Result as follows

House and Office
Building and  Office
Factory and Office
Home and Office

 

4 REPLIES 4
ed_sas_member
Meteorite | Level 14

Hi @SASnewbie2 

 

You can use for example the PRXCHANGE function.

In the below code, the function looks for the following pattern in variable var1:

- Any character (.) zero, one or more time (*) at the beginning of the string (^) 

- Followed by Office

- Followed by any character (.) zero, one or more time (*) at the end of the string ($) 

 

The first group of characters (in rose) is enclosed in ( ) and can be retrieved by using $1

data want;
	set have;
	var2 = prxchange('s/(^.*Office).*$/$1/',1,var1);
run;

 

SASnewbie2
Fluorite | Level 6

Is that a simple function or formula I can use it in the advanced expression as shown below. 

 

my data was vary as show below so I can't use a delimiter and I try Scan() function, it doesn't show the result i wanted.Could be there's a mistake on my scan() formula

 

Factory House and Office > Lighting > Floor Lamps
House Building and Office follows by Lighting then Table Lamps
Factory and Office next Lighting last Wall Lamps & Sconces
Home and Office then Lighting > Hanging & Pendant Lights

 

 

 

Ultimate Result as follows

House and Office
Building and  Office
Factory and Office
Home and Office

 

 

f.png

 

Patrick
Opal | Level 21

Using below data:

data have;
  infile datalines truncover;
  input full_string $80.;
  datalines;
Factory House and Office > Lighting > Floor Lamps
House Building and Office follows by Lighting then Table Lamps
Factory and Office next Lighting last Wall Lamps & Sconces
Home and Office then Lighting > Hanging & Pendant Lights
Home and Work then Lighting > Hanging & Pendant Lights
;

Using the query builder you then can define a computed column want_string using the following formula:

substrn(t1.full_string, 1, ifn(findw(t1.full_string,'Office',' ','ip')>0, findw(t1.full_string,'Office',' ','ip')+5, 0) )

Which then returns the following result:

Capture.JPG

The IFN() function gets used to return a value of Zero for strings with no word Office in it.