BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
shanmukh2
Fluorite | Level 6

Hi,

 

I have a dataset with the following data

 

id           comment

 

123      ~~%abc~asd~

234      abc~def~

456     ~~  .vbr~123~567

678    ~*bvs~

 

The Required result I need is to remove leading and trailing ~ but need the ~ in between.

 

id           comment

 

123      %abc~asd

234      abc~def

456      .vbr~123~567

678     *bvs

 

I am guessing I should use regex here, but not sure how, can some one help thanks.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Try next code:

 

do i=1 to length(var);
    if substr(var,i,1) ne '~' then do;
       var = substr(var,i);
       leave;
    end;
end;
do j=length(var) to 1 by -1;
    if substr(var,i,1) ne '~' then do;
       var = substr(var,1,j);
       leave;
    end;
end; 

check the code what happens, and what do you want to do, in case var = '~~~~~~~' (only delimiters: put i= j=).

 

View solution in original post

9 REPLIES 9
Shmuel
Garnet | Level 18

If there are no spaces in the text then:

var = tran(compbl(tran(var,' ','~')),'~',' ');

that will replace multi '~' into one only.

 

 

shanmukh2
Fluorite | Level 6

It's not working , there are spaces in the  comments

Shmuel
Garnet | Level 18

Try next code:

 

do i=1 to length(var);
    if substr(var,i,1) ne '~' then do;
       var = substr(var,i);
       leave;
    end;
end;
do j=length(var) to 1 by -1;
    if substr(var,i,1) ne '~' then do;
       var = substr(var,1,j);
       leave;
    end;
end; 

check the code what happens, and what do you want to do, in case var = '~~~~~~~' (only delimiters: put i= j=).

 

Tom
Super User Tom
Super User

You could probably figure out how to write a regular expression pretty easily.

But you could also just figure out how to find the part of the string you want by calculating some pointers into the string.

----+---10----+---20
~~  .vbr~123~567
  ^------- first_nondlm = 3
            ^------- last_dlm = 13
               ^----- str_length = 16
  ^---------^   substr_length = 10

Here is some code. You might need a little more logic to handle cases without trailing delimiters.

data have;
 input id comment $40.;
cards;
123 ~~%abc~asd~
234 abc~def~
456 ~~  .vbr~123~567
678 ~*bvs~
999
;

data want;
  set have;
  first_nondlm=verify(comment,'~');
  last_dlm=findc(comment,'~','b');
  str_length=lengthn(comment);
  substr_length=min(str_length,last_dlm) - first_nondlm;
  want=substrn(comment,first_nondlm,substr_length);
run;
proc print;
run;

Result

                                  first_                 str_     substr_
Obs     id    comment             nondlm    last_dlm    length     length    want

 1     123    ~~%abc~asd~            3         11         11          8      %abc~asd
 2     234    abc~def~               1          8          8          7      abc~def
 3     456    ~~  .vbr~123~567       3         13         16         10        .vbr~123
 4     678    ~*bvs~                 2          6          6          4      *bvs
 5     999                           1          0          0         -1

 

shanmukh2
Fluorite | Level 6
Thank you, this solution works too.
r_behata
Barite | Level 11

data want;
	IF _N_ = 1 THEN
		
			PATTERN = prxparse('/[^~+].+[^~+]/');

		

	RETAIN PATTERN;
	LENGTH comment_ $20 ;
	INPUT Id comment $5-20;
	CALL PRXSUBSTR(PATTERN,trim(comment),START,LENGTH);


			comment_ = SUBSTR(comment,START,LENGTH);

	KEEP Id comment comment_;
	DATALINES;
123 ~~%abc~asd~
234 abc~def~
456 ~~  .vbr~123~567
678 ~*bvs~
;
run;
mkeintz
PROC Star

I wouldn't bother with regex, which I believe can be slow.  Instead, consider taking advantage of the CHAR, LENGTH, and SUBSTR functions:

 

data have;
  input id comment $20.;
  put (_all_) (=);
datalines;
123 ~~%abc~asd~
234 abc~def~
456 ~~  .vbr~123~567
678 ~*bvs~
run;


data want (drop=_:);
  set have;
  original=comment;
  do _c=1 by 1 until (char(comment,_c)^='~');;
  end;
  comment=substr(comment,_c);
  do _c=length(comment) by -1 until(char(comment,_c)^='~');
  end;
  comment=substr(comment,1,_c);
run;

This leads leading blanks in the ID=456 comment.  If you want to remove all leading and trailing blanks and ~'s, then consider NOT IN instead of ^=, as in:

data want (drop=_:);
  set have;
  original=comment;
  do _c=1 by 1 until (char(comment,_c) not in ('~',' '));
  end;
  comment=substr(comment,_c);
  do _c=length(comment) by -1 until(char(comment,_c) not in ('~',' '));
  end;
  comment=substr(comment,1,_c);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
novinosrin
Tourmaline | Level 20

SCAN might be very fast IMHO

 


data have;
 input id comment $40.;
cards;
123 ~~%abc~asd~
234 abc~def~
456 ~~  .vbr~123~567
678 ~*bvs~
;

data want;
set have;
length want $50;
do _n_=1 to countw(strip(comment),'~');
 want=catx('~',want,scan(strip(comment),_n_,'~'));
end;
run;

proc print noobs;run;
id comment want
123 ~~%abc~asd~ %abc~asd
234 abc~def~ abc~def
456 ~~ .vbr~123~567 .vbr~123~567
678 ~*bvs~ *bvs
Ksharp
Super User
data have;
 input id comment $40.;
cards;
123 ~~%abc~asd~
234 abc~def~
456 ~~  .vbr~123~567
678 ~*bvs~
999
;
data want;
 set have;
 want=prxchange('s/^~+|~+$//',-1,strip(comment));
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1741 views
  • 4 likes
  • 7 in conversation