How to remove unwanted characters and the values after those characters

Super Contributor
Posts: 272

Dear,

In my data, some values contain the characters "!!" in the middle of the value. I need to remove those characters and everything after them. Thank you.

Data:

term

Subject did not drink 50% of the meal!! Chicken

Output needed:

Subject did not drink 50% of the meal

Super User
Posts: 7,758

Re: how to remove unwanted characters and the values after the characters

Posted in reply to knveraraju91

I'd try the tranwrd() function, and then scan():

data have;
infile cards truncover;
input term $100.;
cards;
Subject did not drink 50% of the meal!! Chicken
;
run;

data want;
set have;
term = tranwrd(term,"!!","|");
term = scan(term,1,"|");
run;

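The tranwrd() step is needed because scan() treats every character in its delimiter argument as a separate single-character delimiter, so it cannot split on the two-character sequence "!!" directly.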
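
One caveat with the tranwrd() plus scan() approach above: it assumes the placeholder "|" never occurs in the data itself. A minimal sketch of a variant that uses a control character as the placeholder instead; the choice of '01'x is an assumption (any character known to be absent from the text would do):

```sas
data have;
  infile cards truncover;
  input term $100.;
  cards;
Subject did not drink 50% of the meal!! Chicken
;
run;

data want;
  set have;
  /* '01'x is an assumed placeholder: a control character unlikely to appear in free text */
  term = scan(tranwrd(term, "!!", '01'x), 1, '01'x);
run;
```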

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
PROC Star
Posts: 733

Re: how to remove unwanted characters and the values after the characters

Posted in reply to knveraraju91

Try this:

data test;
string='Subject did not drink 50% of the meal!! Chicken';
newstring = substr( string,1,index(string,'!!') - 1);
run;
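
A hedged note on the step above: index() returns 0 when "!!" is not found, so the length passed to substr() becomes -1, which SAS flags as an invalid argument in the log. A minimal sketch that guards against that case; the variable pos and the second test value are illustrative assumptions, and substrn() is used because it accepts a length of zero:

```sas
data test;
  length string newstring $100;
  do string = 'Subject did not drink 50% of the meal!! Chicken',
              'No delimiter here';
    pos = index(string, '!!');                 /* 0 when "!!" is absent */
    if pos > 0 then newstring = substrn(string, 1, pos - 1);
    else newstring = string;                   /* keep the value unchanged */
    output;
  end;
run;
```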
Respected Advisor
Posts: 4,173

Re: how to remove unwanted characters and the values after the characters

Posted in reply to knveraraju91

If you just want everything before the first exclamation mark then the scan() function on its own should do.

Not sure why @KurtBremser believes that duplicates need replacement for your use case.

data have;
  infile cards truncover;
  input term $100.;
  cards;
Subject did not drink 50% of the meal!! Chicken
;
run;

data want;
  set have;
  term = scan(term,1,"!");
run;
Super User
Posts: 7,758

Re: how to remove unwanted characters and the values after the characters


Patrick wrote:

.....

Not sure why @KurtBremser believes that duplicates need replacement for your use case.


Well, if you have a single exclamation mark in the text before the doubles ...

Respected Advisor
Posts: 4,173

Re: how to remove unwanted characters and the values after the characters

Posted in reply to KurtBremser

@KurtBremser

But didn't the OP tell us that he just wants "everything" from the beginning up to the first exclamation mark? So why do we care about any further exclamation marks in the string? Based on your code I must be missing something.

Super User
Posts: 7,758

Re: how to remove unwanted characters and the values after the characters

Look at the log from this:

data _null_;
term = 'Subject complains! Subject did not drink 50% of the meal!! Chicken';
term1 = tranwrd(term,"!!","|");
term1 = scan(term1,1,"|");
term2 = scan(term,1,"!");
put term1=;
put term2=;
run;

The OP stated that the delimiter is a sequence of two exclamation marks.

Granted that I may be a little overcautious here, but better safe than sorry.

Respected Advisor
Posts: 4,173

Re: how to remove unwanted characters and the values after the characters

Posted in reply to KurtBremser

@KurtBremser

Thanks. I did miss that the delimiter is TWO exclamation marks. That explains what you're doing.

Here is a variant using RegEx:

data have;
  term = 'Subject complains!! Subject did not drink 50% of the meal!! Chicken';
  output;
  term = 'Subject complains! Subject did not drink 50% of the meal!! Chicken';
  output;
  term = 'something';
  output;
run;

data want;
  set have;
  if 0 then want_str=term;
  want_str=prxchange('s/(!{2}.*)//oi',1,term);
run;
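
For readers less familiar with Perl regular expressions, here is a minimal annotated sketch of the same idea. The pattern is simplified (no capture group, no "o" or "i" options), which is an assumption about what the task needs rather than a correction of the code above:

```sas
data _null_;
  length term want_str $100;
  do term = 'Subject complains!! Subject did not drink 50% of the meal!! Chicken',
            'Subject complains! Subject did not drink 50% of the meal!! Chicken',
            'something';
    /* Two literal exclamation marks, then everything up to the end of the value, */
    /* are replaced by nothing. Values without "!!" pass through unchanged.       */
    want_str = prxchange('s/!!.*//', 1, term);
    put want_str=;
  end;
run;
```

The log should show "Subject complains" for the first value, "Subject complains! Subject did not drink 50% of the meal" for the second, and "something" for the third, since the match starts at the first occurrence of "!!".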

 

Super User
Posts: 19,769

Re: How to remove unwanted characters and the values after those characters

Posted in reply to knveraraju91

Scan function. 

Super User
Posts: 7,942

Re: How to remove unwanted characters and the values after those characters

Posted in reply to knveraraju91

Sounds like a job for the compress function - which removes given characters:

data have;
  infile cards truncover;
  input term $100.;
  term2=compress(term,"!");
cards;
Subject did not drink 50% of the meal!! Chicken
;
run;
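
A note of caution on compress(): it deletes only the listed characters and keeps the text that follows them, so on its own it does not give the output requested in the question. A minimal sketch comparing it with the tranwrd() plus scan() approach from earlier in the thread (the variable names stripped and cut are illustrative):

```sas
data _null_;
  length term stripped cut $100;
  term = 'Subject did not drink 50% of the meal!! Chicken';
  stripped = compress(term, "!");                 /* removes the "!" characters only */
  cut = scan(tranwrd(term, "!!", "|"), 1, "|");   /* drops "!!" and everything after it */
  put stripped=;   /* stripped=Subject did not drink 50% of the meal Chicken */
  put cut=;        /* cut=Subject did not drink 50% of the meal */
run;
```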