gandikk
Obsidian | Level 7

Hi,

 

I need to find the records that contain words wrapped in single or double quotes, for example "Tech Data" Company, "WOW" Company, etc. But my code shouldn't count a record like this one: John's & Gary's world.

 

Thanks in advance!!

SAS@EMMAUS
4 REPLIES
ballardw
Super User

Think very carefully and provide a rule that describes exactly why the second record is not counted. An example is not a rule that can be programmed.

gandikk
Obsidian | Level 7

Thanks for your reply!! I didn't get a rule from the customer, only examples. I believe the customer doesn't want to count a single quote if it is an apostrophe. In the 2nd example there are 2 single quotes, but both are apostrophes. My rule is that if a single quote is immediately followed by 's', we shouldn't count it.
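A minimal sketch of that rule in SAS might look like the following. The dataset name `have`, the variable `text`, and the output names are assumptions for illustration, and requiring the `'s` to end a word (the `\b`) is an added assumption rather than part of the stated rule. It only checks whether any quote character remains, not whether quotes come in matched pairs.

```sas
data flagged;
  set have;                 /* assumed input dataset with a character variable TEXT */
  length stripped $256;

  /* Remove every apostrophe-s that ends a word (possessive endings). */
  /* The \b word boundary is an added assumption.                     */
  stripped = prxchange("s/'s\b//", -1, text);

  /* Flag the record if any single or double quote is still present */
  has_quote = (findc(stripped, '"' || "'") > 0);
run;
```

With this sketch, a possessive-only record ends up with `has_quote = 0`, while a record containing "WOW" keeps `has_quote = 1`.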

ballardw
Super User

Problem with examples.

What if you have measurements in feet and inches?  5' 6" 

Angular measurements, such as latitude and longitude where the ' and " can be used for minutes and seconds (fractions of a degree)?

Contractions:  Don't

 

You might consider excluding a ' (more common) or " when there is a character directly on either side of it. But hanging possessives like Jones' are a headache.
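To make these problem cases concrete, here is a rough SAS sketch that tests two heuristics against them. The dataset and variable names are hypothetical, "a character on either side" is interpreted here as a letter on each side, and the digit check for measurements is an added assumption that goes beyond what was said above.

```sas
data check;
  infile cards truncover;
  input text $char80.;

  /* 1 when a quote sits directly between two letters (contractions, possessives) */
  embedded = (prxmatch('/[A-Za-z][''"][A-Za-z]/', text) > 0);

  /* 1 when a quote directly follows a digit (feet, inches, minutes, seconds) */
  measure = (prxmatch('/\d[''"]/', text) > 0);
  cards;
5' 6"
Don't
Jones' house
"WOW" Company
;
```

Neither flag fires for the hanging possessive, so that case would still need separate handling.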

 

 

There might be some slick approaches with regular expressions, but frankly I'm too lazy at the moment to tackle what could be a very obnoxious cycle of searching and refining the search code.

A long time ago I spent some time cleaning up text with measurements such as 2" by 36" in the middle of other text, and I remember it was a headache to differentiate them from the other quoted text.

 

BrunoMueller
SAS Super FREQ

I guess this is a case for using regular expressions. Have a look at the code below. I am not a regular expression expert, and there might be easier ways of doing this.

 


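data have;
  infile cards truncover;
  input
    text $char256.
  ;
  length value value2 $256;

  /* A double-quoted run that starts at the beginning of the text or after */
  /* whitespace and is followed by whitespace or the end of the text       */
  re1 = prxparse('/(?:^|\s)("[^"]*")(?=\s|$)/');
  qt1 = prxmatch(re1, text);
  /* Same pattern for single quotes; an apostrophe inside a word never */
  /* follows whitespace, so possessives do not match                   */
  re2 = prxparse("/(?:^|\s)('[^']*')(?=\s|$)/");
  qt2 = prxmatch(re2, text);

  /* Capture group 1 holds the quoted text, including its quotes */
  if qt1 then do;
    value = prxposn(re1, 1, text);
  end;

  if qt2 then do;
    value = prxposn(re2, 1, text);
  end;

  /* Strip the surrounding quotes */
  value2 = dequote(value);
  cards;
"Tech Data1"
'Tech Data2' Company
abc "Tech Data" Company
def 'WOW' Company
John's & Gary's world
abc
123 "O'Reilly" here
456 'O"Reilly' here
;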
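As a possible follow-up, the match positions `qt1` and `qt2` from the step above could be used to keep or count only the records that contain quoted text. The output names `quoted` and `n_quoted` are made up for this sketch.

```sas
/* Keep only the records where either pattern matched */
data quoted;
  set have;
  where qt1 > 0 or qt2 > 0;
run;

/* Count those records */
proc sql;
  select count(*) as n_quoted
  from have
  where qt1 > 0 or qt2 > 0;
quit;
```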

 
