gandikk
Obsidian | Level 7

Hi,

 

I need to find the records that contain words wrapped in single or double quotes, for example "Tech Data" Company or "WOW" Company. But my code shouldn't count a record like this one: John's & Gary's world.

 

Thanks in advance!!

SAS@EMMAUS
4 REPLIES
ballardw
Super User

Think very carefully and provide a rule that describes exactly why the second record is not counted. An example is not a rule that can be programmed.

gandikk
Obsidian | Level 7

Thanks for your reply!! I didn't get a rule from the customer, only examples. I believe the customer doesn't want a single quote to count if it is an apostrophe. So in the 2nd example, although there are 2 single quotes, both are apostrophes. My rule is: if a single quote is immediately followed by the character 's', we shouldn't count it.
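A minimal sketch of that rule, assuming it is enough to blank out every quote-plus-s pair and count the single quotes that remain. The variable n_quotes and the third test string are illustrative only and do not come from the customer.

```sas
data _null_;
  length text $ 60;
  do text = "John's & Gary's world", "def 'WOW' Company", "a 'special' offer";
    /* Replace each quote followed by s with a blank, then count the quotes left */
    n_quotes = countc(tranwrd(text, "'s", " "), "'");
    put text= n_quotes=;
  end;
run;
```

Under this rule the first string should leave 0 quotes and the second 2. The third string leaves only 1, because the opening quote of a quoted word that starts with s is treated as an apostrophe, so the rule as stated is probably too broad.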

SAS@EMMAUS
ballardw
Super User

The problem with working from examples:

What if you have measurements in feet and inches, such as 5' 6"?

Or angular measurements, such as latitude and longitude, where ' and " can denote minutes and seconds (fractions of a degree)?

Or contractions, such as Don't?

 

You might consider excluding a quote when it has a character on either side of it, which is more common for ' than for ". But hanging possessives like Jones' are a headache.
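One way to sketch that exclusion, assuming "a character on either side" means a word character directly before and after the single quote. The data set name check and the variable embedded are made up for illustration.

```sas
data check;
  infile cards truncover;
  input text $char40.;
  /* 1 when a single quote sits between two word characters, else 0 */
  embedded = prxmatch("/\w'\w/", text) > 0;
  cards;
Don't
John's & Gary's world
Jones' house
5' 6"
def 'WOW' Company
;
run;

proc print data=check;
run;
```

This should flag Don't and John's as apostrophes, but the hanging possessive and the feet-and-inches line are not flagged, so they would still need separate handling.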

 

 

There might be some slick approaches with regular expressions, but frankly I'm too lazy at the moment to tackle what could be a very obnoxious cycle of writing and refining search patterns.

A long time ago I spent some time cleaning up text with measurements such as 2" by 36" in the middle of other text, and I remember it was a headache to tell them apart from other quoted text.

 

BrunoMueller
SAS Super FREQ

I guess this is a case for regular expressions. Have a look at the code below. I am not a regular expression expert, and there might be easier ways of doing this.

 


data have;
  infile cards truncover;
  input
    text $char256.
  ;

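  /* The opening quote must start the text or follow whitespace; the closing
     quote must be followed by whitespace or the end of the text.
     Group 1 captures the quoted word including its quotes. */
  re1 = prxparse('/(?:^|\s)("[^"]*")(?:\s|$)/');
  qt1 = prxmatch(re1, text);
  re2 = prxparse("/(?:^|\s)('[^']*')(?:\s|$)/");
  qt2 = prxmatch(re2, text);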

  if qt1 then do;
    value = prxposn(re1, 1, text);
  end;

  if qt2 then do;
    value = prxposn(re2, 1, text);
  end;

  value2 = dequote(value);
  cards;
"Tech Data1"
'Tech Data2' Company
abc "Tech Data" Company
def 'WOW' Company
John's & Gary's world
abc
123 "O'Reilly" here
456 'O"Reilly' here
;
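To get from the flags to an actual count of records, a query along these lines might do. It assumes the data set have and the variables qt1 and qt2 created by the step above; the column name n_quoted is only illustrative.

```sas
proc sql;
  /* Count the records where either pattern found a quoted word */
  select count(*) as n_quoted
  from have
  where qt1 > 0 or qt2 > 0;
quit;
```

Changing the WHERE clause to qt1 = 0 and qt2 = 0 would list or count the records without a quoted word instead.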

 
