BookmarkSubscribeRSS Feed
Georg_UPB
Fluorite | Level 6

Hi all,

 

I currently process strings that contain comma seperated substrings.

 

The Example below demonstrates that I can succesfully access each of these substrings using the SCAN function...

...unless there is a substring (here, it is the 3rd one) that was enclosed in quotations marks because it contains the delimiter, i.e. the comma. In this case, the substring is only partially read until the comma occurs. Usually, it is suggested to use the q modifier to account for this situation. Unfortunately, the q modifier (just remove the comment symbols from the code below) does not work for me because of the unmatched quotation marks that can occur in the strings.

 

Is there an option that I'm missing or an elegant alternative to account for both: (1) substrings containing the delimiter enclosed in quotation marks and (2) unmatched quotation marks?

 

The wanted outcome is:

1: Barcelona

2: Jack's Place

3: Blue Sea, LA

 

As always, any help is appreciated!

 

 

data _null_;

	have = "Barcelona,Jack's Place,""Blue Sea, LA""";

	put have;

	do i=1 to 3;
		want = scan( have, i, ',', /*'q'*/ );
		put i ":" want;
	end;

run;

 

 

6 REPLIES 6
LinusH
Tourmaline | Level 20

This is an assignment in SAS. But how is the data sored at the very source? DBMS table, flat file, XML...?

And secondly, what's the application? I understand that you supply some simple non-relevant data just for simplicity, but it would be good to know what and why you are doing this.

Data never sleeps
Astounding
PROC Star

It's not that elegant, but it should be workable.  Perhaps someone who knows parsing functions can do better.

 

have_copy = have;

do i=1 to 3;

   if have_copy =: ',' then have_copy = left(substr(have_copy, 2));

    if have_copy =: '""' then do;

      have_copy = substr(have_copy, 3);

      end_of_string = index(have_copy, '""') - 1;

      want = substr(have_copy, 1, end_of_string);

      have_copy = substr(have_copy, end_of_string + 2);

   end;

   else do;

      want = scan(have_copy, 1, ',');

      have_copy = left(substr(have_copy, length(want) + 1));

   end;

   put i ':' want;

end;

 

It's untested, but should be workable (perhaps with a small amount of tweaking).  The LEFT function might be overkill, but removes blanks that follow a delimiting comma.  Good luck.

Tom
Super User Tom
Super User

if you can be assured that only the double quote character will be used to quote strings with delimiters then you could just move the single quotes out of the way to make it work.  SAS ought to let you specify what quote character you want to use with the 'Q' modifier.  Perhaps borrowing the '1' and '2' modifiers from the CATQ() function.

 

So one way might be to translate the single quote to a vertical bar (or some other invalid character) and then back again.

 

 want = translate(scanq( translate(have,'|',"'"), i, ',', 'qm' ),"'",'|');

 

You also might want to add a DEQUOTE() function call to remove the quotes.

want=dequote(want);

 

I also notice that SCAN() works if you add double quotes around the values that contain single quotes.

have = "Barcelona,""Jack's Place"",""Blue Sea, LA""";
Ksharp
Super User

Really not easy .

 

data _null_;
	have = "Barcelona,Jack's Place,""Blue Sea, LA""";
	put have;

pid=prxparse('/[^",]+|"[^"]+"/');
start = 1;
stop = length(have);
call prxnext(pid, start, stop, have, position, length);
do while (position > 0);
i+1;
found = dequote(substr(have, position, length));
put i ":" found ;
call prxnext(pid, start, stop, have, position, length);
end;

	
run;

 

Ksharp
Super User

Just one more think , Can you put these into a csv/txt file , and use  

infile ........  dsd 

import it ?

Tom
Super User Tom
Super User

As Ksharp suggested you could use the _INFILE_ trick to convert.  Basically you read one dummy record from CARDS or a dummy file and then replace the automatic variable _INFILE_ with the content of your character variable. You can then use INPUT statement to read the data.

 

data have;
  have = "Barcelona,Jack's Place,""Blue Sea, LA""";
run;

data want ;
  set have ;
  length want $50;
  infile cards dsd truncover ;
  input @1 @ ;
  _infile_ = have;
  do i=1 to 3;
    input want @;
    output ;
  end;
cards;

;

proc print; run;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 3116 views
  • 0 likes
  • 5 in conversation