Hi all,
I currently process strings that contain comma seperated substrings.
The Example below demonstrates that I can succesfully access each of these substrings using the SCAN function...
...unless there is a substring (here, it is the 3rd one) that was enclosed in quotations marks because it contains the delimiter, i.e. the comma. In this case, the substring is only partially read until the comma occurs. Usually, it is suggested to use the q modifier to account for this situation. Unfortunately, the q modifier (just remove the comment symbols from the code below) does not work for me because of the unmatched quotation marks that can occur in the strings.
Is there an option that I'm missing or an elegant alternative to account for both: (1) substrings containing the delimiter enclosed in quotation marks and (2) unmatched quotation marks?
The wanted outcome is:
1: Barcelona
2: Jack's Place
3: Blue Sea, LA
As always, any help is appreciated!
data _null_; have = "Barcelona,Jack's Place,""Blue Sea, LA"""; put have; do i=1 to 3; want = scan( have, i, ',', /*'q'*/ ); put i ":" want; end; run;
This is an assignment in SAS. But how is the data sored at the very source? DBMS table, flat file, XML...?
And secondly, what's the application? I understand that you supply some simple non-relevant data just for simplicity, but it would be good to know what and why you are doing this.
It's not that elegant, but it should be workable. Perhaps someone who knows parsing functions can do better.
have_copy = have;
do i=1 to 3;
if have_copy =: ',' then have_copy = left(substr(have_copy, 2));
if have_copy =: '""' then do;
have_copy = substr(have_copy, 3);
end_of_string = index(have_copy, '""') - 1;
want = substr(have_copy, 1, end_of_string);
have_copy = substr(have_copy, end_of_string + 2);
end;
else do;
want = scan(have_copy, 1, ',');
have_copy = left(substr(have_copy, length(want) + 1));
end;
put i ':' want;
end;
It's untested, but should be workable (perhaps with a small amount of tweaking). The LEFT function might be overkill, but removes blanks that follow a delimiting comma. Good luck.
if you can be assured that only the double quote character will be used to quote strings with delimiters then you could just move the single quotes out of the way to make it work. SAS ought to let you specify what quote character you want to use with the 'Q' modifier. Perhaps borrowing the '1' and '2' modifiers from the CATQ() function.
So one way might be to translate the single quote to a vertical bar (or some other invalid character) and then back again.
want = translate(scanq( translate(have,'|',"'"), i, ',', 'qm' ),"'",'|');
You also might want to add a DEQUOTE() function call to remove the quotes.
want=dequote(want);
I also notice that SCAN() works if you add double quotes around the values that contain single quotes.
have = "Barcelona,""Jack's Place"",""Blue Sea, LA""";
Really not easy .
data _null_;
have = "Barcelona,Jack's Place,""Blue Sea, LA""";
put have;
pid=prxparse('/[^",]+|"[^"]+"/');
start = 1;
stop = length(have);
call prxnext(pid, start, stop, have, position, length);
do while (position > 0);
i+1;
found = dequote(substr(have, position, length));
put i ":" found ;
call prxnext(pid, start, stop, have, position, length);
end;
run;
Just one more think , Can you put these into a csv/txt file , and use
infile ........ dsd
import it ?
As Ksharp suggested you could use the _INFILE_ trick to convert. Basically you read one dummy record from CARDS or a dummy file and then replace the automatic variable _INFILE_ with the content of your character variable. You can then use INPUT statement to read the data.
data have;
have = "Barcelona,Jack's Place,""Blue Sea, LA""";
run;
data want ;
set have ;
length want $50;
infile cards dsd truncover ;
input @1 @ ;
_infile_ = have;
do i=1 to 3;
input want @;
output ;
end;
cards;
;
proc print; run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.