Hi Guys,
Need some help please!
I am trying to import a .dat file into SAS using infile.
The data is pipe(|) delimited. However some of the data is like the following: a|b|c|d|"Mr John|Smith"|e|f|. As you can see the delimiter is embedded in between two data columns but in between double quotes (there are some with single quotes too). How can I get around this so that the delimiter is recognised and the data is imported correctly?
Example of my data step:
data library.file;
infile "filepath" DLM = '|' DSD MISSOVER firstobs=3;
format
var1 $99.
var2 $99.
var3 $99.
var4 $99.
firstname $99.
surname $99.
var5 $99.
var6 $99.
;
input
var1 $
var2 $
var3 $
var4 $
firstname $
surname $
var5 $
var6 $
;
run;
Thank you!
If you want want to treat the adjacent delimiters as NOT meaning one word boundary (like spaces used to align columns) then you will need to modify the line in some way. Either remove (or replace) the quotes
infile xx dsd dlm='|' ;
input @;
_infile_ = translate(_infile_,' ','"');
input ...;
or introduce something between the delimiters.
infile xx dlm='|' ;
input @;
_infile_ = tranwrd(_infile_,'||','| |');
input ...;
The purpose of quotes in delimited files it to protect the delimiters when it appears in the the value of one of the fields. The DSD option is that is enabling the dataset to parse the line.
So in a normal interpretation of the string:
a|b|c|d|"Mr John|Smith"|e|f|.
There are 8 values there:
a b c d Mr John|Smith e f .
How do you want that line to be parsed instead?
There a some combinations of strings that can confuse SAS's parsing of lines. Since SAS supports either single or double quotes around string the normal input process does also. So these two lines are the same to SAS.
a|'hi|there'|c a|"hi|there"|c
If you have messy data where there exists on the same line a value that starts with a single quote and another that ends in a single quote you could have problems. Like this:
a|b|c|d a|'hi|c|x'
Is that your situation?
Yes, you are correct.
I would like my data to be parsed like this:
a
b
c
d
Mr John
Smith
e
f
I would ideally like that the double or single quotes don't hinder the delimiter.
Would you know how to go around this please?
Remove the DSD option.
But the quotes will be part of the values.
a b c d "Mr John Smith" e f
Yeah, I did try that, but some of my data is empty/missing in between delimiters so it would simply ignore it which causes further issues 😞
Is it possible to fix the problem at the source? The values with the quotes should have been quoted.
a|b|c|d|"""Mr John"|"Smith"""|e|f|
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.