BookmarkSubscribeRSS Feed
Feefee
Calcite | Level 5

Hi Guys,

 

Need some help please! 

 

I am trying to import a .dat file into SAS using infile. 

 

The data is pipe(|) delimited. However some of the data is like the following: a|b|c|d|"Mr John|Smith"|e|f|. As you can see the delimiter is embedded in between two data columns but in between double quotes (there are some with single quotes too). How can I get around this so that the delimiter is recognised and the data is imported correctly? 

 

Example of my data step:

 

data library.file;

        infile "filepath" DLM = '|' DSD MISSOVER firstobs=3;

        format

        var1 $99.
        var2 $99.

        var3 $99.

        var4 $99.

        firstname  $99.

        surname $99.

        var5  $99.

        var6  $99.

        ;

        input

        var1 $
        var2 $

        var3 $

        var4 $

        firstname  $

        surname $

        var5  $

        var6  $

        ;

run;

Thank you! 

8 REPLIES 8
Feefee
Calcite | Level 5
That just messes up the data further as there are some values missing in between the delimiter. Thank you for responding though!
Tom
Super User Tom
Super User

If you want want to treat the adjacent delimiters as NOT meaning one word boundary (like spaces used to align columns) then you will need to modify the line in some way.  Either remove (or replace) the quotes 

infile xx dsd dlm='|' ;
input @;
_infile_ = translate(_infile_,' ','"');
input ...;

or introduce something between the delimiters.

infile xx dlm='|' ;
input @;
_infile_ = tranwrd(_infile_,'||','| |');
input ...;

 

Tom
Super User Tom
Super User

The purpose of quotes in delimited files it to protect the delimiters when it appears in the the value of one of the fields.  The DSD option is that is enabling the dataset to parse the line.

 

So in a normal interpretation of the string:

 a|b|c|d|"Mr John|Smith"|e|f|.

There are 8 values there:

a
b
c
d
Mr John|Smith
e
f
.

How do you want that line to be parsed instead?

 

There a some combinations of strings that can confuse SAS's parsing of lines.  Since SAS supports either single or double quotes around string the normal input process does also.   So these two lines are the same to SAS.

a|'hi|there'|c
a|"hi|there"|c

If you have messy data where there exists on the same line a value that starts with a single quote and another that ends in a single quote you could have problems.  Like this:

a|b|c|d
a|'hi|c|x'

Is that your situation?

 

 

Feefee
Calcite | Level 5

Yes, you are correct.

 

I would like my data to be parsed like this:

 

a
b
c
d
Mr John
Smith
e
f

 

I would ideally like that the double or single quotes don't hinder the delimiter. 

 

Would you know how to go around this please?

Tom
Super User Tom
Super User

Remove the DSD option.

But the quotes will be part of the values.

a
b
c
d
"Mr John
Smith"
e
f
Feefee
Calcite | Level 5

Yeah, I did try that, but some of my data is empty/missing in between delimiters so it would simply ignore it which causes further issues 😞

Tom
Super User Tom
Super User

Is it possible to fix the problem at the source?  The values with the quotes should have been quoted.

a|b|c|d|"""Mr John"|"Smith"""|e|f|

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1190 views
  • 2 likes
  • 3 in conversation