pipe delimited file

Occasional Contributor
Posts: 9

Hi!

I'm trying to read a pipe-delimited file. Some of the character variables have values that contain one or more blanks. Although I specified delimiter='|' in the INFILE statement, SAS still seems to treat a blank as a secondary delimiter. It separates the variables correctly, but each value is cut off after the first blank. How can I fix it?

Thanks

Akilees


Accepted Solutions
Solution
‎10-15-2013 04:59 PM
Regular Contributor
Posts: 245

Re: pipe delimited file

akilees wrote:

Thank you! Here is the code for reading the variables. Right, I should specify the length, but I don't know how long to make it, so should I just estimate the longest possible value?

data have;

     infile '' lrecl=32767 firstobs=2 dlm='|';
     input level1 $ level2_code $ level2 $;

run;


LRECL only defines the maximum length of the whole record.  You still need to define the length of each character variable; with a plain $ in list input, a character variable defaults to 8 bytes.  For example:

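data have;
    /* "a" is an unquoted name, so SAS treats it as a fileref assigned earlier (FILENAME statement) */
    infile a lrecl=32767 firstobs=2 dlm='|';
    /* the colon modifier reads each field up to the next delimiter and takes the length from the informat */
    input level1 :$15. level2_code :$3. level2 :$15.;
run;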

(note the colons)

Or

data have;

infile a lrecl=32767 firstobs=2 dlm='|';

informat level1 level2 $15. level2_code $3.;

input level1 $ level2_code $ level2 $;

run;

Or use a LENGTH or FORMAT statement in the same way.  LENGTH is somewhat better if you're only defining lengths and your variables don't need informats (all plain character); INFORMAT is somewhat more common when some variables do need informats (date variables etc.), and it matches what you see from PROC IMPORT.
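The LENGTH variant mentioned above could look roughly like the sketch below. It reads in-stream data instead of the original file so it can run on its own; the sample rows are made up, and the lengths (15 and 3) are simply the ones used in the examples above.

```sas
data have;
    /* LENGTH comes before INPUT so the lengths are fixed before the variables are first read */
    length level1 level2 $15 level2_code $3;
    infile datalines dlm='|' truncover;
    input level1 $ level2_code $ level2 $;
    datalines;
North America|NA1|United States
Europe|EU2|United Kingdom
;
run;
```

One side effect to be aware of: the variables are stored in the order they appear in the LENGTH statement (level1, level2, level2_code), not in the order of the INPUT statement.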

One trick is to run PROC IMPORT first, see which lengths it chooses (look at the generated code in your log), and borrow those.
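A minimal sketch of that trick, assuming a pipe-delimited file with a header row. The fileref demo, the sample rows, and the output name work.guess are invented for illustration; in practice DATAFILE= would point at the real file.

```sas
/* write a small pipe-delimited sample to a temporary file (illustrative data only) */
filename demo temp;
data _null_;
    file demo;
    put 'level1|level2_code|level2';
    put 'North America|NA1|United States';
    put 'Europe|EU2|United Kingdom';
run;

/* let PROC IMPORT guess the variable types and lengths */
proc import datafile=demo out=work.guess dbms=dlm replace;
    delimiter='|';
    getnames=yes;
    guessingrows=100;   /* number of rows scanned when guessing lengths */
run;
```

For delimited files, PROC IMPORT writes the DATA step it generated to the log, including the INFORMAT and INPUT statements, which can be copied into your own code and adjusted.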


All Replies
Super User
Posts: 10,874

Re: pipe delimited file

The option DSD on an infile statement usually works for this. The full code of what you're attempting would help.

It might also be that your character variables are defaulting to 8 characters and need to be longer.
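As a rough illustration of what DSD changes, the sketch below reads two made-up records, one of which has an empty middle field. The variable names follow the thread; the data values and the TRUNCOVER option are assumptions added for the example.

```sas
data demo;
    /* DSD treats two consecutive delimiters as a missing value and strips quotes around values. */
    /* Its default delimiter is a comma, so DLM='|' is still needed.                            */
    infile datalines dlm='|' dsd truncover;
    input level1 :$15. level2_code :$3. level2 :$15.;
    datalines;
North America||United States
Europe|EU2|United Kingdom
;
run;
```

Without DSD, the two adjacent pipes in the first record would be treated as a single delimiter, so the remaining values would shift one variable to the left.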

Occasional Contributor
Posts: 9

Re: pipe delimited file

Thank you! Here is the code for reading the variables. Right, I should specify the length, but I don't know how long to make it, so should I just estimate the longest possible value?

data have;

     infile '' lrecl=32767 firstobs=2 dlm='|';
     input level1 $ level2_code $ level2 $;

run;


Occasional Contributor
Posts: 9

Re: pipe delimited file

Thank you Snoopy!  I got the character length from the import procedure.

Occasional Contributor
Posts: 9

Re: pipe delimited file

BTW, if PROC IMPORT can do the job, what is a DATA step with INFILE used for?  I guess both of them can read raw data files and let you specify the delimiter.  I tried to search for the difference between them but didn't get a clear idea.

Regular Contributor
Posts: 245

Re: pipe delimited file

A DATA step read gives you more control than PROC IMPORT, so if PROC IMPORT guesses wrong, or doesn't give you the formats/informats you need, you should use a DATA step read.  In my production code I typically use PROC IMPORT only as a helper (to give me a starting point), as I worry it may yield inconsistent results.  But if you're okay with that possibility (this is a one-time run, for example, or minor inconsistencies in the length of text variables and such aren't important) and you're not doing complicated things with formats, you can just use PROC IMPORT.
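To make the "more control" point concrete, here is a small sketch in which each field gets an explicit informat. The columns start_date and amount and the sample rows are hypothetical; they are not part of the original file.

```sas
data demo;
    infile datalines dlm='|' dsd truncover;
    /* each field is read with the informat chosen here, not one guessed from the data */
    input level1 :$15. start_date :yymmdd10. amount :comma12.;
    format start_date date9.;
    datalines;
North America|2013-10-15|1,234
Europe|2013-09-01|987
;
run;
```

Because the informats are written out, the variable types and lengths stay the same from run to run, whatever the incoming data looks like.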

Occasional Contributor
Posts: 9

Re: pipe delimited file

Thanks for sharing!

Occasional Contributor
Posts: 9

Re: pipe delimited file

But I just realized that the length has already been assigned by lrecl=32767, right?  Why is it still truncating?

Respected Advisor
Posts: 4,756

Re: pipe delimited file

Try

data have;

     length level1 level2_code level2 $32;

     infile '' lrecl=32767 firstobs=2 dlm='|' missover;
     input level1 level2_code level2;

run;
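For comparison, the sketch below shows what happens without the LENGTH statement. LRECL only limits how much of each record is read; it does not set the length of any variable, and a character variable read with plain list input defaults to 8 bytes. The sample record is made up.

```sas
data short;
    infile datalines dlm='|';
    /* no LENGTH or informat, so both variables get the default length of 8 */
    input level1 $ level2 $;
    put level1= level2=;   /* expected in the log: level1=North Am level2=United S */
    datalines;
North America|United States
;
run;
```

The cut-off falls at 8 characters, not at the blank itself; with values like these it just happens to land close to one.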

PG
