akilees
Calcite | Level 5

Hi!

I'm trying to read a pipe-delimited file. Some of the character values contain one or more blanks. Although I specified delimiter='|' on the INFILE statement, SAS still seems to treat a blank as a secondary delimiter: each variable is split correctly, but every value is cut off after the first blank. How can I fix it?

Thanks

Akilees

1 ACCEPTED SOLUTION
snoopy369
Barite | Level 11

akilees wrote:

Thank you! Here is the code for reading the variables. Right, I should specify the length, but I don't know how long to make it. Should I just estimate the longest possible value?

data have;

     infile '' lrecl=32767 firstobs=2 dlm='|';
     input level1 $ level2_code $ level2 $;

run;


LRECL only defines the length of the whole record. You still need to define the lengths of your character variables; otherwise a variable read with a plain $ defaults to 8 characters. For example:

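data have;
     /* "a" is a fileref that points to the pipe-delimited file */
     infile a lrecl=32767 firstobs=2 dlm='|';
     input level1 :$15. level2_code :$3. level2 :$15.;
run;

(Note the colons: the colon modifier tells SAS to apply the informat but still stop reading at the delimiter.)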

Or

data have;
     infile a lrecl=32767 firstobs=2 dlm='|';
     /* the INFORMAT statement comes before INPUT, so it also sets the lengths */
     informat level1 level2 $15. level2_code $3.;
     input level1 $ level2_code $ level2 $;
run;

Or use a LENGTH or FORMAT statement in the same way. LENGTH is somewhat better if you're only defining lengths and don't need informats (all plain character variables); INFORMAT is more common when some variables do need informats (date variables, etc.), and it matches the code that PROC IMPORT generates.

One trick is to run PROC IMPORT first, see which lengths it chooses, and borrow them (look at your log, where PROC IMPORT writes the DATA step it generated).
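
A minimal sketch of that trick, assuming the file has a header row (the firstobs=2 in the code above suggests it does). The path and the output data set name have_guess are placeholders, not taken from the original post:

```sas
/* Hypothetical path: replace it with the real location of the file. */
proc import datafile='/path/to/file.txt'
            out=work.have_guess      /* placeholder data set name */
            dbms=dlm
            replace;
     delimiter='|';
     getnames=yes;                   /* first row holds the column names */
     guessingrows=max;               /* scan every row before choosing lengths;
                                        use a number on older releases */
run;

/* List the lengths and informats that PROC IMPORT settled on. */
proc contents data=work.have_guess;
run;
```

The lengths reported by PROC CONTENTS (or the INFORMAT statements in the generated code in the log) can then be copied into a hand-written DATA step.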

REPLIES
ballardw
Super User

The option DSD on an infile statement usually works for this. The full code of what you're attempting would help.

It might also be that your character variables are defaulting to 8 characters and need to be longer.
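
Both points can be seen in a small sketch with made-up inline data. The sample row is hypothetical (its middle field is left empty on purpose), and TRUNCOVER is an extra option added here, not something from the original code:

```sas
/* Hypothetical sample row, not from the original file. */
data demo;
     /* DSD treats two consecutive delimiters as a missing value;
        TRUNCOVER stops a short line from pulling in the next record. */
     infile datalines dlm='|' dsd truncover;
     /* No LENGTH statement or informat: each $ variable defaults to
        a length of 8, so longer values are truncated. */
     input level1 $ level2_code $ level2 $;
datalines;
North America||United States
;
run;

proc print data=demo;
run;
```

With DSD, level2_code should come out missing instead of picking up the third field, and level1 should be stored as "North Am". The embedded blank survives, and the value is cut at 8 characters, which can look like a cut at the blank when the first word is short.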

akilees
Calcite | Level 5

Thank you! Here is the code for reading the variables. Right, I should specify the length, but I don't know how long to make it. Should I just estimate the longest possible value?

data have;

     infile '' lrecl=32767 firstobs=2 dlm='|';
     input level1 $ level2_code $ level2 $;

run;


akilees
Calcite | Level 5

Thank you Snoopy!  I got the character length from the import procedure.

akilees
Calcite | Level 5

BTW, if PROC IMPORT can do the job, what is a DATA step with INFILE used for? I guess both of them can read raw data files and let you specify the delimiter. I searched for the difference between them but didn't get a clear idea.

snoopy369
Barite | Level 11

Reading the file with a DATA step gives you more control than PROC IMPORT, so if PROC IMPORT guesses wrong, or doesn't give you the formats/informats you need, use a DATA step. In my production code I typically use PROC IMPORT only as a helper (to give me a starting point), because I worry it may yield inconsistent results from one run to the next. If you're okay with that possibility (this is a one-time run, for example, or minor inconsistencies in the lengths of text variables aren't important) and you're not doing anything complicated with formats, you can just use PROC IMPORT.
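
As a rough illustration of that extra control, here is a sketch in which the DATA step fixes the lengths and reads a date column with an explicit informat. The start_date column and the sample row are hypothetical additions for the example, not part of the original file:

```sas
data demo_control;
     infile datalines dlm='|' dsd truncover;
     /* Lengths and the date informat are set explicitly,
        so they do not depend on what the data happen to contain. */
     informat level1 $15. level2_code $3. start_date yymmdd10.;
     format start_date date9.;        /* display the date as ddMMMyyyy */
     input level1 $ level2_code $ start_date;
datalines;
North America|NA|2024-04-17
;
run;
```

Because nothing is guessed from the data, the result has the same variable attributes on every run, which is the consistency PROC IMPORT cannot promise.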

akilees
Calcite | Level 5

Thanks for sharing!

akilees
Calcite | Level 5

But I just realized that the length was already assigned by lrecl=32767, right? Why is it still truncating?

PGStats
Opal | Level 21

Try

data have;
     /* LENGTH comes before INPUT, so all three variables are character, 32 wide */
     length level1 level2_code level2 $32;
     infile '' lrecl=32767 firstobs=2 dlm='|' missover;
     input level1 level2_code level2;
run;

PG