Hi!
I'm trying to read a pipe-delimited file. For some character variables, the value contains one or more blanks. Although I specified delimiter='|' in the infile statement, it seems like SAS still treats a blank as a secondary delimiter. The result is that it successfully delimits each variable, but for each value SAS cuts off the part after the first blank. How can I fix it?
Thanks
Akilees
akilees wrote:
Thank you! Here is the code for reading the variables. Right, I should specify the length, but I don't know how long it should be, so should I just estimate the longest possible value?
data have;
infile '' lrecl=32767 firstobs=2 dlm='|';
input level1 $ level2_code $ level2 $;
run;
LRECL only sets the maximum length of the whole record. You still need to define the length of each character variable; otherwise a variable read with a plain $ defaults to 8 characters. For example:
data have;
infile a lrecl=32767 firstobs=2 dlm='|';
input level1 :$15. level2_code :$3. level2 :$15.;
run;
(Note the colons: the colon modifier applies the informat's length but still stops reading at the delimiter.)
Or
data have;
infile a lrecl=32767 firstobs=2 dlm='|';
informat level1 level2 $15. level2_code $3.;
input level1 $ level2_code $ level2 $;
run;
Or use a LENGTH or FORMAT statement similarly. LENGTH is somewhat better if you're just defining lengths and don't need informats for your variables (all plain characters); INFORMAT is somewhat more common if you do have some informats (date variables etc.), and it matches what you see from PROC IMPORT.
One trick you can use is to run PROC IMPORT, see what lengths it decides on, and borrow those (look at your log).
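A rough sketch of that trick, in case it helps. The file path and the output dataset name work.have_import are placeholders I made up, and guessingrows=1000 is just an arbitrary number of rows to scan:

```sas
/* Let PROC IMPORT guess the variable types and lengths. */
/* The path and dataset name are hypothetical; use your own. */
proc import datafile='your_file.txt'
    out=work.have_import
    dbms=dlm
    replace;
    delimiter='|';
    getnames=yes;        /* first row holds the variable names */
    guessingrows=1000;   /* rows scanned before lengths are chosen */
run;

/* List the lengths PROC IMPORT settled on. */
proc contents data=work.have_import;
run;
```

For delimited files the log also shows the data step that PROC IMPORT generated, so you can copy its informat statements into your own code.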
The option DSD on an infile statement usually works for this. The full code of what you're attempting would help.
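Something along these lines, as a sketch only. The fileref a is the one used in the other replies, and the lengths ($32 and $8) are guesses, not taken from your data:

```sas
data have;
    /* Declare lengths before INPUT; these values are guesses. */
    length level1 $32 level2_code $8 level2 $32;
    /* DSD treats two consecutive delimiters as a missing value */
    /* and strips quotation marks around quoted values.         */
    infile a lrecl=32767 firstobs=2 dlm='|' dsd truncover;
    /* The variables are already character, so no $ is needed. */
    input level1 level2_code level2;
run;
```

TRUNCOVER is there so a short last field on a line does not make SAS jump to the next record.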
It might also be that your character variables are defaulting to 8 characters and need to be longer.
Thank you! Here is the code for reading the variables. Right, I should specify the length, but I don't know how long it should be, so should I just estimate the longest possible value?
data have;
infile '' lrecl=32767 firstobs=2 dlm='|';
input level1 $ level2_code $ level2 $;
run;
Thank you Snoopy! I got the character length from the import procedure.
BTW, if PROC IMPORT can do the job, what is a data step with INFILE used for? I guess both of them can read raw data files and specify the delimiter. I tried to search for the difference between them but didn't get a clear idea.
A data step read gives you more control than PROC IMPORT, so if PROC IMPORT guesses wrong, or doesn't give you the result you need in terms of formats/informats, you should use a data step read. In my production code I typically only use PROC IMPORT as a helper (to give me a starting point), as I worry PROC IMPORT may yield inconsistent results from one file to the next. But if you're okay with that possibility (this is a one-time run, for example, or minor inconsistencies in the length of text variables aren't important) and you're not doing complicated things with formats, you can just use PROC IMPORT.
Thanks for sharing!
But I just realized that the length has been assigned with lrecl=32767, right? Why is it still truncating?
Try
data have;
length level1 level2_code level2 $32;
infile '' lrecl=32767 firstobs=2 dlm='|' missover;
input level1 level2_code level2;
run;
PG