About ZFreeman

ZFreeman · ‎11-15-2012

Thanks a lot Arthur. I'm trying to avoid manually editing code from the log if possible, but this seems to be the best solution. I appreciate the help! I found another conversation with some similarities to this one, and it doesn't seem like there was ever a clean resolution either.

ZFreeman · ‎11-15-2012

Art, thanks very much for your input. I didn't realize that the GUESSINGROWS maximum had been raised from 32,767 rows in 9.2 to 2,147,483,647 rows in 9.3. Unfortunately, PROC IMPORT has at least one limitation that gives me pause: it drops leading zeros when it reads a variable that it has decided is numeric. For example, the values "00123" and "0123" should remain distinct in the imported dataset, but both are imported as 123. I understand if there is no easy answer, but is there a way either to solve that issue with PROC IMPORT or to solve the original issue with the infile statement?
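
For the leading-zero concern, one possible workaround is to skip PROC IMPORT for that file and declare the variable as character in a data step, so that SAS never converts it to a number. The sketch below is only an illustration: the variable name id, its width of 10, and the comma delimiter are assumptions, and "Filepath.txt" is a placeholder path.

```sas
/* Sketch: keep leading zeros by reading the field as character.   */
/* The name id and the width 10 are hypothetical; adjust to fit.   */
data ids;
    infile "Filepath.txt" delimiter=',' missover dsd lrecl=32767;
    length id $ 10;   /* character type, so "00123" and "0123" stay distinct */
    input id $;
run;
```

Because id is stored as text, the two values are kept exactly as they appear in the file. Any numeric use of the variable would then need an explicit conversion, for example with the input() function.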

ZFreeman · ‎11-15-2012

Hello - first time poster!

Using the infile statement, I'm importing comma-delimited text files with millions of observations, and I want to ensure that I'm not truncating any character variables.

My first thought was that, after reading in the data using an estimated set of widths, I would calculate the maximum length of the values of each character variable and compare that maximum to the initial width. For example, if I set a width of $10. for charvar1 and one of the charvar1 values is "Hello worl", then the max length would be 10, which suggests I should rerun my import using a larger width, perhaps $15. At that point the charvar1 value would be "Hello world" and the max length would be 11, below the new width, suggesting I can stop.

The issue I ran into is that this doesn't work when my variable value is "Hello Bob McDonald". If I read it in using $10., the stored value is "Hello Bob" followed by a blank, so my max length could be 9, giving a false sense of assurance that I'm not truncating anything. One solution would be to read in charvar1 so that it includes trailing blanks, making the test string length 10 in the Bob McDonald case, but I'm not sure if that's feasible.

To make things concrete, below is a dataset showing the issue. Thanks!

I have a text file with one line and the following text (I've also attached it):

    The quick brown; fox

My data step is as follows.

    data test;
        /* Both variables are read and displayed with a width of 5 */
        informat testvar1 $CHAR5. testvar2 $CHAR5.;
        format testvar1 $5. testvar2 $5.;
        infile "Filepath.txt" delimiter = ';' missover dsd lrecl = 32767;
        input testvar1 $ testvar2 $;
        /* LENGTH ignores trailing blanks */
        lenvar1 = length(testvar1);
        lenvar2 = length(testvar2);
    run;
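
One way to sidestep the trailing-blank ambiguity might be to read each field into a deliberately oversized temporary variable first, and only then compare the observed lengths with the widths you intend to use. The sketch below assumes the same one-line file and semicolon delimiter as the post. The names wide1, wide2, and check and the bound of 200 are hypothetical choices, not anything from the thread.

```sas
/* Sketch: measure field lengths with a generous width before     */
/* settling on the final informats. The bound of 200 is assumed.  */
data check;
    infile "Filepath.txt" delimiter=';' missover dsd lrecl=32767;
    length wide1 wide2 $ 200;   /* assumed upper bound on field width */
    input wide1 $ wide2 $;
    len1 = lengthn(wide1);      /* LENGTHN returns 0 for a blank value */
    len2 = lengthn(wide2);
run;

/* Longest observed value per field */
proc sql;
    select max(len1) as maxlen1,
           max(len2) as maxlen2
    from check;
quit;
```

If a reported maximum comes out well under the bound, a width at least that large should not truncate that field. The same ambiguity returns if a maximum lands at or just below the bound, so in that case the bound would need to be raised and the check rerun.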

Date Last Visited	‎09-01-2015 07:11 AM

Re: Reading in Trailing Blanks in Delimited Text Files
