Hello - first time poster!

Using the infile engine I'm importing comma-delimited text files with millions of observations, and I want to ensure that I'm not truncating any character variables.

My first thought was that after reading in the data using an estimated set of widths, I would calculate the maximum length of the values of each character variable and compare that max length to the initial width. So, for example, if I set a width of $10. for charvar1 and one of the charvar1 values is "Hello worl", then the max length would be 10, equal to the width, which would suggest I should rerun my import using a larger width, perhaps $15. At that point the charvar1 value would be "Hello world" and the max length would be 11, below the width of 15, suggesting I can stop.

The issue I ran into is that this doesn't work when my variable value is "Hello Bob McDonald". If I read it in using $10., the stored value is "Hello Bob " and, because LENGTH ignores trailing blanks, my max length could be 9, generating a false sense of assurance that I'm not truncating anything. One solution would be to read in charvar1 such that trailing blanks are counted, so that the test string length is 10 in the Bob McDonald case, but I'm not sure if that's feasible.

To make things concrete, below is an example showing the issue. Thanks!

I have a text file with one line and the following text (I've also attached it):

The quick brown; fox

My data step is as follows:

data test;
    informat testvar1 $CHAR5. testvar2 $CHAR5.;
    format testvar1 $5. testvar2 $5.;
    infile "Filepath.txt" delimiter = ';' missover dsd lrecl = 32767;
    input testvar1 $ testvar2 $;
    lenvar1 = length(testvar1);
    lenvar2 = length(testvar2);
run;
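One possible way around the trailing-blank problem is to measure each field's length on the raw record, before any informat has a chance to truncate it. The following is only a minimal sketch under several assumptions: the delimiter is ';' as in the sample file, no field spans more than one line, and every record fits within lrecl=32767. The names fieldlens, field, fieldnum and rawlen are hypothetical and chosen just for illustration.

```sas
/* Sketch: record the untruncated length of every delimited field. */
data fieldlens;
    infile "Filepath.txt" lrecl=32767 truncover;
    input;                          /* load the record into _infile_ only */
    length field $32767;            /* wide enough for any single field   */
    /* 'm' keeps empty fields in position; 'q' ignores delimiters in quotes */
    do fieldnum = 1 to countw(_infile_, ';', 'mq');
        field  = scan(_infile_, fieldnum, ';', 'mq');
        rawlen = lengthn(field);    /* 0 for an empty field */
        output;
    end;
    keep fieldnum rawlen;
run;

/* Maximum raw length per field position */
proc means data=fieldlens max;
    class fieldnum;
    var rawlen;
run;
```

The maximum reported for each field position could then be used to choose the widths for the real import. Note that LENGTHN, like LENGTH, ignores trailing blanks but counts leading ones, so a field such as " fox" is measured with its leading blank. With millions of records, writing one row per field may be slow; keeping running maxima in a retained array would be a leaner variant of the same idea.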