Hi all,
Say I have this data in a csv file (C:\Users\xxxx\Desktop\test.csv):
/*
co11, longstring, col3
a,one two three four,b
c,five six seven eight nine,d
*/
When I read it with an input statement, I have to specify a length (or informat) but I want it to take the length of the longest value.
If I use proc import the length is longer than the longest value. Why?
proc import datafile = 'C:\Users\xxxx\Desktop\test.csv'
out = test3
dbms = csv
replace;
getnames = yes;
run;
proc print; run;
proc contents; run; /* length = 15 */
proc sql;
select max(length(longstring)) /* = 14 */
from test3;
quit;
How would I read it so that the length is 14? I've tested this on other data too, where the longest value is 24 and the length created is 26.
Thanks.
Hi, this is an interesting question. However, I was not able to replicate your result. With your sample .csv data I obtain length 25 from both PROC CONTENTS and PROC SQL -- in perfect agreement with the fact that "five six seven eight nine" is a string consisting of 25 characters. (Actually, the names of the second and third variable were prefixed by an underscore due to the leading blanks in the first line of your data.)
Thanks for testing it. I had shortened the text when I ran the test, but I put it back to the example and now I get:
longstring Char 29 $29. $29.
longest value:
25
I checked for leading or trailing blanks (in notepad) and don't see any.
Can you perhaps attach the small sample .csv file, so that we can check if the same discrepancies occur using our SAS installations?
Here it is. change the 'txt' back to 'csv'.
Thanks again for checking.
Thanks for providing the file. Using your PROC IMPORT code I still get length 25 in both cases:
longstring Char 25 $25. $25.
I have also opened the file with a hex editor and haven't found anything suspicious.
Finally, I retyped the file with Notepad and compared it to your file with a comparison tool and found no difference.
So, it's still a mystery to me what happened on your side.
Just an idea: Could it be a DBCS issue?
What do you get from %put %sysfunc(getoption(dbcs)); ?
NODBCS
Ok, thanks.
When you provided the sample file you wrote "change the 'txt' back to 'csv'". This sounds to me as if the file you attached was not the original .csv file from your PROC IMPORT experiment. And this, in turn, would obviously open the possibility that some strange characterisic of the .csv file got lost in the process of "converting" it to a .txt file. How did you create the .txt from the .csv file?
I just copied the file and renamed it with the txt extension, since you can't attach csv files.
Ok, thanks for trying it out.
Why do you need to make it length of the longest variable out of interest, it really doesn't matter that much.
As for length of the variable this would of course be defined up front in you Data Transfer Specification agreement which will detail the strcuture of the data being sent, you do have one yes? Take the length from there.
If not, then you will need to read the file, and then post process it with some logic, howevedo suggest you look at transfer agreements as a better, more robust solution:
proc sql; create table LENGTHS as select "NAME" as NAME,max(length(NAME)) as M from SASHELP.CLASS; quit; data _null_; set lengths; call execute('data want; length '||strip(name)||' $'||strip(put(m,best.))||'.; set sashelp.class; run;'); run;
It's really just a QC issue. We are to keep the "original" length, but of course that length is going to be different depending on how the data was read. In this case the production version and the QC version don't match.
Yes, thanks, I know I can do post-processing to change it.
For our data transfer specs, length is not specified if the data comes in a csv file.
Thanks for your INPUT.
Yes, specifying the length is what I'm pushing for. And it would be consistent across projects.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.