Hi, I am using a data step to import data. The problem is that I have two lines with similar content in the input file. The only difference is that in the first row the column has no space at the end, in the second row there is a space at the end in the same field. After the import the space is removed from the second row, it is somehow trimmed without me doing this. Anybody knows what to do? Thx in advance!
DATA TEST.IMPORTNAME;
LENGTH
Fieldname $ 10
FORMAT
Fieldname $CHAR10.;
INFORMAT
Fieldname $CHAR10.;
INFILE '/opt/data/Filename.csv'
LRECL=500
MISSOVER
firstobs=2
DSD;
INPUT
Fieldname : $CHAR10.;
RUN;
It's not removed at all, there are actually more blanks than the beginner will expect:
DATA WORK.TEST;
LENGTH
Fieldname $ 10
Fieldname_2 $ 10;
FORMAT
Fieldname $CHAR10.
Fieldname_2 $CHAR10.;
INFORMAT
Fieldname $CHAR10.
Fieldname_2 $CHAR10.;
INFILE '$HOME/sascommunity/testfile.csv'
LRECL=500
MISSOVER DLM=','
firstobs=2
DSD;
INPUT
Fieldname : $CHAR10.
Fieldname_2 : $CHAR10.;
RUN;
proc print data=test noobs;
format fieldname fieldname_2 $hex20.;
run;
I uploaded your csv in binary mode to my home on the server, and ran the above code. This is the result:
Fieldname Fieldname_2 42343539352020202020 54657374202020202020 42343539352020202020 54657374202020202020
You can see the blanks (hex 20) with which the character variables are padded.
A csv file should use commas as delimiter; if that is the case with your file, you need to add dlm=',' in the infile statement.
But be aware that character variables in SAS are always padded to their full length with blanks anyway. You could only lose a blank if it was the 11th character.
@Jakees wrote:
Hi, I am using a data step to import data. The problem is that I have two lines with similar content in the input file. The only difference is that in the first row the column has no space at the end, in the second row there is a space at the end in the same field. After the import the space is removed from the second row, it is somehow trimmed without me doing this. Anybody knows what to do? Thx in advance!
DATA TEST.IMPORTNAME; LENGTH Fieldname $ 10 FORMAT Fieldname $CHAR10.; INFORMAT Fieldname $CHAR10.; INFILE '/opt/data/Filename.csv' LRECL=500 MISSOVER firstobs=2 DSD; INPUT Fieldname : $CHAR10.; RUN;
> be aware that character variables in SAS are always padded to their full length with blanks anyway. You could only lose a blank if it was the 11th character.
Exactly, so I struggle to imagine what is removed.
Hi thx for your quick response. I added the DLM (without the import was also working). The value B4595 is 6 long (with the space). You can see there is a space in the CSV file. After the import the space is removed in the Work.test table.
I attached the file and I use the following code:
DATA WORK.TEST;
LENGTH
Fieldname $ 10
Fieldname_2 $ 10;
FORMAT
Fieldname $CHAR10.
Fieldname_2 $CHAR10.;
INFORMAT
Fieldname $CHAR10.
Fieldname_2 $CHAR10.;
INFILE '/opt/sas/data/testfile.csv'
LRECL=500
MISSOVER
DLM=','
firstobs=2
DSD;
INPUT
Fieldname : $CHAR10.
Fieldname_2 : $CHAR10.;
RUN;
It's not removed at all, there are actually more blanks than the beginner will expect:
DATA WORK.TEST;
LENGTH
Fieldname $ 10
Fieldname_2 $ 10;
FORMAT
Fieldname $CHAR10.
Fieldname_2 $CHAR10.;
INFORMAT
Fieldname $CHAR10.
Fieldname_2 $CHAR10.;
INFILE '$HOME/sascommunity/testfile.csv'
LRECL=500
MISSOVER DLM=','
firstobs=2
DSD;
INPUT
Fieldname : $CHAR10.
Fieldname_2 : $CHAR10.;
RUN;
proc print data=test noobs;
format fieldname fieldname_2 $hex20.;
run;
I uploaded your csv in binary mode to my home on the server, and ran the above code. This is the result:
Fieldname Fieldname_2 42343539352020202020 54657374202020202020 42343539352020202020 54657374202020202020
You can see the blanks (hex 20) with which the character variables are padded.
Hi Kurt, the 2 lines have the same hex value which is exactly the problem. Two different values are treated as the same. A value with and without space should have than a different hex value which is not the case. Also when I run a query builder and select distinct rows I get 1 row.
The problem is that I receive from a supplier this value as a primary key. I think it cannot be solved by SAS because of the padding. I should ask the supplier not to accept ending spaces in primary keys.
@Jakees wrote:
I should ask the supplier not to accept ending spaces in primary keys.
Ask them what they are smoking, and where to get it. It seems to be prime stuff, but one should only use it at parties.
The reason why I'm being so hard: in all types of environment, blanks are considered a separator between items. In natural written language, they separate words. In programming, they separate code elements. HTML (and all other markup systems, eg the ones used internally in word processors) has the concept of "white space" which can be transparently manipulated as needed by a given layout. Operating systems use blanks as the primary separator, so having them in unexpected places (eg in filenames) forces one to work around that (leading to unnecessary quoting).
Having trailing blanks in a crucial data item only works in environments that store strings in a variable length with a terminating special character (as C does it with hex 00) or a preceding length indicator (see the varchar formats in databases).
But I've never seen a variable-length primary key or one with blanks. Primary keys should be fixed length strings without blanks, or UUID's.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.