I'm importing a CSV file into SAS using INFILE / INPUT, but a special character is causing SAS to skip over the comma delimiter. Previous posts have suggested that I use SAS with Unicode Support, but I still run into the problem. I'm using SAS 9.4 with Unicode Support. My data looks something like this:
bill_address1 | bill_address2 | bill_address3 | bill_address4 | bill_address5 |
CS | GravÚ | 90050 | Honfleur | 9403 |
But when I import the data with my code, SAS ignores the comma separator between bill_address2 and bill_address3, so the two values end up in the same field.
The log shows no errors or warnings. My code is below (with "[[filepath]]" standing in for the data location):
data test;
    infile "[[filepath]]\test.csv"
        dsd missover lrecl = 32767 firstobs = 2;
    informat
        bill_address1 $2.
        bill_address2 $12.
        bill_address3 $12.
        bill_address4 $12.
        bill_address5 $12.
    ;
    input
        bill_address1 $
        bill_address2 $
        bill_address3 $
        bill_address4 $
        bill_address5 $
    ;
run;
Try dlm=','.
I tried replacing dsd with dlm = ",", but I still have the same problem.
@kiranv_ DSD already implies DLM=',' (a comma is its default delimiter), so adding it changes nothing.
@tluoskr View your file in a text editor with a hex view to determine what the special character is. Once you know, you can strip it out with TRANSLATE or COMPRESS. I suspect it's a carriage return, especially if the data was in Excel at some point and someone formatted a cell using ALT+ENTER. If that's the case, two solutions to this question were posted on the forum in the past week.
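If no hex-capable editor is at hand, the same inspection can be sketched in SAS itself. This is a minimal sketch, not a tested fix for this file: it reuses the "[[filepath]]" placeholder from the question, `obs=3` and the `$hex200.` width are arbitrary choices, and the carriage return ('0D'x) removed in the second step is only the suspected culprit, so substitute whatever byte the dump actually reveals.

```sas
/* Step 1: dump the first raw records as text and as hex to spot unusual bytes. */
data _null_;
    infile "[[filepath]]\test.csv" lrecl=32767 obs=3;
    input;                  /* load the record into _INFILE_ without parsing it */
    put _infile_;           /* the record as text */
    put _infile_ $hex200.;  /* up to 100 bytes as hex: 2C = comma, 0D = carriage return */
run;

/* Step 2: strip the offending byte from the record before parsing it. */
/* '0D'x (carriage return) is an assumption; use the byte found in step 1. */
data test;
    infile "[[filepath]]\test.csv" dsd missover lrecl=32767 firstobs=2;
    length bill_address1 $2 bill_address2-bill_address5 $12;
    input @;                               /* hold the record so it can be edited */
    _infile_ = compress(_infile_, '0D'x);  /* remove every carriage return */
    input bill_address1-bill_address5;     /* parse the cleaned record */
run;
```

The `input @;` followed by an assignment to `_infile_` lets the record be edited in the input buffer, so the second `input` parses the cleaned text rather than the original bytes.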
Using the data and code you've posted, I can't replicate the issue you describe; I get the desired result.
Can you please post some data that allows us to replicate the issue?
As others have already suggested, use a text editor like Notepad++, which can make all characters visible, and search for something unusual in the data.
Maybe you need to add the ENCODING= option.
infile "[[filepath]]\test.csv"
encoding='utf8' ..............
You are right, the encoding option works.
The encoding in the INFILE statement should match the encoding of the source (the system where the CSV was prepared).
The target SAS environment has to be set to UTF-8.
wlatin1 is one of the encodings used by Windows systems.
Example:
infile "[[filepath]]\test.csv" encoding='wlatin1'
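Putting the pieces of this thread together, a complete step might look like the sketch below. It assumes the CSV was saved in wlatin1, which the thread does not confirm, so substitute the file's real encoding. A likely explanation for the original symptom is that, in a UTF-8 session, a single-byte accented character from a Windows-encoded file is taken as the start of a multi-byte character, so the bytes that follow it (including the comma) get absorbed into it.

```sas
/* Print the session encoding to the log. */
%put &=sysencoding;

data test;
    /* ENCODING= names the encoding of the file itself, so SAS transcodes the
       data to the session encoding as it reads. 'wlatin1' is an assumption;
       use the encoding the CSV was actually saved in. */
    infile "[[filepath]]\test.csv" encoding='wlatin1'
        dsd missover lrecl=32767 firstobs=2;
    length bill_address1 $2 bill_address2-bill_address5 $12;
    input bill_address1-bill_address5;
run;
```

In a UTF-8 session an accented character can occupy more than one byte, so character lengths such as `$12` may need to be larger than the number of visible characters.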