I'm importing a CSV file (see attached file "AKNZBP_test2.CSV" with encoding ansi) into SAS (session encoding UTF-8) using a data step with INFILE.
The ansi CSV file has a hex"FF" delimiter, so I used the options delimiter="~" encoding='ansi'
But the hex"FF" delimiter seems not to be recognized!
but wy? ok, hex"FF" is not a valid character in UTF-8 encoding!
my datastep for the impot is:
data work.AKNZBP_006;
infile "/PATH/AKNZBP_test2.CSV" delimiter="FF"x encoding='ansi';
Informat NSTAT $1.
FA $1.
IDVT $17.
NLFD 2.0
NTXT1 $77.
NTXT2 $77.
NTXT3 $77.
NTXT4 $77.
NTXT5 $77.
NTXT6 $77.
NTXT7 $77.
NTXT8 $77.
NTXT9 $77.
NTXT10 $77.
NTXT11 $77.
NTXT12 $77.
NTXT13 $77.
_ANLD 6.0
_AEND 6.0
_SB $4.
;
Input NSTAT
FA
IDVT
NLFD
NTXT1
NTXT2
NTXT3
NTXT4
NTXT5
NTXT6
NTXT7
NTXT8
NTXT9
NTXT10
NTXT11
NTXT12
NTXT13
_ANLD
_AEND
_SB
;
Run;
If I change the delimiter form hex"FF" to "~" and use the infile options delimiter="~" encoding='ansi'
it works fine! I get a correct Import of the CSV File!
but is there a way to inprot the CSV file with the hex"FF" delimiter without changing it to another delimiter like "~"
thx 4 help
Günter
Seems not.
But you can change it on the fly by modifying _INFILE_ automatic variable.
Strangely enough '00'x works. Not sure why 'FF'x doesn't.
data test;
dlm='00'x;
infile 'c:\downloads\AKNZBP_test2.CSV' dsd dlm=dlm obs=3 truncover encoding=any;
input @;
put _infile_ $hex4.;
_infile_=translate(_infile_,dlm,'FF'x);
put _infile_ $hex4.;
input (var1-var3) (:$30.);
put (var:) (=:$quote.);
run;
31FF 3100 var1="1" var2="1" var3="00000000000036277" 31FF 3100 var1="1" var2="1" var3="00000000000036278" 31FF 3100 var1="1" var2="1" var3="00000000000036279" NOTE: 3 records were read from the infile 'c:\downloads\AKNZBP_test2.CSV'. The minimum record length was 74. The maximum record length was 76. NOTE: The data set WORK.TEST has 3 observations and 3 variables.
You can even ask it to figure out what byte it can use as the replacement delimiter so it could use a different one on every line.
data work.AKNZBP_006;
length dlm $1;
infile "c:\downloads\AKNZBP_test2.CSV" delimiter=dlm truncover;
length NSTAT $1 FA $1 IDVT $17 NLFD 8 NTXT1-NTXT13 $77 _ANLD 8 _AEND 8 _SB $4 ;
input @;
dlm=compress(collate(0,254),_infile_);
_infile_=translate(_infile_,dlm,'FF'x);
input NSTAT -- _SB;
run;
proc print; run;
SAS 9.4 on WINDOWS N N N N N N N N N N N N N N T T T T _ _ S I N T T T T T T T T T X X X X A A O T D L X X X X X X X X X T T T T N E _ b A F V F T T T T T T T T T 1 1 1 1 L N S s T A T D 1 2 3 4 5 6 7 8 9 0 1 2 3 D D B 1 1 1 00000000000036277 0 0 Prov. 951011 951011 0001 2 1 1 00000000000036278 0 15% Prov. 951011 951011 0001 3 1 1 00000000000036279 0 18% Prov. 951011 951011 0001 4 1 1 00000000000036280 0 3% Prov. 951011 951011 0001 5 2 1 00000000000036281 0 3% Prov. 951011 960122 0001
hej hej Tom,
thx, that seams to work well!
I have to test your solution with all our 63 datafiles!
so I use just a simple version for testing:
data work.AKNZBP_006;
dlm='00'x;
infile "/PATH/AKNZBP_test2.CSV" dlm=dlm encoding='ansi';
Informat NSTAT $1.
FA $1.
...
;
Input @;
_infile_=translate(_infile_,dlm,'FF'x);
Input
NSTAT
FA
...;
Run;
BR
Günter
Looks good.
Not sure what value you are getting out of attaching informats to the variables. I guess as a side effect it is forcing SAS to guess that you wanted to define the variables with lengths that are compatible with the widths you used on the attached informats.
hej hej Tom,
it works so far, but just for encoding='ansi'
every other encoding (like encoding='latin2') does not work!
also, in our data files we have german "ö , ä , ü ...." (not now in the sample file) and with the using of translate, this characters are not correctly converted to UTF8!
I have to check further, what is happening with this characters (hex bevor translate and after translate)
BR Günter
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.