12-03-2016 06:05 AM - edited 12-04-2016 09:42 AM
I am trying to extract a dataset to a file. The first column of the dataset contains double-byte characters. The goal is to write the first column with a fixed length of 60 and the second column with a fixed length of 10.
The problem is that after the extraction, the first field in the first and second rows is not aligned to a length of 60. Only the remaining rows, which contain pure English, are aligned correctly.
input x :$60. y $ :10.;
put @1 x $60.
    @61 y $10.;
Result of the file:
Kindly advise. Thank you very much!
12-03-2016 07:32 AM
How about this one:
data booking;
  input x :$60. y $ :10.;
datalines;
梵蒂冈hello checked
苏丹复苏 uncheck
Anderson checked
EmmaWatson checked
BradJames checked
;
proc print;
run;

data _null_;
  set booking;
  file 'c:\temp\plane_booking.txt';
  len=80;
  /* concatenate both fields and write them as one variable-length string */
  want=x||y;
  put want $varying80. len;
run;
12-03-2016 12:32 PM - edited 12-03-2016 12:34 PM
Are you asking to create a text file with Unicode characters that can take between 1 and 4 bytes each and still be able to read the second column starting at byte number 61?
Why not just create a delimited file instead? Then you do not need to worry about how many bytes or even how many characters are in each field.
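A minimal sketch of the delimited approach, assuming the BOOKING data set from the earlier reply and a UTF-8 output file. The file name plane_booking.csv is only an illustration, not something from the original post.

```sas
/* Sketch only: the output path and file name are hypothetical. */
data _null_;
  set booking;
  /* DSD with DLM=',' writes comma-separated values and quotes any value that contains the delimiter */
  file 'c:\temp\plane_booking.csv' dsd dlm=',' encoding='utf-8';
  put x y;
run;
```

Reading the file back with the same DSD and DLM= options on an INFILE statement should recover both fields no matter how many bytes each character takes.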
12-03-2016 01:51 PM - edited 12-03-2016 02:07 PM
Is SAS putting in too many spaces or not enough?
Either way you should be able to adjust using the difference between the number of characters and the number of bytes in the string.
data test;
  length string $200 ;
  infile cards truncover ;
  input string ;
  Nbytes = length(string);      /* length in bytes */
  Nchars = klength(string);     /* length in characters */
  Difference = Nbytes - Nchars ;
cards;
ルアドhello
ルアドレス
Anderson
EmmaWatson
BradJames
;
proc print;
run;
So let's output this as fixed length and see what happens. To make it easier to see I will change the spaces in the string to periods and append carets for the extra padding.
data _null_;
  file 'testu8.txt' encoding=utf8;
  length blanks $200 ;
  blanks = repeat('^',199);
  set test ;
  /* make the spaces visible as periods */
  string=ktranslate(string,'.',' ');
  if _n_=1 then put 'NBYTES|NCHARS|DIFF|STRING';
  put nbytes 6. '|' nchars 6. '|' difference 4. '|' @ ;
  /* 15 bytes of STRING, then DIFFERENCE carets as extra padding */
  put string $15. blanks $varying15. difference '|' ;
run;
Here is what it looks like in Windows WordPad.
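The same padding idea could be applied to the 60/10 layout from the original question. This is a sketch only: it assumes the BOOKING data set from the earlier reply and a UTF-8 SAS session so that KLENGTH() counts characters rather than bytes. The variables PAD and EXTRA are illustrative names, not from the original posts.

```sas
/* Sketch only: assumes a UTF-8 session; PAD and EXTRA are hypothetical names. */
data _null_;
  set booking;
  file 'c:\temp\plane_booking.txt' encoding='utf-8';
  length pad $200;
  pad = ' ';                        /* blanks used as the extra padding */
  extra = length(x) - klength(x);   /* bytes minus characters, 0 for pure ASCII */
  /* 60 bytes of X plus EXTRA blanks gives 60 characters, then Y in 10 */
  put x $60. pad $varying200. extra y $10.;
run;
```

With this layout the second field starts at character 61 rather than byte 61, so a reader that counts bytes will still see it shift. Editors that draw CJK characters at double width may also still show the columns as visually uneven.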
To avoid chopping a character in the middle, use KSUBSTR() to reduce the number of characters until the number of bytes fits within your output field width.
data fix;
  set test;
  /* drop one character at a time until the value fits in 13 bytes */
  do nchars=nchars to 1 by -1 until(nbytes <= 13) ;
    string2 = ksubstr(string,1,nchars);
    nbytes = length(string2);
    put string= string2= nchars= nbytes= ;
  end;
run;