Hi all,
I am trying to write a dataset out to a text file. In the dataset, the first column contains double-byte characters. The goal is to write the first column with a fixed length of 60 and the second column with a fixed length of 10.
The problem is that, after the extraction, the first field in the first and second rows is not padded to a length of 60. Only the remaining rows, which contain plain English text, line up correctly.
Sample code:
data booking;
input x :$60. y $ :10.;
datalines;
ルアドhello checked
ルアドレス uncheck
Anderson checked
EmmaWatson checked
BradJames checked
proc print;
run;
data _null_;
set booking;
file '/plane_booking.txt';
put
@1 x $60.
@61 y $10.;
run;
Result of the file:
ルアドhello checked
ルアドレス uncheck
Anderson checked
EmmaWatson checked
BradJames checked
Ideal result:
ルアドhello checked
ルアドレス uncheck
Anderson checked
EmmaWatson checked
BradJames checked
Any advice would be appreciated. Thank you very much!
How about this one:
data booking;
input x :$60. y $ :10.;
datalines;
梵蒂冈hello checked
苏丹复苏 uncheck
Anderson checked
EmmaWatson checked
BradJames checked
proc print;
run;
data _null_;
set booking;
file 'c:\temp\plane_booking.txt';
len=80;
want=x||y;
put want $varying80. len;
run;
Hi Ksharp, thanks for the reply but the result is the same.
Are you asking to create a text file with Unicode characters that can take between 1 and 4 bytes each and still be able to read the second column starting at byte number 61?
Why not just create a delimited file instead? Then you do not need to worry about how many bytes or even how many characters are in each field.
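A minimal sketch of the delimited alternative, assuming the booking data set from the original post. The output file name and the pipe delimiter are illustrative choices, not something specified in the thread.

```sas
data _null_;
  set booking;
  /* Hypothetical output path; DLM= sets the delimiter, DSD quotes any value that contains it */
  file 'plane_booking_delim.txt' dsd dlm='|' encoding='utf-8';
  /* List-style PUT writes x and y separated by the delimiter, with no fixed widths */
  put x y;
run;
```

With this layout, the reader of the file splits each line on the delimiter, so the byte length of the first field no longer matters.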
Hi Tom, yes, this is the requirement.
What happens if byte 61 falls in the middle of a multi-byte character?
Hi Tom,
It should be truncated.
Is SAS putting in too many spaces or not enough?
Either way you should be able to adjust using the difference between the number of characters and the number of bytes in the string.
data test;
length string $200 ;
infile cards truncover ;
input string ;
Nbytes = length(string);
Nchars = klength(string);
Difference = Nbytes - Nchars ;
cards;
ルアドhello
ルアドレス
Anderson
EmmaWatson
BradJames
;
proc print; run;
So let's output this as fixed length and see what happens. To make it easier to see I will change the spaces in the string to periods and append carets for the extra padding.
data _null_;
file 'testu8.txt' encoding=utf8;
length blanks $200 ;
blanks = repeat('^',199);
set test ;
string=ktranslate(string,'.',' ');
if _n_=1 then put 'NBYTES|NCHARS|DIFF|STRING';
put nbytes 6. '|' nchars 6. '|' difference 4. '|' @ ;
put string $15. blanks $varying15. difference '|' ;
run;
Here is what it looks like in Windows WordPad.
To avoid chopping a character in the middle, use KSUBSTR() to reduce the number of characters until the number of bytes fits within your output field width (13 bytes in this example).
data fix;
set test;
do nchars=nchars to 1 by -1 until(nbytes <= 13) ;
string2 = ksubstr(string,1,nchars);
nbytes = length(string2);
put string= string2= nchars= nbytes= ;
end;
run;
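Applying the same byte/character difference to the original booking layout could look roughly like the sketch below. It assumes the SAS session encoding is UTF-8 (so LENGTH() counts UTF-8 bytes), that the goal is 60 characters rather than 60 bytes for the first field, and that the booking data set from the original post exists. The output file name and the variables extra and start_y are hypothetical. Rows with a blank x are not treated specially.

```sas
data _null_;
  set booking;
  /* Hypothetical output path */
  file 'plane_booking_fixed.txt' encoding='utf-8';
  /* Extra bytes taken up by multi-byte characters in x */
  extra = length(x) - klength(x);
  /* $60. writes 60 bytes; starting y "extra" bytes later pads x out to 60 characters */
  start_y = 61 + extra;
  put @1 x $60. @start_y y $10.;
run;
```

Note that this lines the fields up by character count only. In many fixed-width fonts East Asian characters are drawn two cells wide, so the columns may still look uneven in an editor, and a reader that locates the second column by byte position (byte 61) will no longer find it there.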
That is really weird. I got this. Check attachment.