GLO1
Fluorite | Level 6

Good afternoon,

 

I was trying to merge two datasets into one (one with measurements, the other with characteristics for the same observations). However, the DATA step was terminated after observation number 27 (there are 13000). The error shown is the following:

 

ERROR: Some character data was lost during transcoding in the dataset _EXP1_.BASCOCOG. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding.

 

After using:

%let dsn=libref.data;
%let dsid=%sysfunc(open(&dsn,i));
%put &dsn ENCODING is: %sysfunc(attrc(&dsid,encoding));

 

I got the following WARNING: Argument 1 to function ATTRC referenced by the %SYSFUNC or %QSYSFUNC macro function is out of range.

Does someone know how to fix this? I would really appreciate it.

 

Thank you in advance, looking forward to hearing from you.

 

 

1 ACCEPTED SOLUTION
jklaverstijn
Rhodochrosite | Level 12

I have seen this exact issue when migrating data from a LATIN platform to UTF-8. The same data may need more bytes in UTF-8 than in LATIN: for example, é (e with an acute accent) takes 1 byte in LATIN but 2 bytes in UTF-8. If the data set copies the length attributes for character columns from source to target, it does not take this into account, and the longer values get truncated. If you examine your observation 27, you will likely find the culprit.

 

SAS has a solution for this: the CVP (character variable padding) engine. It expands the lengths of character columns by a set factor so that the transcoded values fit. You would use it like this:

libname source cvp 'Source-data-library';   
libname target 'Target-data-library';   
proc copy noclone in=source out=target;   
run;
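
As a hedged sketch of how the same idea might apply to the failing DATA step in this thread: the CVP engine can be put on the libref that the step reads from, without copying the whole library first. The libref `src`, the path placeholder, and the multiplier value are assumptions for illustration. `CVPMULTIPLIER=` is optional; if omitted, SAS uses its default factor (1.5, as far as I recall from the documentation).

```sas
/* Assumption: 'Source-data-library' stands for the folder behind _EXP1_. */
/* CVP is a read-only engine, so it goes on the input libref only.        */
libname src cvp 'Source-data-library' cvpmultiplier=2.5;

/* Character lengths are expanded on read, so the transcoded values fit. */
data work.expbascog_carolina;
  set src.bascocog;
run;
```

The expanded lengths carry over to the output data set, so a larger multiplier costs some storage; a smaller factor may be enough when only a few accented characters are involved.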

Hope this helps,

- Jan.


8 REPLIES
PaigeMiller
Diamond | Level 26

After this line

%let dsid=%sysfunc(open(&dsn,i));

insert and run this line

 

%put &=dsid;

What does it write to the log?

--
Paige Miller
GLO1
Fluorite | Level 6
Thank you for your fast reply. The log shows the following;

/* Show the encoding value for the problematic data set */
137 %let dsn=libref.expbascog_carolina;
138 %let dsid=%sysfunc(open(&dsn,i));
139 %put &=dsid;
DSID=0
140 %put &dsn ENCODING is: %sysfunc(attrc(&dsid,encoding));
WARNING: Argument 1 to function ATTRC referenced by the %SYSFUNC or %QSYSFUNC macro function
is out of range.
libref.expbascog_carolina ENCODING is:
141 %let rc=%sysfunc(close(&dsid));

r_behata
Barite | Level 11

No issues with the code. You may have to replace the 'libref' with the actual library name in your code.

17         %let dsn=sashelp.class;
18         %let dsid=%sysfunc(open(&dsn,i));
19         %put &dsid.;
4
20         %put &dsn ENCODING is: %sysfunc(attrc(&dsid,encoding));
sashelp.class ENCODING is: us-ascii  ASCII (ANSI)
21         %let cl=%sysfunc(close(&dsid));
PaigeMiller
Diamond | Level 26

@GLO1 wrote:

138 %let dsid=%sysfunc(open(&dsn,i));
139 %put &=dsid;
DSID=0
140 %put &dsn ENCODING is: %sysfunc(attrc(&dsid,encoding));
WARNING: Argument 1 to function ATTRC referenced by the %SYSFUNC or %QSYSFUNC macro function
is out of range.


According to the documentation for the OPEN function

OPEN returns 0 if the data set could not be opened.


So that's why the next line fails. LIBREF.DATA cannot be opened. Perhaps because it does not exist.
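
A minimal sketch of how the check could be guarded so that ATTRC is only called when OPEN succeeds. The macro name `show_encoding` is hypothetical; OPEN, ATTRC, CLOSE, and SYSMSG are standard SAS functions.

```sas
/* Hypothetical helper: print the data set encoding only if OPEN succeeds. */
%macro show_encoding(dsn);
  %local dsid rc;
  %let dsid=%sysfunc(open(&dsn,i));
  %if &dsid = 0 %then %do;
    /* OPEN returned 0: SYSMSG() holds the reason (for example, an */
    /* unassigned libref or a data set that does not exist).       */
    %put NOTE: Could not open &dsn.. %sysfunc(sysmsg());
  %end;
  %else %do;
    %put &dsn ENCODING is: %sysfunc(attrc(&dsid,encoding));
    /* Always release the data set identifier when done. */
    %let rc=%sysfunc(close(&dsid));
  %end;
%mend show_encoding;

%show_encoding(sashelp.class)
%show_encoding(_EXP1_.bascocog)
```

With a guard like this, a wrong libref or data set name shows up as a readable message instead of the "out of range" warning from ATTRC.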

--
Paige Miller
Ksharp
Super User

Paige is right. I had no problem.

69         %let dsn=sashelp.class;
 70         %let dsid=%sysfunc(open(&dsn));
 71         %let encoding=%sysfunc(attrc(&dsid,encoding));
 72         %let dsid=%sysfunc(close(&dsid));
 73         
 74         %put Table &dsn  encoding is &encoding ;
 Table sashelp.class  encoding is us-ascii  ASCII (ANSI)

 

GLO1
Fluorite | Level 6

Ok, maybe I should start from the beginning. This is the actual problem:

 

165 data expcog_carolina; /*copy CAROLINA EXP1_coga1d and EXP1_bascocog to workfile*/
166 set _EXP1_.coga1d;
NOTE: Data file _EXP1_.COGA1D.DATA is in a format that is native to another host, or the file
encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.
167 run;

NOTE: There were 13078 observations read from the data set _EXP1_.COGA1D.
NOTE: The data set WORK.EXPCOG_CAROLINA has 13078 observations and 186 variables.
NOTE: DATA statement used (Total process time):
real time 0.22 seconds
cpu time 0.18 seconds


169 data expbascog_carolina;
170 set _EXP1_.bascocog;
NOTE: Data file _EXP1_.BASCOCOG.DATA is in a format that is native to another host, or the
file encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.
171 run;

ERROR: Some character data was lost during transcoding in the dataset _EXP1_.BASCOCOG. Either
the data contains characters that are not representable in the new encoding or
truncation occurred during transcoding.
NOTE: The DATA step has been abnormally terminated.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 27 observations read from the data set _EXP1_.BASCOCOG.
WARNING: The data set WORK.EXPBASCOG_CAROLINA may be incomplete. When this step was stopped
there were 27 observations and 156 variables.
WARNING: Data set WORK.EXPBASCOG_CAROLINA was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds
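
The CEDA note in the log above suggests that the file encoding may differ from the session encoding. One way to confirm this, sketched under the assumption that the `_EXP1_` libref is assigned as in the log:

```sas
/* Encoding of the running SAS session (automatic macro variable). */
%put NOTE: Session encoding is &sysencoding;

/* The "Encoding" attribute in the PROC CONTENTS output shows the */
/* encoding the data set was created with.                        */
proc contents data=_EXP1_.bascocog;
run;
```

If the two encodings differ (for example a LATIN data set read in a UTF-8 session), character values are transcoded on read and may no longer fit their declared lengths.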

 

 

GLO1
Fluorite | Level 6

Your solution works, thank you very much for your help!
