BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
stataq
Quartz | Level 8

Hello,

 

I would like to get the variables that contain "_raw" in my folder. I tried  to utilized vcolumn but got following error message. Could anyone told me how to work around it? 

 

data varlst;
	set sashelp.vcolumn;
 if  libname = 'RAW' and index(name,'_RAW') ; 
	keep memname name label;
run;

ERROR: Some character data was lost during transcoding in the dataset RAW.XXX. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding.
ERROR: SQL View SASHELP.VCOLUMN could not be processed.
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

As an end user in a server-based environment, you can't change the session encoding, but the SAS administrator can provide one for you.

And you will need one to read that particular (and probably a lot others) dataset.

Keep in mind that someone was able to create that dataset in your organisation, so there must be a UTF-encoded workspace server available.

View solution in original post

9 REPLIES 9
Kurt_Bremser
Super User

You have a dataset with an encoding different from your session encoding.

You need to switch to a SAS session with an encoding matching that of the dataset.

stataq
Quartz | Level 8

@Kurt_Bremser It looks like my SAS is using WLATIN1 and the datasets used UTF-8. What should I do to overcome this difference?

Thanks.

Kurt_Bremser
Super User

Use a SAS session configured for UTF-8. How you do this depends on your SAS setup (local or client/server). When server-based, you need to consult your SAS admin.

Tom
Super User Tom
Super User

Not sure why a query of SASHELP.VCOLUMN would cause SAS to check any of the values of any variables in any of the datasets.  Is it possible you have some VIEWS that might be referencing the RAW.XXX dataset so that trying the get the metadata about which variables are in the dataset caused SAS to actually try to read the RAW.XXX dataset?

 

You could try using a WHERE statement instead of an IF condition to filter the observations before they make it into the data step.  But note that the SASHELP views will not honor WHERE conditions, so SASHELP.VCOLUMN will still try to query every open libref, not just the RAW libref.

 

So try using DICTIONARY.COLUMNS directly and see if that helps any.

proc sql;
create table varlst as
  select memname,name,label
  from dictionary.columns
  where libname = 'RAW' 
    and index(upcase(name),'_RAW')
 ; 
quit;
Kurt_Bremser
Super User

The problem is: to get the names of the variables, the DICTIONARY.COLUMNS pseudo-table has to read at least the header of each dataset.

It might even be that someone was crazy enough to use a UTF character in a name literal.

stataq
Quartz | Level 8

Using "where' instead of 'if' doesn't change the situation. 😅

 

I am using sas server and probably wont be able to change my session setup. Is it a way to convert dataset so I can still read it?

 

Tom
Super User Tom
Super User

Are you asking how to read a dataset with UTF-8 codes that cannot translate into the single byte encoding that your current SAS session is using?   (A different question than getting list of variable names from SAS metadata.)

 

You can use the ENCODING='ANY' dataset option to get the values into your data step. 

data want;
  set raw.xxx(encoding='any');

At that point you can try to craft your own logic to find the characters that cannot be transcoded and change them to something else.  For example what if you have the UTF-8 code for uppercase Greek Delta character? 

data test;
  length _raw $4;
  _raw='CE94'x;
run;

Tom_0-1729096174144.png

What character(s) in WLATIN1 would you want to translate 'CE94'x into?

You might try replacing it with the word Delta, but make sure that the variable is long enough to store the extra 3 bytes needed.

data want;
  length _raw $10 ;
  set test(encoding='any');
  _raw=tranwrd(_raw,'CE94'x,'Delta');
run;

proc print;
run;
 

Tom_0-1729096834506.png

 

stataq
Quartz | Level 8

@Tom I would still like to get the variable list from data, but I feel something in 'Label' cause the issue. 😅

Kurt_Bremser
Super User

As an end user in a server-based environment, you can't change the session encoding, but the SAS administrator can provide one for you.

And you will need one to read that particular (and probably a lot others) dataset.

Keep in mind that someone was able to create that dataset in your organisation, so there must be a UTF-encoded workspace server available.

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1629 views
  • 3 likes
  • 3 in conversation