BookmarkSubscribeRSS Feed
drshashlik
Fluorite | Level 6

Hi SAS users,

I think my problem is probably fairly simple.  I'm working with very large data that is too big for proc sql, so I'm trying hash tables.

Some of the variable names are >32 bytes.  I have tried renaming these in various stages of the process, but no luck so far.  

I tried renaming at the beginning:  

data want (rename=long_var=short_var);  (no good, 'invalid value for the rename option').

Then in the declare statement:

declare hash H (dataset: 'work.data (rename=long_var=short_var)'); No good.  seems to be too late in the process because the same error message keeps appearing (variable name>32 bytes).

 

I apologise that I can't post data or code.  The work is all done inside a secure lab and I can't bring anything outside.

Thanks for your help!

 

 

 

 

9 REPLIES 9
SASKiwi
PROC Star

SAS does not accept variable names longer than 32 characters to begin with so there is no way you can be dealing with SAS datasets here. Why do you think you have column names longer than 32 characters when SAS won't let you create such datasets?

 

If you are reading from external relational databases which have column names longer than 32 characters then these can only be read in in SQL Passthru and the resultant SAS variable name trimmed to 32 characters. Also hash tables can't be used in this scenario, only with SAS datasets in DATA steps. 

drshashlik
Fluorite | Level 6

I won't be giving you any thumbs-up for being helpful.  What a waste of your time and mine.  Better not to have replied at all.

SASKiwi
PROC Star

Actually I am trying to be genuinely helpful and I'm sorry you didn't think I was. Try running this and look at the resultant log:

34         data test;
35           My_Long_Variable_Name_longer_Than_32_Characters = 'X';
ERROR: The variable named My_Long_Variable_Name_longer_Than_32_Characters contains more than 32 characters.
36         run;

NOTE: Compression was disabled for data set WORK.TEST because compression overhead would increase the size of the data set.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TEST may be incomplete.  When this step was stopped there were 0 observations and 0 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

You can see from this that you can't create a SAS dataset with variable names longer than 32 characters. 

drshashlik
Fluorite | Level 6

@SASKiwi  I owe you an apology.  Sorry. That was not ok.

I didn't explain myself clearly.  I'm working with data that's in MS SQL Server. So you are correct that I'm not starting with a SAS database.  However, I am working in SAS, so I need to bring the data into SAS.  So I first try accessing the data with proc sql, but there is too much data. So I tried the hash table approach.  And that's where I hit the >32 bytes problem.

Is there a way to circumvent the >32 bytes problem?

SASKiwi
PROC Star

@drshashlik  - Apology accepted! As I mentioned you can use SQL Passthru to circumvent  > 32 character table and column names in SQL Server. It works something like this (replace the LIBNAME with one that works for you):

libname sqlsrvr odbc noprompt = "server=MyServerName;DRIVER=SQL Server;Trusted Connection=yes;";

proc sql;
 connect using sqlsrvr;
  create table Want  as 
  select My_Shorter_Column_Name from connection to sqlsrvr
  (SELECT My_Very_Long_Column_Name_Longer_Than_32_Chars
   FROM [MyDatabase].[MySchema].[MyTable] A
   )
  ;
quit;

 

Patrick
Opal | Level 21

@drshashlik 

SAS will shorten the variable names to 32characters when reading the data from the database into SAS (creating a SAS table). Within SAS code just use the first 32 characters as variable names and you should be fine.

The approach when interfacing with a database is to reduce data volumes on the database side as much as you can prior to loading it into SAS. 

 

You can formulate your query run against a database using SAS SQL flavor. SAS will push as much of the processing to the database (like a where condition) but this doesn't always work - i.e. when you use a SAS function in the where condition SAS can't translate into a matching database SQL function. 

The other option is to use explicit SQL pass-through. This is SQL formulated in the database flavor which SAS just send directly for execution "as is". This allows you to use any of the database functionality.

 

I can't really find sufficient information in your question to provide more ideas how to make things work for you.

mkeintz
PROC Star

@drshashlik wrote:

I won't be giving you any thumbs-up for being helpful.  What a waste of your time and mine.  Better not to have replied at all.


Why "better not to have replied at all"?   SAS just doesn't accept names longer than 32 characters, and that is the reason the "rename" parameter won't work, which is pretty important to have confirmed.

 

After all, if the rename for such long names were successful, it would belie the name-length constraint.   Not just the newname, but the original name must be 32 characters or less.   As you speculated, it's just too late in the process.

 

Now as to the SQL Passthru suggestion.   I've never used SQL Passthru, so what follows is just conjecture.  If SQL Passthru can be used to create an SQL view, then I would think it likely that view, now with SAS-acceptable variable names, could be loaded into a hash object.  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PGStats
Opal | Level 21

Beware that some dataset representations may lead you to confuse variable labels (>32 chars) with variable names. Make sure you are using variable names, when required.

PG

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 2524 views
  • 5 likes
  • 6 in conversation