BookmarkSubscribeRSS Feed
Sanjay_M
Obsidian | Level 7

I am trying to create a hive table from SAS using the below code. 

Have you guys come across any data compatibility / data type conversion issues?

 

Till now I have faced the below issues, can someone please help.

1. There is a NOTE saying SAS variable labels, formats, and lengths are not written to DBMS tables. This means the labels do not get stored.

2. Numeric column values that are long, get saved as exponential values in Hive.

 

 

LIBNAME orchive HADOOP
uri='jdbc:hive2://xxxxxxx.systems.uk.xxxx:1111,xxxxxx.systems.uk.xxxx:1111,xxxxxxxx.systems.uk.xxxx:1111/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_HOST@HRES.ADROOT.XXXX?hive.execution.engine=tez;hive.tez.container.size=16384;tez.queue.name=risk'
PORT=10001
SERVER="xxxxxxxxx.systems.uk.xxxx,xxxxxxxxx.systems.uk.xxxx,xxxxxxxx.systems.uk.xxxx"
schema=prodschema
dbmax_text=100
DBCREATE_TABLE_OPTS='STORED AS ORC'
;

 

data orchive.test;

       set test;

run;

 

12 REPLIES 12
ChrisNZ
Tourmaline | Level 20

> There is a NOTE saying SAS variable labels, formats, and lengths are not written to DBMS tables. This means the labels do not get stored.

That's right. There's no metadata in Hive. You can add comments though.

 

> Numeric column values that are long, get saved as exponential values in Hive.

That's weird. What's the data type? or do you have a short SAS format maybe?

 

Also, for speed, use proc append rather than a data step. 

Sanjay_M
Obsidian | Level 7

SAS - The account numbers in SAS are 15 digit long. As it is numeric, the length of course is 8. Format is 20. Informat is 20.

 

Hive - (See image ). the datatype is double. But when queried through hive prompt using select statement, it shows as exponential value. See image.

ChrisNZ
Tourmaline | Level 20

I do not open MS office documents from the web. Use the image icon.

If the data type is double, I suspect the issue is simply that SAS uses the default  best.  format to display the value, and that format cannot show 15 digits.

ChrisNZ
Tourmaline | Level 20

Oh the hive prompt shows the exponential notation. Mmm could it be setting on your hive viewer?

Sanjay_M
Obsidian | Level 7

Here is the Hive prompt showing the value as exponential.

Please advise if this can be fixed.


pic.jpg

ChrisNZ
Tourmaline | Level 20

How does it display the number when queried form sas with a 20. format?

This does not look like a SAS question. Your viewer may be the culprit, see here for example https://stackoverflow.com/questions/32576187/hive-converting-from-double-to-string-not-in-scientific

Sanjay_M
Obsidian | Level 7

Thanks Chris. I tried casting it as big int and it displays correctly. So I think the better way to do it would be to create an empty hive table with the structure you need, and then use PROC APPEND.

Tom
Super User Tom
Super User

Why would you store account number as anything other than a string?  What is meaning of a MEAN account number?

If the values are already up to 15 digits long they will pretty soon exceed the limit of precision for storing as a floating point number.

2333  data _null_;
2334    x=constant('exactint');
2335    put x= comma32.;
2336  run;

x=9,007,199,254,740,992

That means that beyond 9,007,199,254,740,992 you can no longer be certain the integer can be stored precisely.  Some numbers that are powers of 2 can be represented exactly but not the integers between them.

LinusH
Tourmaline | Level 20

2. How did you verify that it's stored as exponential? Try to look at the values from a Hive query, not SAS to rula out any data type conversion.

 

Your coulumn should be stored as DOUBLE given the default data mapping rules.

If you have a display in conversion to sas, you can play with SAS properties on the Hive table:

 

ALTER TABLE passthrough_ex SET TBLPROPERTIES ('SASFMT:bgint'='CHAR(20)')

If you don't thik DOUBLE itself works, you can use the DBSASTYPE option to use another data type in Hive.  

Data never sleeps
Sanjay_M
Obsidian | Level 7

Thanks Linush

ChrisNZ
Tourmaline | Level 20

> Numeric column values that are long, get saved as exponential values in Hive.

 

That's  just the way the are displayed. Use a long format to see all the digits.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 2234 views
  • 1 like
  • 4 in conversation