Hi mates,
Is there any way to write Parquet or Avro files from SAS 9.4M8 using the classic bulk load? The final location for these files should be S3 storage in CDP Public Cloud.
Thanks for any help!
You need to provide more details.
Are you using 9.4 or Viya? Do the Parquet files already exist?
SAS 9.4 cannot create Parquet files on its own.
Are you using 9.4 or Viya? We are using 9.4 only (no Viya).
Do the Parquet files already exist? No. Normally we use the classic bulk-load PROC SQL functionality with some of its options, for example:
proc sql;
create table schema.table (
    bulkload=YES
    bl_host="namehost@something"
    bl_port=XXXX
    bl_datafile="/tmp_path/file.dat"
    bl_delete_datafile=YES
    hdfs_principal="hdfs/_HOST@DOMAIN.COM"
    dbcreate_table_opts="STORED AS PARQUET"
) as select * from saslibname.datasetsas;
quit;
Would it be possible to point the HDFS_PRINCIPAL option at an S3 bucket?
Well, if you are using 9.4, then you need another application to create the Parquet files for you.
You can use Hive, as you seem to be doing now, or call a Python package such as pandas from SAS, for example.
Then you transfer the files to S3.
There might be other methods (such as AWS Glue), but I'm not familiar with them.
Note that the BULKLOAD option loads multiple rows of data as one unit, in order to insert or append them to a DBMS table.
An S3 bucket is not a DBMS. You are not shifting rows, you are shifting files, so this option has nothing to do with S3.
Caveat: I have never used S3, so my understanding of its capabilities might be limited.