BookmarkSubscribeRSS Feed
Tomasz_Barnas
Fluorite | Level 6

Hey I have a little problem with the etl connection in SAS VIYA4.
The postgresql database provided to us periodically by an external company is badly optimized and, for example, the name column is programmed for 200 characters which is completely unnecessary and even problematic considering the huge number of tables, their size, and the amount of badly optimized columns.
We would like to trim some columns for example from 200 characters to 50 characters.
Ideally, it would be possible to do this graphically while updating the tables daily.
Does anyone perhaps know the optimal solution for trimming the size of columns v VIYA4?

11 REPLIES 11
Berux
Fluorite | Level 6

I have the same problem

SASKiwi
PROC Star

It's not clear to me what you want to do here. If it is to trim the data columns as it is read into SAS to conserve storage space then I suggest you try this as an alternative:

options compress = yes;

data MyLib.MySASTable;
  set MyPostGr.MyPGTable;
run;

This will ensure that blank space is removed from your columns when stored in SAS libraries without bothering to reduce coumn length.

Tomasz_Barnas
Fluorite | Level 6

Thanks for the answer,

unfortunately that's not quite what I had in mind.

It's a large project with more than 100 tables where new data will come in every day.

I would like to predetermine the length of the columns in the target tables.

Kurt_Bremser
Super User

Create template datasets for all database tables, with all variables defined, but no observations. Then, after import, run this:

data table;
set
  template_table
  imported_table
;
run;

You'll get lots of WARNINGs about possible truncation.

SASKiwi
PROC Star

Why do you want to predetermine the lengths of columns? If it is for efficient storage then the COMPRESS option can deal with hundreds of tables without doing any length trimming. If it is for some other reason then please tell us.

ballardw
Super User

Any time there is a data transfer between organizations there should be documentation as to the file contents.

Ideally that would include all properties such as length, type, variable names and quite possibly value limits.

I would suggest looking to see if your organization has such a document with the provider of these files and verify that the provided files are in agreement.

If not, work on establishing such.

 

As a contractor using data supplied by my client we had recurring charges to the client for programming. The client asked why the kept getting these charges. When we explained and provided examples of the files where the column headings changed, the column order changed and the content of some of the columns changed structure for some moderately critical fields they understood. And shortly thereafter the files no longer had those changes happening.

 

In the above case I had the ability to impact things because of the agreement for programming charges as needed and they decided that the programming charges on our side were large enough to actually implement control on their side to prevent the charges.

 

Another option might be how the files are provided. If you are getting these as native Postgres files perhaps an agreement to have them sent as CSV or some other text format where YOU write code to read the values as desired.

sbxkoenk
SAS Super FREQ

Not sure this is what you need ... but you can consider it.

 

Sample 24804: %SQUEEZE-ing Before Compressing Data, Redux
https://support.sas.com/kb/24/804.html

Go to tab "download" to download latest version of the macro.


(Do NOT go here --> https://support.sas.com/kb/24/805.html. --> It's the original / old macro code)

Take care :

  • SQUEEZE does not know whether on a future date a bigger length might be needed for a particular variable.
  • SQUEEZE cannot handle name literals (unconventional variable names like 'p$ 7/'n).
options validvarname=any;
data have;
 'p$ 7/'n=8;
 put 'p$ 7/'n=;
run;
/* squeeze macro cannot deal with dataset have */

Koen

Patrick
Opal | Level 21

Assuming you can't change the actual Postgres tables the easiest way from a usage perspective would be to create views in Postgres with the desired column attributes. You then access the data via these views.

The advantage of views in Postgres is that you then don't have to care anymore about the variable attributes however you transfer the data to SAS (SPRE or directly into CAS via data connector).

 

You could of course also define a SAS view (SPRE only), or create an empty table structure and then just append the data to it (using the force option), or use explicit pass-through SQL where you cast your variables before transfer to SAS - but all of these methods require more coding or aren't as much a "one fits all" approach as views in the Postgres source. 

Tomasz_Barnas
Fluorite | Level 6

Hey, thank you for your answers, but I have the impression that you have misunderstood me a little.
I have full documentation of the database, and I do not want to perform compression.
I only care about resizing the length of columns in selected tables, and columns.
Importantly, I know how to do this in SAS 94, but I have the impression that SAS VIYA4 is terribly truncated of functions or they are hidden somewhere.
Therefore, the question is whether this type of operation can be done in SAS VIYA4?
Greetings 🙂

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1984 views
  • 3 likes
  • 7 in conversation