BookmarkSubscribeRSS Feed
Himanshu_Gupta
Calcite | Level 5

Hi,

I had created a code that basically uploads the data from Netezza to SAS and then aggregrates it using the relevant variables using Proc Tabulate. However, this was utilising a lot of resources and time as the dataset size is mostly 50-100 million rows spanning 150 columns.

To, avoid the resource utilization i have been trying to use in-database processing for Netezza, by providing the SQLGENERATION = DBMS option in my libname statement and the using the libname to directly input the Netezza dataset into the proc tabulate procedure.

My understanding of in-database processing is that it should fire a sql query into netezza(when we use the SQLGENERATION = DBMS option in the libname) that performs the tabulation of data accordingly in Netezza and sends the summarized data to SAS. However, this hasn't been working lyk we envisaged, and has been basically firing a select * statement to Netezza and pulling the data into SAS and then performing the aggregation in SAS rather than Netezza.

I have not been able to figure out why this must have been happening and some guidance on the same would be great.

Thanks!

6 REPLIES 6
Tom
Super User Tom
Super User

Try using PROC SUMMARY to summarize the data. That should be able to push more operations into the data.  You could possibly then build a report off of the summarized data using TABULATE if you want.

Quentin
Super User

Or try explicit-pass through query.   It's nice, because it's explicit.  That is, you get to write SQL code in whatever database-specific flavor of SQL, and then say, "please run this SQL step written in your own language on this database and then return the results to me (in SAS)."

BASUG is hosting free webinars Next up: Jane Eslinger presenting PROC REPORT and the ODS EXCEL destination on Mar 27 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
jakarman
Barite | Level 11

With in database processing you need to understand the limitation of both sides.

One rule always exists: When it is not possible to do "in-database" processing all data will be copied to SAS and processes there.

An other set of options could be:

SAStrace=',,,ds' SAStraceloc=SASlog

      sql_ip_trace=source sqlgeneration=dbmust sqlmapputto=SAS_put dbidirectexec ;

The dbmust forcing en error when the sql is not going to the external dbms.

The aggregate functions of a rdbms are not always logical the same as the summary functions of SAS.

This is seen at grouping levels. To eliminate that as cause use a single analyses variable. Check the grouping effects (multi-level) on reading the mauals.

Sometimes SAS has programmed something with advanced techniques using cursors.

Using that with a Analytics database like netezaa / terradata can be a performance disaster. This a failure of SAS.

For that one you have to prove the issue happening and hoping TS will react adequate.

That will leave you to not using SAS but coding tot Netezza native sql as explicit pass throuqh.
It is really not using SAS as SAS has sold the usage for this 

---->-- ja karman --<-----
Kurt_Bremser
Super User

I would recommend to use SQL passthrough to get the data ordered by the relevant variables in Netezza and then use proc summary ... by ...  in SAS.

jakarman
Barite | Level 11

Just see the aggregate functions, these are often unique to each dbms and by that causing troubles getting well translated by SAS.

IBM Knowledge Center   (windows aggregate functions netezza)

---->-- ja karman --<-----
Dreamer
Obsidian | Level 7

I also had the same issue.

I used readbuff=1000 direct_sql=yes libname options and it worked for me.

You can use SAStrace=',,,ds' SAStraceloc=SASlog to trace where your SQL is getting processed.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2918 views
  • 0 likes
  • 6 in conversation