12-08-2013 05:47 AM
We need to calculate percentiles across different dimensions of a large data set (about 8 billion records). Since PROC UNIVARIATE is not supported for SAS in-database processing, we thought of using PROC FREQ to calculate cumulative frequencies and derive the percentiles from those.
I am able to get the distribution for a specific subset, but not for all the data in one shot: that run fails with an I/O error.
Can anyone suggest another way to calculate the 99.9th percentile, or a way to use PROC FREQ more efficiently?
We are not transferring the data, and it is huge.
Thanks in advance.
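For reference, the cumulative-frequency approach described above could look roughly like the sketch below. The libref `tdlib`, the table `big_table`, and the column `amount` are hypothetical placeholders, and the sketch assumes the variable has few enough distinct values that the frequency table itself stays small.

```sas
/* Hypothetical names: tdlib (Teradata libref), big_table, amount. */

/* Build a frequency table; OUTCUM adds CUM_FREQ and CUM_PCT
   to the OUT= data set. */
proc freq data=tdlib.big_table noprint;
   tables amount / out=work.amount_freq outcum;
run;

/* Keep the first value whose cumulative percent reaches 99.9.
   PROC FREQ writes the rows in ascending order of amount. */
data work.p999;
   set work.amount_freq;
   if cum_pct >= 99.9 then do;
      output;
      stop;
   end;
run;
```

This picks the smallest value whose cumulative percentage reaches 99.9, which is one common percentile definition and may differ slightly from what PROC UNIVARIATE would report by default.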
12-08-2013 04:01 PM
Is your data in native SAS datasets or on a server?
If it's on a server, you may want to consider using native SQL commands for that server instead.
8 billion is a lot of records regardless, and it will take a while, since a percentile function requires the data to be sorted.
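As an illustration of the native SQL suggestion, here is a minimal sketch using explicit SQL pass-through. The connection details (`tdserv`, `myuser`), the table `mydb.big_table`, and the column `amount` are all assumptions, not taken from the thread. The cast to BIGINT is a precaution in case the default COUNT result type is too narrow for billions of rows.

```sas
/* Hypothetical connection details, table, and column names. */
proc sql;
   connect to teradata (server=tdserv user=myuser password=XXXXXX);

   /* The query in parentheses runs entirely in Teradata; only the
      frequency table with running totals comes back to SAS. */
   create table work.amount_cum as
   select * from connection to teradata
   (
      select amount,
             cnt,
             sum(cnt) over (order by amount
                            rows unbounded preceding) as cum_cnt
      from (
         select amount,
                cast(count(*) as bigint) as cnt
         from mydb.big_table
         group by amount
      ) as f
   );

   disconnect from teradata;
quit;
```

The 99.9th percentile would then be the first `amount` whose `cum_cnt` reaches 0.999 times the largest `cum_cnt`. Whether this is faster than the generated SQL depends on how many distinct values the column has.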
12-08-2013 04:05 PM
The data is in a Teradata table, and I am not transferring any of it. We tried doing this directly in Teradata, but it takes a very long time. Thanks.
12-08-2013 06:11 PM
Have you confirmed what Teradata is actually doing by setting these options?
OPTIONS sastrace=',,,d' sastraceloc=saslog nostsuffix;
Also check out this:
10-13-2014 05:39 AM
Can you help me with a similar question? I am not able to get in-database processing to work through SQL generation. My code only sends a select * to the RDBMS and pulls the data into SAS to run the procedure there, rather than running it in the DBMS itself.
Can you give some direction on the requirements, and on the correct LIBNAME statement or code snippet to do this?
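A minimal sketch of the kind of setup being asked about is below. It assumes a Teradata server and uses hypothetical names throughout (`tdserv`, `myuser`, `mydb`, `big_table`, `region`). It is a starting point to check against the SAS/ACCESS documentation for your release, not a confirmed fix.

```sas
/* Hypothetical server, credentials, and database names. */
libname tdlib teradata server=tdserv user=myuser password=XXXXXX
        database=mydb;

/* Ask SAS to generate SQL for supported procedures and run it
   in the DBMS rather than in SAS. */
options sqlgeneration=dbms;

/* Write the SQL that SAS sends to the database to the SAS log. */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

proc freq data=tdlib.big_table;
   tables region;
run;
```

If the log still shows only a plain select of every row, it may be that the procedure or one of the options in use is not eligible for in-database processing. Only some Base SAS procedures, such as PROC FREQ, PROC MEANS, and PROC SUMMARY, support it.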