BookmarkSubscribeRSS Feed
vishal_p
Calcite | Level 5

Hi ,

Need your help on below sas mainframe query..

The query is taking a very long time to run(its running since 2 days and has not finished yet)..

CREATE VIEW OUTTAB  AS

SELECT FILEA.ALLF1 ,FILEB.ALLF2

FROM FILEB    ,

     FILEA    WHERE

(FILEA.POL=FILEB.POL)       OR

(FILEA.LNAME=FILEB.LNAME   AND

FILEA.FNAME=FILEB.FNAME   AND

FILEA.DOB=FILEB.DOB   )    ;

FileA has 2233535 RECORDS

FileB has 155960  RECORDS

When there is only one condition IN WHERE CLAUSE the query runs very quickly.

I feel If there is one condition it would be treated as inner join so that might be the reason.

Basically I want to select records from filea and fileb

where FILEA.POL=FILEB.POL

            or

when all the below conditions meet..

( FILEA.LNAME=FILEB.LNAME   AND

FILEA.FNAME=FILEB.FNAME   AND

FILEA.DOB=FILEB.DOB   )

Can u suggest a improvement in query please.

Thanks

Vishal

10 REPLIES 10
LinusH
Tourmaline | Level 20

Well, SAS SQL has some flaws...

What you can do is to split this query into two queries, and then merge the result sets (just append/insert if they are exclusive, or together with some distinct or NODUPKEY operation if aren't).

Generally, it's hard to optimize OR operators. You cold try to store the data in SPDE, since SPDE sometimes can optimize queries that contains OR operators.

Data never sleeps
vishal_p
Calcite | Level 5

hi,

I modified the query as below..

I am running this query on Mainframe system.

PROC SQL;

CREATE VIEW SASLIB.OUTTAB AS

SELECT FILEA.ALLF1,FILEB.ALLF2

FROM SASLIB.FILEB

INNER JOIN SASLIB.FILEA ON ( FILEA.POL=FILEB.POL ) OR

(FILEA.LNAME=FILEB.LNAME AND

FILEA.FNAME=FILEB.FNAME AND

FILEA.DOB=FILEB.DOB);

RUN; QUIT;

I got the below error code..

ERROR: UTILITY FILE WRITE FAILED. PROBABLE DISK FULL CONDITION.

I tried increasing the workspace also tried with allocating a library but nothing seems to work

                                                              

File1 has 2233535 RECORDS
File2 has 155960  RECORDS

Thanks

Vishal

                                  

Can we further improve this query please..

LinusH
Tourmaline | Level 20

The seconds query does not differs from the first one, using where is practical the same as using inner-on.

Refer to my previous post for other techniques.

Data never sleeps
sassharp
Calcite | Level 5

I also recommend to go with seperate quiries then joining the results. In proc sql with OR operation takes more time.

PGStats
Opal | Level 21

Your query is logically equivalent to :

PROC SQL;

CREATE VIEW SASLIB.OUTTAB AS

(SELECT FILEA.ALLF1, FILEB.ALLF2

FROM SASLIB.FILEB INNER JOIN SASLIB.FILEA ON

FILEA.POL=FILEB.POL )

UNION

(SELECT FILEA.ALLF1, FILEB.ALLF2

FROM SASLIB.FILEB INNER JOIN SASLIB.FILEA ON

FILEA.LNAME=FILEB.LNAME AND

FILEA.FNAME=FILEB.FNAME AND

FILEA.DOB=FILEB.DOB);

QUIT;

which might work better since it avoids the dreaded OR operator. The UNION operation includes unique records only once, even if they appear in both sets. Worth a try.

PG

PG
dav_amol
Calcite | Level 5

CASE WHEN is faster than OR so try below given code :

CREATE VIEW OUTTAB  AS

SELECT FILEA.ALLF1 ,FILEB.ALLF2

FROM FILEB    ,

     FILEA    WHERE

( case when FILEA.POL ne FILEB.POL then

  ( FILEA.LNAME=FILEB.LNAME   AND

    FILEA.FNAME=FILEB.FNAME   AND

    FILEA.DOB=FILEB.DOB  

  )

  else  FILEA.POL = FILEB.POL

   end

)  ;

Ksharp
Super User

That is the disadvantage of SQL ,and sure it is the advantage of SAS.

For this large table , your best choice is to use Hash Table. An alternative way is to change the small table into a

format by using proc format ,then to perform query. I have some experience like yours. I used proc format and spended about one hour to get the reslut. But I think Hash Table will be faster.

Ksharp

vishal_p
Calcite | Level 5

Hello,

Thank u all for the help..

The select-case statement seems to be a better option...

I was able to try with only a few records(with 1lakh in both the files)..

But When i run the query for the full files(with all records
File1 has 2233535 RECORDS
File2 has 155960  RECORDS )

Then I get into a problem,
Below is the error I get.
ERROR: UTILITY FILE WRITE FAILED. PROBABLE DISK FULL CONDITION.

I am running this on a mainframe system.
I tried with allocating a seperate sas library, increasing work space, Compressing the dataset etc but nothing has worked.


Just want to know is there a way to allocate more than one library and probably
concatenate all of them together in SAS.


The reason being I can'nt get one big sas library allocated in Mainframe.
Instead i can get chunks of small libraries allocated.


So can we implement the below example..
Example:)
Libname l1 "mainframe path" ;
Libname l2 "mainframe path" ;
Libname l3 "mainframe path" ;

Then can we have the below scenario implemented in sas
l4 = l1 + l2 + l3 ;


Once we have l4 we can save the sas dataset in l4 as below..


create l4.outab as
(my sql query)

Can someone suggest if this is possible.

Thanks
Vishal

Ksharp
Super User

You can concatenate several libraries in a libname statement.

Libname l4 ('c:\temp\' 'd:\ftp\') ;

But I think your problem is not there. Hash Table also can do the same thing as CASE END did and better.

For the large table, the only choice is Hash Table or proc format

Ksharp

Peter_L
Quartz | Level 8

Are all the columns you are using in joins indexed? I am not sure if the SAS query optimiser is clever enough to create temporary indexes for the query. Joining these tables without an index will be very slow, particularly if the rows are very wide (lots of big columns). You can find out how the query is working with the PROC SQL _METHOD option. See http://www2.sas.com/proceedings/sugi30/101-30.pdf for more information about this option.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1473 views
  • 0 likes
  • 7 in conversation