Help using Base SAS procedures

Need help on Proc sql query

Reply
Occasional Contributor
Posts: 5

Need help on Proc sql query

Hi ,

Need your help on below sas mainframe query..

The query is taking a very long time to run(its running since 2 days and has not finished yet)..

CREATE VIEW OUTTAB  AS

SELECT FILEA.ALLF1 ,FILEB.ALLF2

FROM FILEB    ,

     FILEA    WHERE

(FILEA.POL=FILEB.POL)       OR

(FILEA.LNAME=FILEB.LNAME   AND

FILEA.FNAME=FILEB.FNAME   AND

FILEA.DOB=FILEB.DOB   )    ;

FileA has 2233535 RECORDS

FileB has 155960  RECORDS

When there is only one condition IN WHERE CLAUSE the query runs very quickly.

I feel If there is one condition it would be treated as inner join so that might be the reason.

Basically I want to select records from filea and fileb

where FILEA.POL=FILEB.POL

            or

when all the below conditions meet..

( FILEA.LNAME=FILEB.LNAME   AND

FILEA.FNAME=FILEB.FNAME   AND

FILEA.DOB=FILEB.DOB   )

Can u suggest a improvement in query please.

Thanks

Vishal

Super User
Posts: 5,432

Need help on Proc sql query

Well, SAS SQL has some flaws...

What you can do is to split this query into two queries, and then merge the result sets (just append/insert if they are exclusive, or together with some distinct or NODUPKEY operation if aren't).

Generally, it's hard to optimize OR operators. You cold try to store the data in SPDE, since SPDE sometimes can optimize queries that contains OR operators.

Data never sleeps
Occasional Contributor
Posts: 5

Need help on Proc sql query

hi,

I modified the query as below..

I am running this query on Mainframe system.

PROC SQL;

CREATE VIEW SASLIB.OUTTAB AS

SELECT FILEA.ALLF1,FILEB.ALLF2

FROM SASLIB.FILEB

INNER JOIN SASLIB.FILEA ON ( FILEA.POL=FILEB.POL ) OR

(FILEA.LNAME=FILEB.LNAME AND

FILEA.FNAME=FILEB.FNAME AND

FILEA.DOB=FILEB.DOB);

RUN; QUIT;

I got the below error code..

ERROR: UTILITY FILE WRITE FAILED. PROBABLE DISK FULL CONDITION.

I tried increasing the workspace also tried with allocating a library but nothing seems to work

                                                              

File1 has 2233535 RECORDS
File2 has 155960  RECORDS

Thanks

Vishal

                                  

Can we further improve this query please..

Super User
Posts: 5,432

Need help on Proc sql query

The seconds query does not differs from the first one, using where is practical the same as using inner-on.

Refer to my previous post for other techniques.

Data never sleeps
Frequent Contributor
Posts: 110

Need help on Proc sql query

I also recommend to go with seperate quiries then joining the results. In proc sql with OR operation takes more time.

Respected Advisor
Posts: 4,927

Need help on Proc sql query

Your query is logically equivalent to :

PROC SQL;

CREATE VIEW SASLIB.OUTTAB AS

(SELECT FILEA.ALLF1, FILEB.ALLF2

FROM SASLIB.FILEB INNER JOIN SASLIB.FILEA ON

FILEA.POL=FILEB.POL )

UNION

(SELECT FILEA.ALLF1, FILEB.ALLF2

FROM SASLIB.FILEB INNER JOIN SASLIB.FILEA ON

FILEA.LNAME=FILEB.LNAME AND

FILEA.FNAME=FILEB.FNAME AND

FILEA.DOB=FILEB.DOB);

QUIT;

which might work better since it avoids the dreaded OR operator. The UNION operation includes unique records only once, even if they appear in both sets. Worth a try.

PG

PG
Occasional Contributor
Posts: 18

Re: Need help on Proc sql query

CASE WHEN is faster than OR so try below given code :

CREATE VIEW OUTTAB  AS

SELECT FILEA.ALLF1 ,FILEB.ALLF2

FROM FILEB    ,

     FILEA    WHERE

( case when FILEA.POL ne FILEB.POL then

  ( FILEA.LNAME=FILEB.LNAME   AND

    FILEA.FNAME=FILEB.FNAME   AND

    FILEA.DOB=FILEB.DOB  

  )

  else  FILEA.POL = FILEB.POL

   end

)  ;

Super User
Posts: 10,041

Re: Need help on Proc sql query

That is the disadvantage of SQL ,and sure it is the advantage of SAS.

For this large table , your best choice is to use Hash Table. An alternative way is to change the small table into a

format by using proc format ,then to perform query. I have some experience like yours. I used proc format and spended about one hour to get the reslut. But I think Hash Table will be faster.

Ksharp

Occasional Contributor
Posts: 5

Need help on Proc sql query

Hello,

Thank u all for the help..

The select-case statement seems to be a better option...

I was able to try with only a few records(with 1lakh in both the files)..

But When i run the query for the full files(with all records
File1 has 2233535 RECORDS
File2 has 155960  RECORDS )

Then I get into a problem,
Below is the error I get.
ERROR: UTILITY FILE WRITE FAILED. PROBABLE DISK FULL CONDITION.

I am running this on a mainframe system.
I tried with allocating a seperate sas library, increasing work space, Compressing the dataset etc but nothing has worked.


Just want to know is there a way to allocate more than one library and probably
concatenate all of them together in SAS.


The reason being I can'nt get one big sas library allocated in Mainframe.
Instead i can get chunks of small libraries allocated.


So can we implement the below example..
ExampleSmiley Happy
Libname l1 "mainframe path" ;
Libname l2 "mainframe path" ;
Libname l3 "mainframe path" ;

Then can we have the below scenario implemented in sas
l4 = l1 + l2 + l3 ;


Once we have l4 we can save the sas dataset in l4 as below..


create l4.outab as
(my sql query)

Can someone suggest if this is possible.

Thanks
Vishal

Super User
Posts: 10,041

Need help on Proc sql query

You can concatenate several libraries in a libname statement.

Libname l4 ('c:\temp\' 'd:\ftp\') ;

But I think your problem is not there. Hash Table also can do the same thing as CASE END did and better.

For the large table, the only choice is Hash Table or proc format

Ksharp

Contributor
Posts: 32

Re: Need help on Proc sql query

Are all the columns you are using in joins indexed? I am not sure if the SAS query optimiser is clever enough to create temporary indexes for the query. Joining these tables without an index will be very slow, particularly if the rows are very wide (lots of big columns). You can find out how the query is working with the PROC SQL _METHOD option. See http://www2.sas.com/proceedings/sugi30/101-30.pdf for more information about this option.

Ask a Question
Discussion stats
  • 10 replies
  • 517 views
  • 0 likes
  • 7 in conversation