DATA Step, Macro, Functions and more

Proc SQL matching only missing values

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 13
Accepted Solution

Proc SQL matching only missing values

Hello Everyone,

 

I am creating a database in which I include observations include segments that had no road work done and on also include any segment with road work done on it prior to the first time any work was done. However, when I try to match these conditions I get the following log response:

NOTE: The "<>" operator is interpreted as "not equals".
NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized.
WARNING: Variable year already exists on file DB.REF1.
WARNING: Variable ident already exists on file DB.REF1.
ERROR: Insufficient space in file DB.REF1.DATA.
ERROR: File DB.REF1.DATA is damaged. I/O processing did not complete.

The code is as follows:

proc sql;
create table db.ref1 as
select *
from db.dir5lane2 road, db.refactmin act
where (road.ident = act.ident and road.year < act.year) or act.ident <> road.ident;
quit;

As always there is a sample dataset attached. Any suggestion is appreciated.


Accepted Solutions
Solution
‎03-13-2018 12:30 PM
PROC Star
Posts: 8,149

Re: Proc SQL matching only missing values

Not sure how to do that with PROC SQL. Here is how I would do it using a data step:

proc sort data=db.dir5lane2;
  by ident;
run;

proc sort data=db.refactmin;
  by ident;
run;

data db.ref1;
  merge db.dir5lane2 (in=ina) 
        db.refactmin (rename=(year=byear) in=inb);
  by ident;
  if ina and inb then do;
    if year < byear;
  end;
run;

Art, CEO, AnalystFinder.com

 

 

 

View solution in original post


All Replies
Super User
Posts: 6,631

Re: Proc SQL matching only missing values

You are trying to create a huge file, and running out of disk space to store it.  (Other, more minor issues exist, but first things first.)

 

Let's say you look at the number of records in your incoming tables, and find that there are 10,000 records in each table.  The approximate number of records you are asking for is 100,000,000.  That's because most of the records will be selected by the last criterion using <>.  That match joins every record in one table with virtually every record in the other table.

 

Think about which records you actually want to extract, and how they should be matched.  Then we can revisit how to ask for that.

Occasional Contributor
Posts: 13

Re: Proc SQL matching only missing values

Posted in reply to Astounding

Thank you for your once again for your quick reply Astounding!

 

What I would like to select from the dir5lane2 dataset is all ident's that either not appears in the refactmin dataset or the ident does appear but the year is less than the refactmin specifies for that matching ident section. 

 

(dir5lane2.ident = refactmin.ident and dir5lane2.year < refactmin.year) or (dir5lane2.ident <> refactmin.ident)

 

The dir5lane2 is only 16k observations, so there must be something with my understanding of proc sql. The code does run fine when I do not include the former restriction but only the latter. 

PROC Star
Posts: 8,149

Re: Proc SQL matching only missing values

Given the sample workbook you provided, what result do you want?

 

Art, CEO, AnalystFinder.com

 

Occasional Contributor
Posts: 13

Re: Proc SQL matching only missing values

See attached updated file for the wanted result.

Solution
‎03-13-2018 12:30 PM
PROC Star
Posts: 8,149

Re: Proc SQL matching only missing values

Not sure how to do that with PROC SQL. Here is how I would do it using a data step:

proc sort data=db.dir5lane2;
  by ident;
run;

proc sort data=db.refactmin;
  by ident;
run;

data db.ref1;
  merge db.dir5lane2 (in=ina) 
        db.refactmin (rename=(year=byear) in=inb);
  by ident;
  if ina and inb then do;
    if year < byear;
  end;
run;

Art, CEO, AnalystFinder.com

 

 

 

Occasional Contributor
Posts: 13

Re: Proc SQL matching only missing values

Thank you Arthur, works like a charm. Have only used sas for 2 months, thank you for presenting this code as a sample.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 131 views
  • 0 likes
  • 3 in conversation