BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
David_Billa
Rhodochrosite | Level 12

Any idea to combine these two steps into one? 

 

proc sql;
  create table PT_QUAN as
  select distinct PLAN_TO , QUANNEL
  from SHIPMENT_PULL;
quit;

proc sql;
  create table MISSING_QUANS as
    select t1.*
    from PT_QUAN t1
      right join (
        select distinct PLAN_TO
        from PT_QUAN
        where MISSING(QUANNEL)
        ) t2 on (t1.PLAN_TO = t2.PLAN_TO)
    where not missing(t1.PLAN_TO)
    order by t1.PLAN_TO , t1.QUANNEL desc;
quit;
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

You do a RIGHT JOIN, but then you exclude all those where the "left" table does not supply any observations, making this in effect an inner join. But it's an inner join anyway, as you join pt_quan with itself

What you do is a lookup, finding all those groups where there is at least one missing QUANNEL in the dataset.

Try this:

proc sort
  data=shipment_pull (keep=plan_to quannel)
  out=pt_quan
  nodupkey
;
by plan_to quannel;
run;

data missing_quans;
do until (last.plan_to);
  set pt_quan;
  by plan_to;
  if missing (quannel) then flag = 1;
end;
do until (last.plan_to);
  set pt_quan;
  by plan_to;
  if flag then output;
end;
drop flag;
run;

(untested, for lack of usable example data)

This will outperform the SQL with sub-selects by orders of magnitude for larger datasets.

View solution in original post

3 REPLIES 3
SASKiwi
PROC Star
proc sql;
  create table MISSING_QUANS as
    select t1.PLAN_TO
          ,t1.QUANNEL
          ,count(*) as PLAN_Count
    from SHIPMENT_PULL t1
      right join (
        select PLAN_TO
              ,count(*) as Plan_Count
        from SHIPMENT_PULL
        where MISSING(QUANNEL)
        group by PLAN_TO
        ) t2 on (t1.PLAN_TO = t2.PLAN_TO)
    where not missing(t1.PLAN_TO)
   group by t1.PLAN_TO
           ,t1.QUANNEL
    order by t1.PLAN_TO , t1.QUANNEL desc;
quit;
Kurt_Bremser
Super User

You do a RIGHT JOIN, but then you exclude all those where the "left" table does not supply any observations, making this in effect an inner join. But it's an inner join anyway, as you join pt_quan with itself

What you do is a lookup, finding all those groups where there is at least one missing QUANNEL in the dataset.

Try this:

proc sort
  data=shipment_pull (keep=plan_to quannel)
  out=pt_quan
  nodupkey
;
by plan_to quannel;
run;

data missing_quans;
do until (last.plan_to);
  set pt_quan;
  by plan_to;
  if missing (quannel) then flag = 1;
end;
do until (last.plan_to);
  set pt_quan;
  by plan_to;
  if flag then output;
end;
drop flag;
run;

(untested, for lack of usable example data)

This will outperform the SQL with sub-selects by orders of magnitude for larger datasets.

Kurt_Bremser
Super User

Rule of thumb: in SAS, multiple DATA /SORT steps usually outperform complex SQL queries, and are easier to maintain. Personally, I only use SQL when I need to do a many-to-many join with a cartesian product as result.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 910 views
  • 2 likes
  • 3 in conversation