proc sql question

Reply
Contributor
Posts: 29

proc sql question


Dear,

I have some data like below with 4 columns and values as below

col1 col2 col3 col4
A      1     X     Y
B     2      P     Q
A     2     A     B
A     1     A     B

By SQL in SAS,
I need duplicates based only on first two columns to be removed. That is, an I get the result as  2,3 rows and any of 1 and 4 rows ??

Expected result:

B     2      P     Q

A     2     A     B

A     1     A     B

(or)

A      1     X     Y

B     2      P     Q

A     2     A     B

I know its pretty simple in SAS, but need help in SQL

PS: all SQL commands do not work in proc SQL

Thanks

Mark

Super User
Posts: 6,927

Re: proc sql question

What keeps you from using proc sort with nodupkey?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Respected Advisor
Posts: 3,886

Re: proc sql question

If your data resides in a data base then use pass-through SQL and you will have access to all native functionality of this data base (eg. analytical functions like rownum() and then select the first record in the window). If it's a SAS table then why try and use SAS SQL if this can be done so easily using PROC SORT.

Contributor
Posts: 29

Re: proc sql question

It is not SAS dataset.  I just gave it as example for your reference. It is a database table containing around 20 million records. So we chose native functionality to get the result, but unfortunately stuck at this moment and are looking for any chance to complete this step as well using sql itself

Respected Advisor
Posts: 3,886

Re: proc sql question

Well, if it's a database then you should try and reduce the volumes directly on the database before transferring the data to SAS. What's the database?

Super User
Posts: 6,927

Re: proc sql question

Then you best do that step in the database itself. Get help from your DBA and use his code in SQL passthrough.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 9,662

Re: proc sql question

SQL in SAS is standard SQL . therefore it is hard for SAS version SQL. But you could try this one :

data have;
input col1 $ col2 col3 $ col4 $;
cards;
A      1     X     Y
B     2      P     Q
A     2     A     B
A     1     A     B
;
run;
proc sql;
 create table want as
  select * 
   from have
    group by col1,col2
      having cats(col1,col2,col3,col4) =max(cats(col1,col2,col3,col4));
     quit;

XIa Keshan

Trusted Advisor
Posts: 1,204

Re: proc sql question

proc sql;

select col1,col2,max(col3) as col3,max(col4) as col4 from have

group by col1,col2;

quit;

Ask a Question
Discussion stats
  • 7 replies
  • 925 views
  • 6 likes
  • 5 in conversation