DATA Step, Macro, Functions and more

How do I find top two records of dataset using proc SQl

Accepted Solution Solved
Reply
Contributor
Posts: 37
Accepted Solution

How do I find top two records of dataset using proc SQl

Hi All,

I was trying to find top two 2 records of my dataset using Proc SQl,

so here is my data, I have Customer and spend variable, and I need to find what are top two

spend each customer made?

 

Data I have this:

cc      spend
1       100
1       200
1       550
1      100
1      200
1      550
1     100
2     200
2     550
2    200
2    200
2    550
2    200
2    900
3    750
3    550
3    1300
3    1900
3    750

 

I tried this but this is not giving me the correct results:

 

proc sql;
Select CC,(select max(spend)from work.have
where spend NOT IN(select max(spend) from work.have)) as Second_Max_spend from work.have

group by CC;
quit;

 

Expcted Results

CC Spend

1     550

1     200

2     900

2     550

3     1900

3     1300

 

Thanks!


Accepted Solutions
Solution
‎04-03-2016 07:57 AM
Respected Advisor
Posts: 4,651

Re: How do I find top two records of dataset using proc SQl

But if you insist on SQL...

 


proc sql;
select unique cc, spend
from have as a
where (select count(distinct spend) from have where cc=a.cc and spend > a.spend) < 2
order by cc, spend desc;
quit; 
PG

View solution in original post


All Replies
Super Contributor
Posts: 408

Re: How do I find top two records of dataset using proc SQl

Proc SQL has the OUTOBS option for this:

 

proc sql outobs=2;
select cc, max(spend) as max_spend
from work.have
group by cc
order by calculated max_spend desc
;
quit;

Hope this helps,

- Jan.

Contributor
Posts: 37

Re: How do I find top two records of dataset using proc SQl

Hey,

It just did not help, expected result is something different , I need to figure out what are top

two spend each customer made....(OUTBS is not the helping, I just dont' want only two Obs).

 

Thanks for your reply

 

Thanks!

Dharmndra

Super User
Posts: 17,865

Re: How do I find top two records of dataset using proc SQl

[ Edited ]

It's easier if you use a data step here, rather than SQL. If you were doing it in SQL you would need an OVER clause that's not supported in SAS.

Use the BY group feature to identify each group, create a counter for the observations and then ouptut the records that are the top 2. 

 

proc sort data=have;
by id descending spend;
run;

data want;
set have;
by id;
retain count;

if first.id then count=0;
count+1;

if count <=2 then output;
run;

 

Trusted Advisor
Posts: 1,115

Re: How do I find top two records of dataset using proc SQl

Hi @LittlesasMaster,

 

Here's a pure PROC SQL approach:

proc sql;
create view top_one as
select cc, max(spend) as spend
from have
group by cc;

select * from top_one
union all
select cc, max(spend) as spend
from (select * from have except all select * from top_one)
group by cc
order by cc, spend desc;
quit;

Please note that customer 1 has two records with SPEND=550 and the above code selects these two. If you want to ignore duplicates in order to obtain what you posted as your "expected results" (i.e. SPEND=200 as the second highest amount for CC=1), simply delete the all keyword in "except all".

Contributor
Posts: 37

Re: How do I find top two records of dataset using proc SQl

Thank you so much this helped me.Man Happy

Respected Advisor
Posts: 4,651

Re: How do I find top two records of dataset using proc SQl

Proc rank can prove useful here

 

proc rank data=have descending ties=dense out=want(where=(rank<=2));
by cc;
var spend;
ranks rank;
run;

proc sort data=want(drop=rank) nodupkey; by cc descending spend; run;

proc print; run;
PG
Respected Advisor
Posts: 4,651

Re: How do I find top two records of dataset using proc SQl

Otherwise, use normal sort and BY processing or nodupkey sort and a simpler data step to get unique values

 

proc sort data=have out=temp; by cc descending spend; run;

data want;
set temp; by cc descending spend;
if first.cc then count = 0;
if count < 2 then
    if first.spend then do;
        output;
        count + 1;
        end;
drop count;
run;

proc print; run;

/* Or... */
proc sort data=have out=temp2 nodupkey; by cc descending spend; run;

data want2;
set temp2; by cc;
if first.cc then count = 0;
if count < 2 then do;
    output;
    count + 1;
    end;
drop count;
run;

proc print; run;
PG
Solution
‎04-03-2016 07:57 AM
Respected Advisor
Posts: 4,651

Re: How do I find top two records of dataset using proc SQl

But if you insist on SQL...

 


proc sql;
select unique cc, spend
from have as a
where (select count(distinct spend) from have where cc=a.cc and spend > a.spend) < 2
order by cc, spend desc;
quit; 
PG
Contributor
Posts: 37

Re: How do I find top two records of dataset using proc SQl

Thank you so much this helped me to understand the concept .Man Happy

Contributor
Posts: 37

Re: How do I find top two records of dataset using proc SQl

Thank you, this Rank was new for me Man Happy

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 475 views
  • 6 likes
  • 5 in conversation