Hi All,
I was trying to find top two 2 records of my dataset using Proc SQl,
so here is my data, I have Customer and spend variable, and I need to find what are top two
spend each customer made?
Data I have this:
cc spend
1 100
1 200
1 550
1 100
1 200
1 550
1 100
2 200
2 550
2 200
2 200
2 550
2 200
2 900
3 750
3 550
3 1300
3 1900
3 750
I tried this but this is not giving me the correct results:
proc sql;
Select CC,(select max(spend)from work.have
where spend NOT IN(select max(spend) from work.have)) as Second_Max_spend from work.have
group by CC;
quit;
Expcted Results
CC Spend
1 550
1 200
2 900
2 550
3 1900
3 1300
Thanks!
But if you insist on SQL...
proc sql;
select unique cc, spend
from have as a
where (select count(distinct spend) from have where cc=a.cc and spend > a.spend) < 2
order by cc, spend desc;
quit;
Proc SQL has the OUTOBS option for this:
proc sql outobs=2;
select cc, max(spend) as max_spend
from work.have
group by cc
order by calculated max_spend desc
;
quit;
Hope this helps,
- Jan.
Hey,
It just did not help, expected result is something different , I need to figure out what are top
two spend each customer made....(OUTBS is not the helping, I just dont' want only two Obs).
Thanks for your reply
Thanks!
Dharmndra
It's easier if you use a data step here, rather than SQL. If you were doing it in SQL you would need an OVER clause that's not supported in SAS.
Use the BY group feature to identify each group, create a counter for the observations and then ouptut the records that are the top 2.
proc sort data=have;
by id descending spend;
run;
data want;
set have;
by id;
retain count;
if first.id then count=0;
count+1;
if count <=2 then output;
run;
Hi @LittlesasMaster,
Here's a pure PROC SQL approach:
proc sql;
create view top_one as
select cc, max(spend) as spend
from have
group by cc;
select * from top_one
union all
select cc, max(spend) as spend
from (select * from have except all select * from top_one)
group by cc
order by cc, spend desc;
quit;
Please note that customer 1 has two records with SPEND=550 and the above code selects these two. If you want to ignore duplicates in order to obtain what you posted as your "expected results" (i.e. SPEND=200 as the second highest amount for CC=1), simply delete the all keyword in "except all".
Thank you so much this helped me.
Proc rank can prove useful here
proc rank data=have descending ties=dense out=want(where=(rank<=2));
by cc;
var spend;
ranks rank;
run;
proc sort data=want(drop=rank) nodupkey; by cc descending spend; run;
proc print; run;
Otherwise, use normal sort and BY processing or nodupkey sort and a simpler data step to get unique values
proc sort data=have out=temp; by cc descending spend; run;
data want;
set temp; by cc descending spend;
if first.cc then count = 0;
if count < 2 then
if first.spend then do;
output;
count + 1;
end;
drop count;
run;
proc print; run;
/* Or... */
proc sort data=have out=temp2 nodupkey; by cc descending spend; run;
data want2;
set temp2; by cc;
if first.cc then count = 0;
if count < 2 then do;
output;
count + 1;
end;
drop count;
run;
proc print; run;
But if you insist on SQL...
proc sql;
select unique cc, spend
from have as a
where (select count(distinct spend) from have where cc=a.cc and spend > a.spend) < 2
order by cc, spend desc;
quit;
Thank you so much this helped me to understand the concept .
Thank you, this Rank was new for me
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.