- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm trying to select the top 1 record belonging to each group in proc sql, but not been able to identify a correct way to do it in proc sql. I'm trying to find top account tied to the store with transactions and total_sales ordered in descending manner. In simple words, one account may be tied to multiple stores, so within each account, I'm trying to filter the top record with highest transactions and sales.
I have an excel file which I'm importing using proc import, but providing a sample table below as an example.
account | store_num | transactions | sales |
101 | 1000 | 3 | 90 |
101 | 1000 | 6 | 170 |
105 | 1015 | 5 | 156 |
102 | 1234 | 9 | 200 |
101 | 1234 | 7 | 189 |
102 | 1234 | 11 | 267 |
105 | 1030 | 5 | 170 |
I'm initially grouping the account and store_num by sum of transactions and sales in descending order and then used PROC RANK to assign rank.
```
proc import datafile="/myfolder/eg.xlsx"
out=fileg dbms=xlsx replace;
run;
proc sql;
create table tempp as
Select account, store_num, sum(transactions) as total_Transactions, sum(sales) as Total_Sales,
from fileg
group by account, store_num
order by total_transactions desc, Total_Sales desc;
quit;
proc sql;
create table main2 as
select * from tempp
order by account;
quit;
proc rank data=main2 out=result ties=dense descending;
by account;
var total_transactions;
ranks transRank;
run;
proc print data=result n;
run;
```
The output of 'result' looks like the one below:
account | store_num | total_transactions | total_sales | transRank |
101 | 1234 | 7 | 189 | 2 |
101 | 1000 | 9 | 260 | 1 |
102 | 1234 | 20 | 467 | 1 |
105 | 1015 | 5 | 156 | 1 |
105 | 1030 | 5 | 170 | 1 |
I want my final table to look like the one below so that it ranks and filter outs the first record for each account having highest transactions. In case of a tie in the rank due to same transactions, the highest sales will be used to provide the rank.
account | store_num | total_sales | total_transactions |
102 | 1234 | 467 | 20 |
101 | 1000 | 260 | 9 |
105 | 1030 | 170 | 5 |
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
SAS SQL doesn't provide window/analytics functions so any selection based on sort order is not that easy to implement.
With SAS I'd go for a combination of SQL and data step (option 1). It can also be done with a nested SLQ but it's not "pretty" (option 2).
data have;
input account store_num transactions sales;
datalines;
101 1000 3 90
101 1000 6 170
101 1234 7 189
102 1234 9 200
102 1234 11 267
105 1030 5 170
105 1015 5 156
;
/* option 1 */
proc sql;
create table inter as
select
account,
store_num,
sum(sales) as total_sales,
sum(transactions) as total_transactions
from have
group by account, store_num
order by account, total_transactions, total_sales
;
quit;
data want1;
set inter;
by account;
if last.account;
run;
/* option 2 */
proc sql;
create table want2 as
select *
from
(
select *
from
(
select
account,
store_num,
sum(sales) as total_sales,
sum(transactions) as total_transactions
from have
group by account, store_num
)
group by account
having max(total_transactions)=total_transactions
)
group by account
having max(total_sales)=total_sales
;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please carefully define, or provide starting data and result, exactly what you mean by "top account tied to the store".
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
SAS SQL doesn't provide window/analytics functions so any selection based on sort order is not that easy to implement.
With SAS I'd go for a combination of SQL and data step (option 1). It can also be done with a nested SLQ but it's not "pretty" (option 2).
data have;
input account store_num transactions sales;
datalines;
101 1000 3 90
101 1000 6 170
101 1234 7 189
102 1234 9 200
102 1234 11 267
105 1030 5 170
105 1015 5 156
;
/* option 1 */
proc sql;
create table inter as
select
account,
store_num,
sum(sales) as total_sales,
sum(transactions) as total_transactions
from have
group by account, store_num
order by account, total_transactions, total_sales
;
quit;
data want1;
set inter;
by account;
if last.account;
run;
/* option 2 */
proc sql;
create table want2 as
select *
from
(
select *
from
(
select
account,
store_num,
sum(sales) as total_sales,
sum(transactions) as total_transactions
from have
group by account, store_num
)
group by account
having max(total_transactions)=total_transactions
)
group by account
having max(total_sales)=total_sales
;
quit;