Help using Base SAS procedures

Is it possible to leave only largest values in table?

Reply
Occasional Contributor
Posts: 10

Is it possible to leave only largest values in table?

There are two countries, around 50 regions and a lot of different items in my dataset.
What I want to do?
Leave in table only rows with top 3 "sales" for each country, each item and week. In other words, if I have data like this:
Country Region Item Week Sales
LV r1 i1 1 12
LV r2 i1 1 11
LV r3 i1 1 10
LV r4 i1 1 15
LV r5 i1 1 13
......
LT r6 i1 1 12
LT r7 i1 1 10
LT r8 i1 1 11
LT r9 i1 1 18
LT r0 i1 1 20

I want to get such results:
Country Region Item Week Sales
LV r1 i1 1 12
LV r4 i1 1 15
LV r5 i1 1 13
......
LT r6 i1 1 12
LT r9 i1 1 18
LT r0 i1 1 20

Is it possible to do so?

I got a feeling it should be possible with PROC UNIVARIATE, but not sure, how. Message was edited by: efunk
N/A
Posts: 0

Re: Is it possible to leave only largest values in table?

Hello,

One solution is offered by proc means with output out statement, combined with transpose procedure:

[pre]
data have;
input Country $ Region $ Item $ Week Sales ;
datalines;
LV r1 i1 1 12
LV r2 i1 1 11
LV r3 i1 1 10
LV r4 i1 1 15
LV r5 i1 1 13
LT r6 i1 1 12
LT r7 i1 1 10
LT r8 i1 1 11
LT r9 i1 1 18
LT r0 i1 1 20
;

proc means data=have nway noprint;
class Country Item Week;

output out=want_ntr idgroup(max(sales) out[3] (sales)=)/autoname ;

run;

proc transpose data=want_ntr out=want (drop=_name_);
by Country Item Week;

var sales:;

run;
[/pre]

Marius
Occasional Contributor
Posts: 10

Re: Is it possible to leave only largest values in table?

Posted in reply to deleted_user
It is essential for me that region names would be left in the table as I need to know in which region those sales were made.

There's almost zero chance for ties as all sales(currency) in real data have 13 decimal places, for example: 29.6044367170295

Message was edited by: efunk
N/A
Posts: 0

Re: Is it possible to leave only largest values in table?

hello,

this is not difficult. just merge "want" database with initial one. this merging will also solve the tiebreaking situation data _null_ refers at:

[pre]
proc sort data=have out=have_sort;
by Country Item Week descending Sales;
run;

proc sort data=want(rename=(col1=sales)) out=want_sort;
by Country Item Week descending Sales ;
run;

data final;
merge want_sort (in=a) have_sort (in=b);
by Country Item Week descending Sales ;

if a and b;
run;
[/pre]

Marius
Occasional Contributor
Posts: 10

Re: Is it possible to leave only largest values in table?

Posted in reply to deleted_user
Works perfectly. Thank you all for lightning fast support.
Special thanks to Marius.
Respected Advisor
Posts: 3,799

Re: Is it possible to leave only largest values in table?

What do you want in the case of ties?

You could also use the OBS info from the IDGROUP statement to POINT to the obs that you want for the subset.

[pre]
data have;
input (Country Region Item)(:$2.) Week Sales;
cards;
LV r1 i1 1 12
LV r2 i1 1 11
LV r3 i1 1 10
LV r4 i1 1 15
LV r5 i1 1 13
LT r6 i1 1 12
LT r7 i1 1 10
LT r8 i1 1 11
LT r9 i1 1 18
LT r0 i1 1 20
;;;;
run;
proc summary data=have nway;
class country;
output out=top3(keep=_obs_Smiley Happy idgroup(max(sales) obs out[3](sales)=);
run;
proc print;
run;
data need;
set top3;
array obs
  • _obs_:;
    do j=1 to dim(obs);
    point=obs;
    set have point=point;
    output;
    end;
    drop _obs_: j;
    run;
    proc print;
    run;
    [/pre]

    [pre]
    Obs Country Region Item Week Sales

    1 LT r0 i1 1 20
    2 LT r9 i1 1 18
    3 LT r6 i1 1 12
    4 LV r4 i1 1 15
    5 LV r5 i1 1 13
    6 LV r1 i1 1 12 Message was edited by: data _null_;
  • Frequent Contributor
    Posts: 139

    Re: Is it possible to leave only largest values in table?

    Hello efunk

    By sorting your data and using the first./last. programming functionality you can get what you want. See the below code


    data sales;
    length country region item $2 week sales 8;
    input country $ region $ item $ week sales;
    datalines;
    LV r1 i1 1 12
    LV r2 i1 1 11
    LV r3 i1 1 10
    LV r4 i1 1 15
    LV r5 i1 1 13
    LT r6 i1 1 12
    LT r7 i1 1 10
    LT r8 i1 1 11
    LT r9 i1 1 18
    LT r0 i1 1 20
    ;
    run;

    proc sort data=sales ;
    by country descending sales;
    run;

    data what_i_want;
    set sales;
    by country descending sales;
    if first.country then do;
    counter=0;
    end;
    counter +1;

    if counter<=3;
    run;


    D
    Ask a Question
    Discussion stats
    • 6 replies
    • 126 views
    • 0 likes
    • 4 in conversation