SAS Programming

SASPhile · Posted 10-27-2012 10:43 AM

there are 10 ids and scores,with varying records for each id. how to get top 3 records for each id based on score?

there is a function "TOP" in other SQL's, what is the equivalent in SAS?

alandool · Posted 03-23-2013 11:32 AM

Editor's Note: Thanks to all who contributed, showing that there are mulitple ways to get the same results in SAS.

Here's my take on this. A mix of proc sql and data step.

Alan

data have;
input id $ score @@;
datalines;
S12 -20 S12 20 S12 10
S12 10 S12 20 S12 -50
S13 -40 S13 60 S13 -100
S14 10 S14 20 S14 0
S14 -30 S14 -20
;

/* change as needed to bring back desired ranking */
%let start= 1;
%let end = 3;

/* create table containing distinct id and value -- needed b/c more than one id can have the same score */
proc sql;
create table dst_id_score as
select distinct id, score from have
order by id, score desc; /* keep this sort order for ranking */
* NOTE: Table WORK.DST_ID_SCORE created, with 12 rows and 2 columns. ;

/* assign ranking by id by score */
data temp;
set dst_id_score ;
by id;
if first.id then rnk=0;
rnk + 1;
run;
* NOTE: The data set WORK.TEMP has 12 observations and 3 variables. ;

/* go back to original file and bring back all data meeting desired ranking */
proc sql;
create table final_top_ranked as

select
a.id
, a.score
, b.rnk as rank
from have a inner join temp b on a.id = b.id and a.score = b.score
where rnk between &start and &end
order by id, rnk;
* NOTE: Table WORK.FINAL_TOP_RANKED created, with 11 rows and 3 columns. ;

View solution in original post

FriedEgg · Posted 10-27-2012 12:23 PM

data want;

do _n_=1 by 1 until(last.id);

set have;

by id;

if _n_ <= 3 then output;

end;

run;

art297 · Posted 10-27-2012 12:45 PM

Of course that assumes that the data is pre-sorted and that ties aren't relevant.

FriedEgg · Posted 10-28-2012 04:24 PM

Yes, it relies upon the data being presorted, unlike sql, however the TOP clause itself is indiscriminate about and type of rank or ties, so that shouldn't be an issue to repeat functionality.

Haikuo · Posted 10-28-2012 03:41 PM

If still choose SQL (SAS SQL in this case) and include ties:

data have;

input id $ value;

cards;

S12 -20

S12 20

S12 10

S12 20

S12 -50

S13 -40

S13 60

S13 -100

S14 10

S14 20

S14 0

S14 -30

S14 -20

;

proc sql;

select id,value from

(select distinct b.id,b.value, a.value as avalue from have a

left join have b

on a.id=b.id)

group by id,value

having sum(avalue>=value)<=3;

quit;

Haikuo

art297 · Posted 10-28-2012 06:30 PM

I don't think that works as intended.

You still haven't marked this question as being answered, thus everyone will still keep trying to come up with solutions. I really think you are best off doing this in either a datastep or using a proc like proc rank. However, since you asked how you might be able to do it using proc sql, here is a proc sql solution that takes ties into consideration (i.e., if there is a tie for the 3rd position, it will provide all records that have a value greater than of equal to the tied value).

Unfortunately, it has to be modified to account for the specific ids of interest. I used Haikuo's example dataset, but added one record for testing purposes:

data have;

input id $ value;

cards;

S12 -20

S12 20

S12 10

S12 20

S12 -50

S13 -40

S13 15

S13 60

S13 -100

S14 10

S14 20

S14 0

S14 -30

S14 -20

;

proc sql;

create table temp as

select *

from have

where id="S12"

order by value desc

;

create table S12 as

select id, value as bottom_value

from temp (firstobs=3 obs=3)

;

create table temp as

select *

from have

where id="S13"

order by value desc

;

create table S13 as

select id, value as bottom_value

from temp (firstobs=3 obs=3)

;

create table temp as

select *

from have

where id="S14"

order by value desc

;

create table S14 as

select id, value as bottom_value

from temp (firstobs=3 obs=3)

;

create table criteria as

select * from S12

union all select * from S13

union all select * from S14

;

create table want (drop=count) as

select a.*,count(*) as count

from have a

left join criteria b

on a.id=b.id

group by a.id

having a.value>=b.bottom_value

order by id, value desc

;

quit;

FriedEgg · Posted 10-28-2012 07:17 PM

data have;

input id $ value @@;

cards;

S12 -20 S12 20 S12 10

S12 10 S12 20 S12 -50

S13 -40 S13 60 S13 -100

S14 10 S14 20 S14 0

S14 -30 S14 -20

;

proc means data=have nway noprint;

class id;

var value;

output out=top3arr(drop=_:) idgroup ( max(value) out[3] (value)=) /autolabel autoname;

run;

proc transpose data=top3arr out=top3list(drop=_: rename=(col1=value));

by id;

var value:;

run;

proc sql;

drop table top3arr;

quit;

ID	VALUE
S12	20
S12	20
S12	10
S13	60
S13	-40
S13	-100
S14	20
S14	10
S14	0

PGStats · Posted 10-28-2012 09:39 PM

With proc rank you get some control over the way ties are treated :

data have;
input id $ score @@;
datalines;
S12 -20 S12 20 S12 10
S12 10 S12 20 S12 -50
S13 -40 S13 60 S13 -100
S14 10 S14 20 S14 0
S14 -30 S14 -20
;

proc rank data=have descending ties=low /*ties=dense*/
out=want(where=(order<=3));
by id notsorted;
var score; ranks order;
run;

PG

Haikuo · Posted 10-28-2012 10:02 PM

Agree with Art, SQL is not really optimal for this kind of task. Comparing to PG's clean-cut proc rank, Data step will have same amount of control, if not more:

data have;

input id $ score @@;

datalines;

S12 -20 S12 20 S12 10

S12 10 S12 20 S12 -50

S13 -40 S13 60 S13 -100

S14 10 S14 20 S14 0

S14 -30 S14 -20

;

proc sort data=have;

by id descending value;

run;

data want;

do _n_=1 by 1 until (last.id);

set have nobs=nobs;

by id descending value;

if first.id then _f=nobs;

_n+ifn(value=lag(value),0,1);

_n=ifn(first.id,1,_n);

_f=ifn(_n_=3,_n,_f) ;

if _n<=_f then output;

end;

drop _:;

run;

proc print;run;

alandool · Posted 03-23-2013 11:32 AM

Editor's Note: Thanks to all who contributed, showing that there are mulitple ways to get the same results in SAS.

Here's my take on this. A mix of proc sql and data step.

Alan

data have;
input id $ score @@;
datalines;
S12 -20 S12 20 S12 10
S12 10 S12 20 S12 -50
S13 -40 S13 60 S13 -100
S14 10 S14 20 S14 0
S14 -30 S14 -20
;

/* change as needed to bring back desired ranking */
%let start= 1;
%let end = 3;

/* create table containing distinct id and value -- needed b/c more than one id can have the same score */
proc sql;
create table dst_id_score as
select distinct id, score from have
order by id, score desc; /* keep this sort order for ranking */
* NOTE: Table WORK.DST_ID_SCORE created, with 12 rows and 2 columns. ;

/* assign ranking by id by score */
data temp;
set dst_id_score ;
by id;
if first.id then rnk=0;
rnk + 1;
run;
* NOTE: The data set WORK.TEMP has 12 observations and 3 variables. ;

/* go back to original file and bring back all data meeting desired ranking */
proc sql;
create table final_top_ranked as

select
a.id
, a.score
, b.rnk as rank
from have a inner join temp b on a.id = b.id and a.score = b.score
where rnk between &start and &end
order by id, rnk;
* NOTE: Table WORK.FINAL_TOP_RANKED created, with 11 rows and 3 columns. ;

neekum · Posted 04-08-2016 11:56 PM

Great solution. It is transparent, and that is critical for debugging and portability.

neekum · Posted 04-08-2016 11:58 PM

I meant, great solution by alandool. It totally matches my coding style, where I spell out each and every item explicitly for QA and later use. Kudos!

SAS Programming

Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Re: Select top 3 records for each id

Top 3 Reasons Bookmarks are an Asset

Unveiling the Latest Upgrades: SAS Visual Analytics Empowers the Acade...

Using SAS Viya to Select the British & Irish Lions Rugby Team: Part 3

selecting records, ranked variables

Serverless Operations in AWS, part 3: Schedule and Monitor

Follow Us

What is...

SAS Programming

Register Today!

SAS Training: Just a Click Away

Follow Us

What is...