Allocating numbers from "-n" to "+n" while ranking a variable by group

Accepted Solution Solved
Reply
Contributor
Posts: 33
Accepted Solution

Allocating numbers from "-n" to "+n" while ranking a variable by group

[ Edited ]

Hi

 

I have a dataset with millions of observations for several IDs. I have given a small example below to narate the issue.

 

Say, I have following dataset where each ID has observations (sorted) ranging from negative to positive values. I need to allocate ranks (smallest to largest) to these observations "By ID", the ranks must be ranging from -n to +n as shown below.

 

data have; 
  input id year ret;
  datalines;
45  2005  -0.30
45  2006   0.28
45  2007  -0.08
48  2011  -0.01
48  2012   0.78
48  2013   0.54
49  2007   0.15
49  2008  -0.06
49  2009   0.3
49  2010  -0.06
;
run;
proc sort data=have;
by id ret;
run;
data want; 
  input id year ret counter;
  datalines;
45  2005  -0.30    -2
45  2007  -0.08    -1
45  2006   0.28     1
48  2011  -0.01    -1
48  2013   0.54     1
48  2012   0.78     2
49  2008  -0.06    -2
49  2010  -0.06    -1
49  2007   0.15     1
49 2009 0.3 2
; run;

Please help me fin finding codes for:

1- Finding number of negative and positive values in each ID.

2- Allocating rank counter from smallest to largest.

 

Thanks.


Accepted Solutions
Solution
‎10-05-2017 05:59 PM
Super User
Posts: 5,987

Re: Allocating numbers from "-n" to "+n" while ranking a variable by group

While I would normally program this just the way @mkeintz suggested, it sounds like that's difficult for you.  So I'll present an easier approach:

 

proc sort data=have out=negatives;

by id descending ret;

where ret < 0;

run;

 

proc sort data=have out=positives;

by id ret;

where ret > 0;

run;

 

data negatives;

set negatives;

by id;

if first.id then counter=-1;

else counter + (-1);

run;

 

data positives;

set positives;

by id;

if first.id then counter=1;

else counter + 1;

run;

 

data want;

set negatives positives;

by id;

run;

 

The order won't be quite right, but you can fix that:

 

proc sort data=want;

by id counter;

run;

View solution in original post


All Replies
Trusted Advisor
Posts: 1,148

Re: Allocating numbers from "-n" to "+n" while ranking a variable by group

This looks like a homework assignment.

 

Why don't you provide a sample dataset and code that you have tried to use, so we can help you figure things out.

 

But as an operational problem, it looks like you could

  1. sort the data smallest to largest value of some variable for each id
  2. For each id
    1. count the number of negatives (N_Neg).
    2. Using that count, assign the "negative ranks" sequentially from -N_neg to -1
    3. For positives assign ranks sequentially starting at 1

 

Contributor
Posts: 33

Re: Allocating numbers from "-n" to "+n" while ranking a variable by group

[ Edited ]

@mkeintz can you provide the code please?

Solution
‎10-05-2017 05:59 PM
Super User
Posts: 5,987

Re: Allocating numbers from "-n" to "+n" while ranking a variable by group

While I would normally program this just the way @mkeintz suggested, it sounds like that's difficult for you.  So I'll present an easier approach:

 

proc sort data=have out=negatives;

by id descending ret;

where ret < 0;

run;

 

proc sort data=have out=positives;

by id ret;

where ret > 0;

run;

 

data negatives;

set negatives;

by id;

if first.id then counter=-1;

else counter + (-1);

run;

 

data positives;

set positives;

by id;

if first.id then counter=1;

else counter + 1;

run;

 

data want;

set negatives positives;

by id;

run;

 

The order won't be quite right, but you can fix that:

 

proc sort data=want;

by id counter;

run;

Contributor
Posts: 33

Re: Allocating numbers from "-n" to "+n" while ranking a variable by group

[ Edited ]
Posted in reply to Astounding

@Astounding Thank you so much for your detailed reply. This is quite helpful.

Frequent Contributor
Posts: 110

Re: Allocating numbers from "-n" to "+n" while ranking a variable by group

Try this.....

 

data NEG WANT;
  set HAVE;
if ret<0 then output NEG;
  else output WANT;
run;

proc sort data=neg;
  by ID descending RET;
run;

data NEG;
  set NEG;
by id;
if first.id then ctr=.;
ctr+1;
run;

proc sql;
  update NEG
  SET CTR=CTR*-1;
quit;


proc sort data=want;
  by id ret;
run;

data want;
  set want;
by id;
if first.id then ctr=.;
ctr+1;
run;

proc append base=want data=neg force;
 run;

proc sort data=want;
  by id ctr;
run;

Hope this helps.

Contributor
Posts: 33

Re: Allocating numbers from "-n" to "+n" while ranking a variable by group

Posted in reply to ShiroAmada

@ShiroAmada Thanks a lot. Really useful.

PROC Star
Posts: 167

Re: Allocating numbers from "-n" to "+n" while ranking a variable by group

@Saba1

As you state that your data has millions of observations, it may be more efficient if you can do the calculation without sorting (assuming that you initial data is sorted by ID and YEAR):

data want;
  array values(100) 8 _temporary_;
  npos=0;
  nneg=0;
  n0=0;
  do _N_=1 by 1 until(last.id);
    set have(keep=id ret);
    by id;
    values(_N_)=ret;
    if ret<0 then
      nneg+1; /* number of negative values */
    else if ret>0 then
      npos+1; /* number of positive values */
    else
      n0+1; /* number of zeroes */
    end;
  call sortn(of values(*));
  do until(last.id);
    set have;
    by id;
    _N_=whichn(ret,of values(*));
    if ret=0 then
      rank=0;
    else if ret<0 then
      rank=_N_-100+npos+n0-1;
    else
      rank=_N_-100+npos;
    values(_N_)=.; /* set missing, so that next record with same value gets next rank */
    output;
    end;
run;

If two years have the same ID and RET value, the program assigns the lowest rank to the earliest record. The array dimension (100) should not be smaller than the number of records for a single ID, as you appeared to have one record per ID and YEAR, I assumed that 100 was a safe value. I added the possibility of zeros, always giving a rank of 0. You may want to take missing values into account also (I would suggest giving them a missing rank, currently they are getting NPOS+N0-100).

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 161 views
  • 0 likes
  • 5 in conversation