BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MariaD
Barite | Level 11

Hi all,

 

I have the following situation:

 

Id    Variable1    Variable 2     Variable 3

A         33              45                   21

B         13              18                   25

 

And I like to know the order of the value of variable 1, for each row, compared with variable 2 and 3. The expected results will be:

 

Id    Variable1    Variable 2     Variable 3      Order1      Order2    Order 3

A         33              45                   21                2               1             3          

B         13              18                   25                1               2             3

 

Thats means, for Id "A", the variable 1 is the second largest compared with variable 2 and 3, so the new variable Order1 has value 2. The variable 2, has the greatest value, so Order2 is 1, etc.

 

Any suggestions?

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

You can also do it using a datastep. e.g.:

data have;
  input Id$ Variable1-Variable3;
  datalines;
A 33 45 21
B 13 18 25
;

data want (drop=i);
  set have;
  array have(3) var:;
  array sorted(3) _temporary_;
  array order(3);
  do i=1 to 3;
    sorted(i)=have(i);
  end;
  call sortn(sorted(3),sorted(2),sorted(1));
  do i=1 to 3;
    order(i)=whichn(sorted(i),of have(*));
  end;
run;

Art, CEO, AnalystFinder.com

 

View solution in original post

13 REPLIES 13
PaigeMiller
Diamond | Level 26

This is very simple if you do a transpose of your original data, and then run PROC RANK to get the orders.

--
Paige Miller
PeterClemmensen
Tourmaline | Level 20
data have;
input Id$ Variable1-Variable3;
datalines;
A 33 45 21
B 13 18 25
;

proc transpose data=have out=have_long(rename=(_NAME_=variable COL1=value));
by ID;
run;

proc rank data=have_long out=want;
by ID;
var value;
ranks order;
run;
novinosrin
Tourmaline | Level 20

Neat suggestion by @PaigeMiller and demo by @PeterClemmensen, However you may need another transpose after proc rank step that is if we(the community) want to meet OP's requirement. 

 

MariaD
Barite | Level 11

Thanks! That's correct,  I'm looking for some solutions that not require applied more than one transpose.

novinosrin
Tourmaline | Level 20

Well @art297 the genius always has it slick before ordinary blokes like me could provide one.Take his and often read his posts I think hall of famers like him should pariticipate a bit slowly. lol hahahaha All the Best!

PaigeMiller
Diamond | Level 26

@MariaD wrote:

Thanks! That's correct,  I'm looking for some solutions that not require applied more than one transpose.


It's not really clear why you don't want to transpose your data twice, it's relatively safe and quick (unless you have a HUGE data set); this is also much simpler (in my opinion) that the brilliant solution from @art297, and has the benefit that the algorithms used are thoroughly tested and debugged, and also has the benefit that you can specify how ties are handled. I usually advise people to use built-in solutions (e.g. SAS PROCs), than trying to write your own code and then confirming that you got it right and it handles all situations properly.

--
Paige Miller
MariaD
Barite | Level 11

First at all, thanks so much all of you for greats ideas.

 

Yes, @PaigeMiller, I have a huge table (hundred of million of records) so I like to consider all possible options to choose the most optimal for it.

 

PaigeMiller
Diamond | Level 26

Well, @MariaD, having thought about it some more, I would think that a PROC would still handle HUGE datasets faster than data step code. But I admit I don't know that this is the right answer in your case. I guess the only way to find out is to try it.

--
Paige Miller
art297
Opal | Level 21

@PaigeMiller: I have to disagree! Datastep code, in my experience, almost always works faster than a proc.

 

Since I'm only using SAS UE these days I can't really see the effects on a "large" dataset, but below is an example on just a 2 million record file (expanded from the original example).

 

As you can see, the whole process only took .45 seconds using datastep code, while just the initial proc transpose took 1.34 seconds. 

 

Also, for comparison purposes, I included a conversion from wide to long using the %untranspose macro (http://www.sascommunity.org/wiki/An_Easier_and_Faster_Way_to_Untranspose_a_Wide_File .. which uses datastep code) that we'll be presenting at this year's SGF. As you'll see, if you run the test code, it does the untransposition task in just .16 seconds.

 

Art, CEO, AnalystFinder.com

/*time comparisons on a 2 million record file*/
data have (drop=i); input Variable1-Variable3; if _n_ eq 1 then do; do i=1 to 1000000; id=put(i,8.); output; end; end; else do; do i=1000001 to 2000000; id=put(i,8.); output; end; end; datalines; 33 45 21 13 18 25 ; proc transpose data=have out=need (rename=(Col1=value)); by id; var variable1-variable3; run; /* cpu time: 1.34 seconds */ /* using the %untranspose macro (http://www.sascommunity.org/wiki/An_Easier_and_Faster_Way_to_Untranspose_a_Wide_File )*/ %untranspose(data=have, out=need, var=variable, id=position,by=id) /* cpu time: 0.16 seconds */ data want (drop=i); set have; array have(3) var:; array sorted(3) _temporary_; array order(3); do i=1 to 3; sorted(i)=have(i); end; call sortn(sorted(3),sorted(2),sorted(1)); do i=1 to 3; order(i)=whichn(sorted(i),of have(*)); end; run; /* cpu time: 0.45 seconds */
PaigeMiller
Diamond | Level 26

That's good to know!

--
Paige Miller
art297
Opal | Level 21

You can also do it using a datastep. e.g.:

data have;
  input Id$ Variable1-Variable3;
  datalines;
A 33 45 21
B 13 18 25
;

data want (drop=i);
  set have;
  array have(3) var:;
  array sorted(3) _temporary_;
  array order(3);
  do i=1 to 3;
    sorted(i)=have(i);
  end;
  call sortn(sorted(3),sorted(2),sorted(1));
  do i=1 to 3;
    order(i)=whichn(sorted(i),of have(*));
  end;
run;

Art, CEO, AnalystFinder.com

 

Astounding
PROC Star

Here's a simple approach.  It's nowhere near as flexible as @art297's program (since it only handles exactly 3 incoming variables).  But it's simple enough that a junior programmer can understand it.

 

data want;

set have;

order1=2;

order2=2;

order3=2;

if variable1 = max(of variable1-variable3) then order1=1;

else if variable1 = min(of variable1-variable3) then order1=3;

if variable2 = max(of variable1-variable3) then order2=1;

else if variable2 = min(of variable1-variable3) then order2=3;

if variable3 = max(of variable1-variable3) then order3=1;

else if variable3 = min(of variable1-variable3) then order3=3;

run;

Ksharp
Super User

It is really easy for IML.

 

data have;
input Id$ Variable1-Variable3;
datalines;
A 33 45 21
B 13 18 25
;

proc iml;
use have nobs nobs;
read all var _num_ into  x;
close;
rank=j(nrow(x),ncol(x));
do i=1 to nobs;
rank[i,]=ranktie(x[i,],'dense');
end;
create temp from rank;
append from rank;
close;
quit;

data want;
 merge have temp;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1508 views
  • 4 likes
  • 7 in conversation