BookmarkSubscribeRSS Feed
Shonesum
Calcite | Level 5

Hello all, 

How can I add a column into dataset based on 1 column value.  How can I divide each rank into 2 splits or 3 splits. If I used Proc rank. it just splits the dataset into 2/3 equal portion. Is there a way to split the each unique ranks into 2/3 split. In the example below I have splited into 2 splits. 

 

Have:

Test1Test2Rank
Appledd1
Balldd1
Catdd1
Dogdd1
Elephantdd2
dsddd2
sdfdd2
asdfe3
asdfe3
asdfe3
adsfe3

 

want:

Test1Test2RankDesired
Appledd11
Balldd11
Catdd12
Dogdd12
Elephantdd21
dsddd22
sdfdd22
asdfe31
asdfe31
asdfe32
adsfe32

 

 

proc rank data = have out = want group =2;

var = rank;

ranks =desired;

quit;

 

This code gave me :

 

Test1Test2RankDesired
Appledd10
Balldd10
Catdd10
Dogdd10
Elephantdd21
dsddd21
sdfdd21
asdfe31
asdfe32
asdfe32
adsfe32

Thank you in advance.

 

 

7 REPLIES 7
PaigeMiller
Diamond | Level 26

@Shonesum wrote:

Hello all, 

How can I add a column into dataset based on 1 column value.  How can I divide each rank into 2 splits or 3 splits. If I used Proc rank. it just splits the dataset into 2/3 equal portion. Is there a way to split the each unique ranks into 2/3 split. In the example below I have splited into 2 splits. 

 

Have:

Test1 Test2 Rank
Apple dd 1
Ball dd 1
Cat dd 1
Dog dd 1
Elephant dd 2
dsd dd 2
sdf dd 2
asdf e 3
asdf e 3
asdf e 3
adsf e 3

 

want:

Test1 Test2 Rank Desired
Apple dd 1 1
Ball dd 1 1
Cat dd 1 2
Dog dd 1 2
Elephant dd 2 1
dsd dd 2 2
sdf dd 2 2
asdf e 3 1
asdf e 3 1
asdf e 3 2
adsf e 3 2

 


What is the rule that decides that sometimes you get 1/2 of the rows as 1 and 1/2 as 2 for some ranks, and 1/3 versus 2/3 for other ranks? What happens if there are 5 observations in a Rank? Is this done at random or non-random and sequentially?

--
Paige Miller
Shonesum
Calcite | Level 5

Hi Paige it is done sequentially. I solved it by using "by" statement in proc rank 🙂 

PaigeMiller
Diamond | Level 26

@Shonesum wrote:

Hi Paige it is done sequentially. I solved it by using "by" statement in proc rank 🙂 


That's great!

 

Could you please share your code in case someone else reads this thread and wants to know the answer?

--
Paige Miller
Shonesum
Calcite | Level 5

Since there was no numeric column, I added col variable  then by rank.

 

data work.have;
set work.test;
col=_n_;
run;

proc rank data = work.have out = work.want group =2;
by rank;
var  col;

ranks desired;

quit;
PaigeMiller
Diamond | Level 26

Nice! Well done! Smiley Happy

--
Paige Miller
KachiM
Rhodochrosite | Level 12

You want the "desired" to start from 1. You also desire 2 or 3 groups within each RANK given in your HAVE data set. Here is a Data Step for SUB SPLITTING. If the RANK numbers is less than 6 I made a split of 2 groups otherwise 3 groups. With your example data set, two groups are made. I have added another 6 rows to show for a split of 3 groups.

 

data have;
input Test1 :$8. Test2 :$2. Rank;
datalines;
Apple    dd 1
Ball     dd 1
Cat      dd 1
Dog      dd 1
Elephant dd 2
dsd      dd 2
sdf      dd 2
asdf     e  3
asdf     e  3
asdf     e  3
adsf     e  3
;
run;

The following program splits into 2 groups:

 


data want;
   do _n_ = 1 by 1 until(last.rank);
      set have;
      by rank;
   end;
   ngroups = ceil(ifN(_n_ >= 6, _n_/3, _n_/2));
   do _n_ = 1 by 1 until(last.rank);
      set have;
      by rank;
      if _n_ <= ngroups then desired = 1;
      else desired + mod(_n_, ngroups);
      output;
   end;
drop ngroups;
run;

The statement:

   ngroups = ceil(ifN(_n_ >= 6, _n_/3, _n_/2));

computes the desired number of rows to be present in each split. The following statements

 

      if _n_ <= ngroups then desired = 1;
      else desired + mod(_n_, ngroups);

creates the values for "desired" based on the number of observations per RANK.

 

Is this what you want?

 

Here is the additional observations. The same program works.

 

data have;
input Test1 :$8. Test2 :$2. Rank;
datalines;
Apple    dd 1
Ball     dd 1
Cat      dd 1
Dog      dd 1
Elephant dd 2
dsd      dd 2
sdf      dd 2
asdf     e  3
asdf     e  3
asdf     e  3
adsf     e  3
aaa      g  4
aaa      g  4
aaa      g  4
aaa      g  4
aaa      g  4
aaa      g  4
;
run;

The output of this:

 

Obs 	Test1 	Test2 	Rank 	desired
1 	Apple 	dd 	1 	1
2 	Ball 	dd 	1 	1
3 	Cat 	dd 	1 	2
4 	Dog 	dd 	1 	2
5 	Elephant 	dd 	2 	1
6 	dsd 	dd 	2 	1
7 	sdf 	dd 	2 	2
8 	asdf 	e 	3 	1
9 	asdf 	e 	3 	1
10 	asdf 	e 	3 	2
11 	adsf 	e 	3 	2
12 	aaa 	g 	4 	1
13 	aaa 	g 	4 	1
14 	aaa 	g 	4 	2
15 	aaa 	g 	4 	2
16 	aaa 	g 	4 	3
17 	aaa 	g 	4 	3

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 859 views
  • 0 likes
  • 3 in conversation