BookmarkSubscribeRSS Feed
Linmuxi
Calcite | Level 5

Hi All, I have a dataset with variables Rank1-Rank10. and each Rank variables may varies from 1-5, and there is lots of missing values here.

idRank1Rank2Rank3Rank4Rank5Rank6Rank7Rank8Rank9Rank10
13.....4...
23.3.......
31....14...
43.....2...
53.3......2
6......4.1.

 

what I want is to figure out if there is any same values from Rank1-Rank10 within same row. For example, there is same values for ID 2 ,3 and 5. For those IDs have same values, I want to create a variable called SAME, let same =1. otherwise same= 0

 

can anyone help with this?? Thanks so much!!

7 REPLIES 7
Shmuel
Garnet | Level 18

Try next code:

data want (drop=count1-count5);
  set have;
        array n {*}  count1-count5;
        array r {*}  rank1-rank10;

        same = 0;
        do i=1 to dim(r);
            if r(i) ne . then n(r(i)) +1;
            if n(r(i)) > 1 then do;
               same = 1; 
              
               leave;
           end;
       end;
run;
r_behata
Barite | Level 11
data want;
set have;
array rank rank1-rank10;
array match match1-match10;

do i=1 to dim(rank)-1;
i_=i+1;
	do j=i_ to dim(rank)-1;
	if rank[i] ne . and rank[j] ne . then do;
		if rank[i] eq rank[j] then match[j]=1;		
	end;
	end;
end;

if sum(of match1-match10) ge 1 then same =1;
	else same=0;
	
	drop i j i_ match1-match10;
run;
s_lassen
Meteorite | Level 14

I would do it like this:

data want;
  set have;
  array ranks(*) rank1-rank10;
  same=0;
  do _N_=dim(ranks) to 2 by -1 until(same);
    if not missing(ranks(_N_)) then
      same=whichn(ranks(_N_),of ranks(*))<_N_;
    end;
run;
Ksharp
Super User
data have;
infile cards expandtabs;
input id	Rank1	Rank2	Rank3	Rank4	Rank5	Rank6	Rank7	Rank8	Rank9	Rank10 ;
cards;
1	3	.	.	.	.	.	4	.	.	.
2	3	.	3	.	.	.	.	.	.	.
3	1	.	.	.	.	1	4	.	.	.
4	3	.	.	.	.	.	2	.	.	.
5	3	.	3	.	.	.	.	.	.	2
6	.	.	.	.	.	.	4	.	1	.
;
run;
data want;
 if _n_=1 then do;
  declare hash h();
  h.definekey('k');
  h.definedone();
 end;
set have;
array x{*} rank:;
do i=1 to dim(x);
  if not missing(x{i}) then do;k=x{i};h.ref();end;
end;
same=n(of x{*}) ne h.num_items;
h.clear();
drop i k;
run;
novinosrin
Tourmaline | Level 20
data have;
infile cards expandtabs;
input id	Rank1	Rank2	Rank3	Rank4	Rank5	Rank6	Rank7	Rank8	Rank9	Rank10 ;
cards;
1	3	.	.	.	.	.	4	.	.	.
2	3	.	3	.	.	.	.	.	.	.
3	1	.	.	.	.	1	4	.	.	.
4	3	.	.	.	.	.	2	.	.	.
5	3	.	3	.	.	.	.	.	.	2
6	.	.	.	.	.	.	4	.	1	.
;


data want ;
set have;
 array r [*] rank:;
 array t [ 10] _temporary_ ;
 call pokelong ( (peekclong (addrlong(r[1]), 80)), addrlong(t[1]), 80) ; 
 call sortn(of t(*)); 
 do _n_=whichn(coalesce(of t(*)),of t(*))+1 to dim(t);
 same=t(_n_)=t(_n_-1);
 if same then leave;
 end;
 run;
mkeintz
PROC Star

@novinosrin

 

I like the idea of sorting values, which reduces the task to looking for identical neighboring values.  I presume you put the variables in a temporary array, to avoid sorting the original variables.  But in this case, it's a bit simpler (no poke and peek) to sort the rank variables in place, process them, and then re-read in original order:

 

data have;
  input id Rank1 Rank2 Rank3 Rank4 Rank5 Rank6 Rank7 Rank8 Rank9 Rank10 ;
datalines;
1 3 . . . . . 4 . . . 
2 3 . 3 . . . . . . . 
3 1 . . . . 1 4 . . . 
4 3 . . . . . 2 . . . 
5 3 . 3 . . . . . . 2 
6 . . . . . . 4 . 1 . 
run;

data want (drop=_i);
  set have;
  same=0;
  array rnk {*} rank: ;
  call sortn(of rnk{*});   /* Sort the rank variables */

  /* Now look for identical neighbors */
  do _i=dim(rnk) to 2 by -1 while (same=0 and rnk{_i-1}^=.);
    if rnk{_i}=rnk{_i-1} then same=1;
  end;

  /* Reread the rank variables, in pre-sorted sequence */
  set have (keep=rank:) point=_n_;
run;

 

The do loop starts at the upper bound of the rnk array because all the missing values will be sorted towards the lower bound.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
novinosrin
Tourmaline | Level 20

Thank you Mark. Actually I wanted to reach out to you for a small understanding help on the conditional set statement on the other thread and more important one on a  conditional lag. I will probably do that on Friday assuming it's only fair not to bother you weekdays. Just wanted to let you know I have a request coming through. 

 

Not the 1st or the last that you have helped me speed over the past couple of years , so another one. Have a nice afternoon/evening

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1980 views
  • 4 likes
  • 7 in conversation