SAS Programming

DATA Step, Macro, Functions and more
Linmuxi
Calcite | Level 5

Hi all, I have a dataset with variables Rank1-Rank10. Each Rank variable can take a value from 1 to 5, and there are lots of missing values.

id  Rank1  Rank2  Rank3  Rank4  Rank5  Rank6  Rank7  Rank8  Rank9  Rank10
1   3      .      .      .      .      .      4      .      .      .
2   3      .      3      .      .      .      .      .      .      .
3   1      .      .      .      .      1      4      .      .      .
4   3      .      .      .      .      .      2      .      .      .
5   3      .      3      .      .      .      .      .      .      2
6   .      .      .      .      .      .      4      .      1      .

 

What I want is to find out whether any value is repeated across Rank1-Rank10 within the same row. For example, IDs 2, 3 and 5 each have a repeated value. I want to create a variable called SAME, with same=1 for IDs that have a repeated value and same=0 otherwise.

 

Can anyone help with this? Thanks so much!

7 REPLIES
Shmuel
Garnet | Level 18

Try the following code:

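data want (drop=i count1-count5);
  set have;
  array cnt {*} count1-count5;   /* one counter per possible value 1-5 */
  array r   {*} rank1-rank10;

  /* Reset the counters for every row */
  do i = 1 to dim(cnt);
    cnt(i) = 0;
  end;

  same = 0;
  do i = 1 to dim(r);
    /* Only index the counters with a non-missing value */
    if r(i) ne . then do;
      cnt(r(i)) = cnt(r(i)) + 1;
      if cnt(r(i)) > 1 then do;   /* value already seen in this row */
        same = 1;
        leave;
      end;
    end;
  end;
run;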
r_behata
Barite | Level 11
data want;
  set have;
  array rank rank1-rank10;
  array match match1-match10;

  /* Compare every pair (i, j) with i < j */
  do i = 1 to dim(rank) - 1;
    do j = i + 1 to dim(rank);   /* upper bound includes the last element */
      if rank[i] ne . and rank[j] ne . then do;
        if rank[i] eq rank[j] then match[j] = 1;
      end;
    end;
  end;

  if sum(of match1-match10) ge 1 then same = 1;
  else same = 0;

  drop i j match1-match10;
run;
s_lassen
Meteorite | Level 14

I would do it like this:

data want;
  set have;
  array ranks(*) rank1-rank10;
  same=0;
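  do _N_=dim(ranks) to 2 by -1 until(same);
    /* WHICHN returns the first position holding this value; */
    /* an earlier position than _N_ means the value repeats  */
    if not missing(ranks(_N_)) then
      same=whichn(ranks(_N_),of ranks(*))<_N_;
  end;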
run;
Ksharp
Super User
data have;
infile cards expandtabs;
input id	Rank1	Rank2	Rank3	Rank4	Rank5	Rank6	Rank7	Rank8	Rank9	Rank10 ;
cards;
1	3	.	.	.	.	.	4	.	.	.
2	3	.	3	.	.	.	.	.	.	.
3	1	.	.	.	.	1	4	.	.	.
4	3	.	.	.	.	.	2	.	.	.
5	3	.	3	.	.	.	.	.	.	2
6	.	.	.	.	.	.	4	.	1	.
;
run;
data want;
 if _n_=1 then do;
  declare hash h();
  h.definekey('k');
  h.definedone();
 end;
set have;
array x{*} rank:;
do i=1 to dim(x);
  if not missing(x{i}) then do;k=x{i};h.ref();end;
end;
same=n(of x{*}) ne h.num_items;
h.clear();
drop i k;
run;
novinosrin
Tourmaline | Level 20
data have;
infile cards expandtabs;
input id	Rank1	Rank2	Rank3	Rank4	Rank5	Rank6	Rank7	Rank8	Rank9	Rank10 ;
cards;
1	3	.	.	.	.	.	4	.	.	.
2	3	.	3	.	.	.	.	.	.	.
3	1	.	.	.	.	1	4	.	.	.
4	3	.	.	.	.	.	2	.	.	.
5	3	.	3	.	.	.	.	.	.	2
6	.	.	.	.	.	.	4	.	1	.
;


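data want;
  set have;
  array r [*] rank:;
  array t [10] _temporary_;
  /* Copy the 10 rank values (10 x 8 = 80 bytes, assumed contiguous in memory) into t */
  call pokelong(peekclong(addrlong(r[1]), 80), addrlong(t[1]), 80);
  call sortn(of t(*));   /* missing values sort first */
  same = 0;              /* default; the loop does not run when only one value is non-missing */
  /* Start just after the first non-missing value and compare neighbors */
  do _n_ = whichn(coalesce(of t(*)), of t(*)) + 1 to dim(t);
    same = t(_n_) = t(_n_-1);
    if same then leave;
  end;
run;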
mkeintz
PROC Star

@novinosrin

 

I like the idea of sorting values, which reduces the task to looking for identical neighboring values.  I presume you put the variables in a temporary array, to avoid sorting the original variables.  But in this case, it's a bit simpler (no poke and peek) to sort the rank variables in place, process them, and then re-read in original order:

 

data have;
  input id Rank1 Rank2 Rank3 Rank4 Rank5 Rank6 Rank7 Rank8 Rank9 Rank10 ;
datalines;
1 3 . . . . . 4 . . . 
2 3 . 3 . . . . . . . 
3 1 . . . . 1 4 . . . 
4 3 . . . . . 2 . . . 
5 3 . 3 . . . . . . 2 
6 . . . . . . 4 . 1 . 
run;

data want (drop=_i);
  set have;
  same=0;
  array rnk {*} rank: ;
  call sortn(of rnk{*});   /* Sort the rank variables */

  /* Now look for identical neighbors */
  do _i=dim(rnk) to 2 by -1 while (same=0 and rnk{_i-1}^=.);
    if rnk{_i}=rnk{_i-1} then same=1;
  end;

  /* Reread the rank variables, in pre-sorted sequence */
  set have (keep=rank:) point=_n_;
run;

 

The do loop starts at the upper bound of the rnk array because all the missing values will be sorted towards the lower bound.
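
A minimal sketch of that sort order, using a made-up array v (the names v1-v5 and their values are only for illustration and are not part of the data above):

```sas
/* Illustrative only: v1-v5 and their values are hypothetical */
data _null_;
  array v {5} v1-v5;
  v1 = 3; v3 = 1; v5 = 3;   /* v2 and v4 stay missing */
  call sortn(of v{*});      /* ascending sort; missing values sort lowest */
  put v1= v2= v3= v4= v5=;  /* log: v1=. v2=. v3=1 v4=3 v5=3 */
run;
```

After the sort, the two missing values sit in v1 and v2, and the repeated 3s end up as neighbors in v4 and v5, so scanning down from the top finds them before reaching any missing value.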

novinosrin
Tourmaline | Level 20

Thank you, Mark. Actually, I wanted to reach out to you for some help understanding the conditional SET statement on the other thread and, more importantly, a conditional lag. I will probably do that on Friday, assuming it's only fair not to bother you on weekdays. I just wanted to let you know I have a request coming.

 

It's neither the first nor the last time you have helped me get up to speed over the past couple of years, so here comes another one. Have a nice afternoon/evening.
