BookmarkSubscribeRSS Feed
Shalini_K
Calcite | Level 5

Hi All,

My data set has following fields

Customer_ID(Primary key)  MCG_1  MCG_2  MCG_3   MCG_4   MCG_5

The 5 attributes MCG_1 to MCG_5 are row wise ranks based on customer spends in 5 merchant category groups(Shopping/Restaurants etc) also

containing ties and blanks.


Sample dataset:


Cust_ID,MCG_1,MCG_2,MCG_3,MCG_4,MCG_5,

3589,2,5,1,3,4,

1366,1.5,1.5,4,3,5,

6144,,1,2,4,3

I need to create 5 variables in the dataset which will give me variable names based on their ranks;and both the variable names in case of ties(eg 2nd record below)

How can I do it in Base SAS ?

Intended outcome:

Cust_IDRank1Rank2Rank3Rank4Rank5
3589MCG_3MCG_1MCG_4MCG_5MCG_2
1366MCG_1/MCG_2 MCG_4MCG_3MCG_5
6144MCG_2MCG_3MCG_5MCG_4



Thanks,

Shalini

7 REPLIES 7
art297
Opal | Level 21

I think that the following will do what you want:

data want (drop=MCG_1-MCG_5);

  set have;

  array _r(*) MCG_1-MCG_5;

  array rank(5) $11.;

  call missing (of rank(*));

  do _n_=1 to dim(_r);

    if not missing(_r(_n_)) then do;

      if int(_r(_n_)) eq _r(_n_) then rank(_r(_n_))=vname(_r(_n_));

      else if missing(rank(int(_r(_n_)))) then rank(int(_r(_n_)))=

       vname(_r(int(_r(_n_))));

      else  rank(int(_r(_n_)))=catx('/',rank(int(_r(_n_))),

       vname(_r(_n_)));

    end;

  end;

run;

Shalini_K
Calcite | Level 5

Thanks a lot Arthur for your help..

When I increased the attributes from 5 to 11, the code provides incorrect output in case of ties.

For eg:

Cust_ID,MCG_1,MCG_2,MCG_3,MCG_4,MCG_5,MCG_6,MCG_7,MCG_8,MCG_9,MCG_10,MCG_11,

3478,2.5,2.5,2.5,9.5,9.5,8,,2.5,5.5,7,5.5

the code gives below output:

Rank1-(blank), Rank2- MCG_2/MCG_2/MCG_3/MCG_8

Rank3-(blank), Rank4- (blank),

Rank5 - MCG_5/MCG_11, Rank6 - (blank),

Rank7 - MCG_10,  Rank8 - MCG_6,

Rank9 -MCG_9/MCG_5 , Rank10 - (blank),  Rank11 - (blank)

The required outcome is:

Rank1 - (MCG_1/MCG_2/MCG_3/MCG_8 , Rank2- (blank),

Rank3-(blank), Rank4- (blank),

Rank5 - MCG_9/MCG_11, Rank6 - (blank),

Rank7 - MCG_10,  Rank8 - MCG_6, 

Rank9 -MCG_4/MCG_5 , Rank10 - (blank),  Rank11 - (blank)

What can be done to achieve the required outcome?

Tom
Super User Tom
Super User

Restructure data using PROC TRANSPOSE and then use PROC RANK TIES=LOW to convert your ranks into integers that you can use as array indexes.

data have ;

  infile cards dsd truncover ;

  input Cust_ID MCG_1 - MCG_11 ;

cards;

3589,2,5,1,3,4,

1366,1.5,1.5,4,3,5,

6144,,1,2,4,3

3478,2.5,2.5,2.5,9.5,9.5,8,.,2.5,5.5,7,5.5

run;

proc transpose data=have out=vertical ;

  by cust_id notsorted;

  var mcg_: ;

run;

proc rank data=vertical ties=low out=vertical2;

  by cust_id notsorted;

  var col1;

  ranks rank;

run;

data want ;

  set vertical2 ;

  by cust_id notsorted;

  array r $100 rank1-rank11 ;

  retain rank1-rank11;

  keep cust_id rank1-rank11;

  if first.cust_id then call missing(of rank1-rank11);

  if rank then r(rank)=catx('/',r(rank),_name_);

  if last.cust_id ;

run;

data _null_;

  set want ;

  put (_all_) (=/);

run;

Cust_ID=3589

rank1=MCG_3

rank2=MCG_1

rank3=MCG_4

rank4=MCG_5

rank5=MCG_2

rank6=

rank7=

rank8=

rank9=

rank10=

rank11=

Cust_ID=1366

rank1=MCG_1/MCG_2

rank2=

rank3=MCG_4

rank4=MCG_3

rank5=MCG_5

rank6=

rank7=

rank8=

rank9=

rank10=

rank11=

Cust_ID=6144

rank1=MCG_2

rank2=MCG_3

rank3=MCG_5

rank4=MCG_4

rank5=

rank6=

rank7=

rank8=

rank9=

rank10=

rank11=

Cust_ID=3478

rank1=MCG_1/MCG_2/MCG_3/MCG_8

rank2=

rank3=

rank4=

rank5=MCG_9/MCG_11

rank6=

rank7=MCG_10

rank8=MCG_6

rank9=MCG_4/MCG_5

rank10=

rank11=

Shalini_K
Calcite | Level 5

Thanks a lot..Both the codes worked for me

Shalini_K
Calcite | Level 5

HI..

For the above query,I have a slight modification in the requirement in case of ties.

Sample dataset:

Cust_ID,MCG_1,MCG_2,MCG_3,MCG_4,MCG_5,MCG_6,MCG_7,MCG_8,MCG_9,MCG_10,MCG_11,

Sp_1, Sp_2, Sp_3, Sp_4, Sp_5, Sp_6, Sp_7, Sp_8, Sp_9, Sp_10, Sp_11,

3478,2.5,2.5,2.5,9.5,9.5,8,.,2.5,5.5,7,5.5,1000,2000,3000,4000,5000,6000,7000,8000,9000,10000,11000,

3479,1.5,3.5,3.5,8.5,8.5,10,11,1.5,6,6,6,15000,14000,12500,11000,9000,8000,7500,15000,6000,5000,4000


The 11 attributes MCG_1 to MCG_11 are row wise ranks based on customer RFM scores in 11 merchant category groups(Shopping/Restaurants etc) and sp_1 to sp_11 their respective spends($)

The codes shared by Tom & Arthur helped me create 11 new variables which gave me variable names based on their ranks;and both the variable names in case of ties.

In case of ties,instead of  tied variable names separted by '/ ' I now want the variable name of the Max value of corresponding tied spends.

Earlier outcome was:

Cust_IDrank1rank2rank3rank4rank5rank6rank7rank8rank9rank10rank11
3478MCG_1/MCG_2/MCG_3/MCG_8MCG_9/MCG_11MCG_10MCG_6MCG_4/MCG_5
3479MCG_1/MCG_8MCG_2/MCG_3MCG_9/MCG_10/MCG_11MCG_4/MCG_5MCG_6MCG_7


Required outcome:

Cust_IDrank1rank2rank3rank4rank5rank6rank7rank8rank9rank10rank11
3478sp_8sp_3sp_2sp_1sp_11sp_9MCG_10MCG_6sp_5sp_4
3479sp_1sp_8sp_2sp_3sp_9sp_10sp_11sp_4sp_5MCG_6MCG_7


In 2nd record,MCG(1 & 😎 as well as Sp variable is tied for rank 1.Either of the variable name sp_1 or sp_8 will be acceptable in rank1 & rank2 values.

Can this be done in SAS?




Shalini_K
Calcite | Level 5

The 'earlier outcome' & 'required outcome' tables have 20+ columns & may not appear totally on the screen.Hence,I am attaching a csv file

'Reqd_Outcome.csv' for your reference.

art297
Opal | Level 21

You would have to test the following, but I think it does what you want:

data have;

  infile cards dlm=',';

  input Cust_ID MCG_1-MCG_11;

  cards;

3478,2.5,2.5,2.5,9.5,9.5,8,.,2.5,5.5,7,5.5

3479,1.5,3.5,3.5,8.5,8.5,10,11,1.5,6,6,6

;

data want (drop=MCG: counter: _: i j);

  set have;

  array r(*) MCG_1-MCG_11;

  array rank(11) $70.;

  array counter(11);

  array _rank(11);

  call missing (of rank(*));

  call missing (of counter(*));

  call missing (of _rank(*));

  do i=1 to 11;

    if not missing(r(i)) then do;

      do j=1 to 11;

        if r(i)=r(j) then counter(i)+1;

      end;

      if counter(i) eq 1 then _rank(i)=r(i);

      else _rank(i)=round(r(i))-(int(counter(i)/2));

    end;

  end;

  do i=1 to dim(r);

    if not missing(r(i)) then do;

      if r(i) eq _rank(i) then rank(r(i))=vname(r(i));

      else  rank(_rank(i))=catx('/',rank(_rank(i)),vname(r(i)));

    end;

  end;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1156 views
  • 6 likes
  • 3 in conversation