Help using Base SAS procedures

How to categorize data columns of multiple entries and combinations in to groups.

Accepted Solution Solved
Reply
Regular Learner
Posts: 1
Accepted Solution

How to categorize data columns of multiple entries and combinations in to groups.

data
3
1,2
2,3,4
3,5,6
1,3,7
8

Can anyone help with a solution for this problem? I have been trying to find a solution for the problem of categorizing multiple entries combinations in to groups.

Problem:

If entry is any number or combination with numbers 1,2,3 then group as A ;

  If entry is any number or combination with numbers 4,5,6 then group as B ;

  If entry is any combination with numbers containing A and B then group as Both;

  If entry is any combination with numbers not containing A and B then group as other or unknown;


My approach is as follows but wish to include combinations of entries.

Data ;

     if  entry = '1,2,3'  then group = "A" ;

     if  entry = '4,5,6'  then group = "B" ; 

     if  entry = '1,2,3' AND  entry = '4,5,6' then group = "Both" ; 

   else  if  entry not in ('1,2,3','4,5,6') then group = "other or unknown" ;

run;


Accepted Solutions
Solution
‎10-20-2013 07:32 PM
Respected Advisor
Posts: 3,124

Re: How to categorize data columns of multiple entries and combinations in to groups.

Here is one thought, lots of hard coding and lack of robustness, but you may work out something better based on it.

data have;

input entry $20.;

cards;

3

1,2

2,3,4

3,5,6

1,3,7

8

;

data want;

  array _t(3) $1.; /* '3' is subjective, you will need to know that or put a bigger number hoping to cover the ground.

               You could also choose to run an additional pass to figure it out.*/

  set have;

  _a=0;_b=0; _ct=countw(entry); length group $8;

    do _i=1 to _ct;

      _t(_i)=scan(entry,_i);

       if _t(_i) in ('1','2','3') then _a=1;

       else if _t(_i) in ('4','5','6') then _b=1;

      end;

      if _a+_b=2 then group='both';

      else if _a+_b=0 then group='unknown';

      else if _a+_b=1 then do;

         if _a=1 then group='a';

         else group='b';

       end;

       drop _:;

      run;

    

Haikuo

View solution in original post


All Replies
Solution
‎10-20-2013 07:32 PM
Respected Advisor
Posts: 3,124

Re: How to categorize data columns of multiple entries and combinations in to groups.

Here is one thought, lots of hard coding and lack of robustness, but you may work out something better based on it.

data have;

input entry $20.;

cards;

3

1,2

2,3,4

3,5,6

1,3,7

8

;

data want;

  array _t(3) $1.; /* '3' is subjective, you will need to know that or put a bigger number hoping to cover the ground.

               You could also choose to run an additional pass to figure it out.*/

  set have;

  _a=0;_b=0; _ct=countw(entry); length group $8;

    do _i=1 to _ct;

      _t(_i)=scan(entry,_i);

       if _t(_i) in ('1','2','3') then _a=1;

       else if _t(_i) in ('4','5','6') then _b=1;

      end;

      if _a+_b=2 then group='both';

      else if _a+_b=0 then group='unknown';

      else if _a+_b=1 then do;

         if _a=1 then group='a';

         else group='b';

       end;

       drop _:;

      run;

    

Haikuo

Contributor
Posts: 69

Re: How to categorize data columns of multiple entries and combinations in to groups.

use a function like INDEX to populate six new binary variables that check for each value 1 - 6. then do some IF statements based on that.

For ex.

flag1-flag6=0;

if index(variable-name,'1') > 0 then flag1=1;

if index(variable-name,'1') > 0 then flag2=1;

etc...

then

if flag1 + flag2 + flag3 > 0 then do;

     if flag4 + flag5 + flag6 = 0 then group = 'A';

     else if...

else...

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 294 views
  • 3 likes
  • 3 in conversation