BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mscarboncopy
Pyrite | Level 9

Hi

 

I have to rename 5 variables A B C D E (that were selected yes = 1 no = 0) to become one variable X with key values of 1-5 BUT we can have multiple values.

So X can be a 1 or 1,2 or 2 or 1,2,3 or 3 or 1,2,3,4 or 4 or 1,2,3,4,5 or 5.

 

So I thought of doing this:

 

if A =1 then X =1;

if B= 1 then X =2;

if C =1 then X =3;

if D then X =4;

if E= 1 then X =5;

else

if (A =1 and B = 1) then X=1,2;

if (A =1 and B = 1 C =1) then X=1,2,3;

if (A =1 and B= 1 and C =1 and D=1) then X =1,2,3,4;

if (A =1 and B = 1 and C=1 and D=1 and E= 1) then

X=1,2,3,4,5;

Run;

 

but this does not work. Error message:  Expecting an arithmetic operator (where the commas were added)

 

I changed the comma with the word and. It "works" but it keeps only one of the values.

so in  if (A =1 and B= 1 and C =1 and D=1) then X =1 and 2, and 3 and 4

It keeps only one value. 

 

Should I have an array instead ? And if so, how can I tell sas to keep all the values when needed ?

Thank you in advance.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

More than likely, you are better off doing nothing.  Your data is fine as it stands now.

 

If you want to change it, you first have to decide on the values for X.  If X is going to be a number, you can only store a number in it (not a set of numbers).  So what number should represent A and B both 1?  Or would you prefer that X be a character string?  At any rate, it's up to you to decide what the mapping should be between ABCDE and X.  

View solution in original post

14 REPLIES 14
Astounding
PROC Star

More than likely, you are better off doing nothing.  Your data is fine as it stands now.

 

If you want to change it, you first have to decide on the values for X.  If X is going to be a number, you can only store a number in it (not a set of numbers).  So what number should represent A and B both 1?  Or would you prefer that X be a character string?  At any rate, it's up to you to decide what the mapping should be between ABCDE and X.  

Mscarboncopy
Pyrite | Level 9

I agree... but it was not my decision. So now  I have 2 data sets - one with the variables a b c d and e and another with X (and key values all in X representing ABCDE)

And X is supposed to be numeric, so I can't change it to a string.

I think I will do what you suggested, create a number for the combinations AB - AC - AD - AE then BC BD BE and CD CE and DE and all the other combinations as I gave in my example. 

Thank you ! 

Mscarboncopy
Pyrite | Level 9

Now I have another question. If I am creating values for the combinations (as I think is the best solution), my second data file that has only one Var X with multiple numbers must also be renamed (with the keys that would correspond to the combinations). 

 

Sas is saying that X has an incorrect numeric variable for the entries (obs) with 1,3 or 2,3 and so on.

 

The renaming commands below do not work:

 

if (X =1 and X= 2) then X=6;

 

or 

 

if x=1,2 then x=6

 

Sas does not recognize 1,2 as a correct number or the way I broke it apart in the first example. 

 

Would an array work better here then ? If so, how can I work with these values separated by commas in var X?

 

My second data set looks like this (in SPSS):

 

Obs  VAR X

1       3

2       1,3
3       2,5
4       1,2,3,4

5       1

6       1,3

 

etc 

 

and it needs to look like

Obs  VAR X

1       3

2       7
3       15
4       11

5       1

6       7

 

Thanks again

Reeza
Super User
Do you need a decimal number? Tom's solution shows a method to generate a unique number for each combination already, the number variable which will be unique for every combination because of the way binary values work. Test it out.
Mscarboncopy
Pyrite | Level 9

Yes... I thought of that also, but the example he gave was for variables a b c d e (my original question)

I need to do this for only one variable now - and that is variable X that has multiple values 1 to 5.

I am not sure how to adapt his suggestion to this new scenario. 

a b c d e are not variables in this case, I have values 1 2 3 4 5 but all in one variable X that must be separated first and then renamed into a new value.

 

 

Reeza
Super User
Why not back up a step and do it in the same step while you have the ABCDE variables or separate them out again using a loop and array, with SCAN().
Mscarboncopy
Pyrite | Level 9

They are in 2 different data sets so they are independent (the file that has the different 5 variables AND the file that has one var X with 5 key values). I am trying to rename  them both as one VAR X (because they are the same variable, only collected differently) and I need to do this so the variable X in the those 2 files match and I can merge them.

 

How would I separate the values out of X using a loop and array, with SCAN() ? 

That is what I am thinking I need to do, but I've never done this before.

Thank you ! 

Tom
Super User Tom
Super User

So 1,2 is NOT A number.  As a concept it is a list of numbers but it is not A number so it cannot be stored into a single numeric field.  The only way you have values like that in SAS is by storing it as a string.

data have;
  input string_x $20.;
cards;
3
1,3
2,5
1,2,3,4
1
1,3
;

You seem to want to create a new NUMBER that can map the string.

Why to you want a number?  The number would just represent the string and probably in a much less clear way.

You are never going to use it as a number.  You could not do arithmetic with the numbers that represents 1,2,3,4 and the number that represents 1,3.  

There might be some value in converting the list of numbers back into the 0/1 boolean numbers you had before.

data want ;
  set have;
  array flags flag1-flag5 ;
  do i=1 to dim(flags);
    if findw(string_x,cats(i),',','sit') then flags[i]=1;
    else flags[i]=0;
  end;
  drop i;
run;

Result:

Obs    string_x    flag1    flag2    flag3    flag4    flag5

 1     3             0        0        1        0        0
 2     1,3           1        0        1        0        0
 3     2,5           0        1        0        0        1
 4     1,2,3,4       1        1        1        1        0
 5     1             1        0        0        0        0
 6     1,3           1        0        1        0        0

Or you could convert it into multiple observations. Then you could have a numeric variable since any single observation would store only one number.

data want2;
  set have; 
  do flag_no=1 to countw(string_x,',');
    flag = input(scan(string_x,flag_no,','),32.);
    output;
  end;
run;

Result:

Obs    string_x    flag_no    flag

  1    3              1         3
  2    1,3            1         1
  3    1,3            2         3
  4    2,5            1         2
  5    2,5            2         5
  6    1,2,3,4        1         1
  7    1,2,3,4        2         2
  8    1,2,3,4        3         3
  9    1,2,3,4        4         4
 10    1              1         1
 11    1,3            1         1
 12    1,3            2         3

 

Mscarboncopy
Pyrite | Level 9

Hi Tom,

The flags work well for what I want to do.

Thank you so much.

 

Tom
Super User Tom
Super User

You could store the values into a single character string.  Or make a number using powers of 2.

data test;
  input a b c d e ;
  length char $5 ;
  char = cats(of a b c d e);
  number=input(char,binary5.);
cards;
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
0 0 0 0 0
1 0 1 0 0
0 0 0 1 1
;
Obs    a    b    c    d    e    char     number

 1     1    0    0    0    0    10000      16
 2     0    1    0    0    0    01000       8
 3     0    0    1    0    0    00100       4
 4     0    0    0    1    0    00010       2
 5     0    0    0    0    1    00001       1
 6     0    0    0    0    0    00000       0
 7     1    0    1    0    0    10100      20
 8     0    0    0    1    1    00011       3
Reeza
Super User
X = catx(', ', A, B, C, D, E)
ballardw
Super User

Just for giggles, exactly what analysis, use or reporting are you going to do on this X variable?

 

Oh, are any of the A, B, etc ever missing? If so, how do you want to use that information?

 

 

 

Mscarboncopy
Pyrite | Level 9

They are not ever missing (it is either 1 or 0,  0 is not "missing" instead it is "source not used"). This variable is giving information about "sources" of documents used in one of our data sets.

So we can have all 5 sources or any combination of the 5, including only one, but X is never missing, we have to have at least one source for each data entry.

All we want to know is the frequency of "sources" we have in our data, what is the source most used. 

Thanks.

ballardw
Super User

@Mscarboncopy wrote:

They are not ever missing (it is either 1 or 0,  0 is not "missing" instead it is "source not used"). This variable is giving information about "sources" of documents used in one of our data sets.

So we can have all 5 sources or any combination of the 5, including only one, but X is never missing, we have to have at least one source for each data entry.

All we want to know is the frequency of "sources" we have in our data, what is the source most used. 

Thanks.


The sum of each of your variables A to E is the number of times it is used as source. So perhaps NOTHING is needed to answer that question except a summary procedure that will some those values.

data test;
  input a b c d e ;
  length char $5 ;
  char = cats(of a b c d e);
  number=input(char,binary5.);
cards;
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
0 0 0 0 0
1 0 1 0 0
0 0 0 1 1
;

proc means data=test sum;
var a b c d e;
run;

If you have other variables that you need to consider groups such as "agency" or "topic" or similar, those variables would go into a CLASS statement.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 14 replies
  • 3101 views
  • 2 likes
  • 5 in conversation