DATA Step, Macro, Functions and more

Creating an indicator variable

Reply
Contributor
Posts: 66

Creating an indicator variable

Hi,

 

I am trying to create an indicator based on the value contributed from the tables while merging

i need to merge the 3 tables in order to generate an output.

 

i need to create an indicator column with the values 1 , 0 in it

 

the output should look as below from the three tables merged.

 

Data TableA;

Input A B C;

1 1 2 3

 2 4 5 6

7 8 9

;

 

Data TableB;

Input A B C;

12 11 2 3

34 4 5 6

1 71 8 9

 

Data Tablec;

Input ID A B C;

1 10 2 3

2 41 5 6

3 7 8 9

ID   tableA  tableB  TABLEC

1      0          1           0

12   1           0           1

3   1            1            1

 

 

I hope i am clear on this

 

Contributor
Posts: 39

Re: Creating an indicator variable

Posted in reply to santosh_pat69

I'm sorry but I do not understand what you are trying to do.  In the first tables, you list only 3 variables but the data shows 4.  Should they all have ID, A, B and C variables like tableC?  What is the meaning of the "indicator" in the results?  How are you merging the tables, on what variables?

Super User
Posts: 11,343

Re: Creating an indicator variable

Posted in reply to santosh_pat69

You can use the DATASET option IN to build indicator variables of the sets that contribute to any given record.

 

data want;

   merge a (in=InA)

              b (in=InB)

              c (in= InC)

  ;

  /* the varaibles InA InB and InC have values of 1/0 depending on whether the associated dataset is contributing to result.

HOWEVER these are temporary variables. So you need to store them*/

   TableA=InA;

    TableB=InB;

    TableC=InC;

run;

 

 

Contributor
Posts: 66

Re: Creating an indicator variable

Hi Super User,

 

 

Thank you !!!

 

I am trying to create a indicator on the dataset

if the value of the By variable is present on the input tables.

 

it would be something like this

if the variable A from tableA has the values of 100, and the Variable A from the tableB also has the value 100 but the table C has a different variable value 104.

i am trying to create the new variables A1 B1 and C1 based on the value of A fro tables A,B and C

the ouput after the merge should similar to

   tableA  |   tableB  |  tableC  | A1 |  B1  | C1|

A |100        |100          |104        |  1  |  1    |   0 |

 

Hope i am clear this time.

 

 

Super User
Posts: 11,343

Re: Creating an indicator variable

Posted in reply to santosh_pat69

Since you don't show anything with a BY statement that bit needs to be shown. Post your merge code.

 

If you are using A as a BY variable with those values then the result you show will not happen as a Variable A in table C with value 104 will not match 100 for Variable A in sets A or B. But the IN variables I mentioned would have the A1 B1 and C1 values you indicate as desired.

(And it may be a better idea to have VAR1 instead of A as there is potential for confusing your variable names with your data set names).

Contributor
Posts: 66

Re: Creating an indicator variable

Hi Super User,

I have tried this and i am close to the desred output.

 

After sorting all the tables by ID, and making the required manipulations on the datasets.

 

 

data M4;*(drop=XXXX_ID);

merge Dept.A_TRN(in=a) Dept.B_TRN(in=b) Dept.c_TRN(in=c) f_TRN(IN=d) ;

By ID;

if a then

do;

 A_TRN='1';

End;

Else

do;

 A_TRN='0';

end;

if b then

do;

 B_TRN = '1';

End;

Else

Do;

 B_TRN = '0';

End;

if c then

do;

 C_TRN = '1';

End;

Else

Do;

 C_TRN = '0';

End;

if d then

do;

 F_TRN = '1';

End;

Else

Do;

 F_TRN = '0';

End;

Run;

 

Super User
Posts: 11,343

Re: Creating an indicator variable

Posted in reply to santosh_pat69

Is there some particular reason you want your indicator variables to be character? They are more cumbersome to work with.

 

You could use to make character values of '0' and '1':

A_TRN= put(a,f1.);

B_TRN= put(b,f1.);

C_TRN= put(c,f1.);

F_TRN= put(d,f1.); <sub-optimal choice using d, in=F would make much more sense

 

Instead of all the If-Then-Else statements

Contributor
Posts: 39

Re: Creating an indicator variable

Posted in reply to santosh_pat69

Is this a fair representation of the sample data where the values for ID in any of the 3 tables are in the set {1,2,3,12,34}?

 

Data TableA;

Input ID A B C;

datalines;

1  1  2  3

2  4  5  6

3  7  8  9

;

Data TableB;

Input ID A B C;

datalines;

12  11 2  3

34   4  5  6

1   71  8  9

;

Data TableC;

Input ID A B C;

datalines;

1  10  2  3

2  41  5  6

3   7   8  9

;

 

If so, what do the values mean in the output:

 

ID tableA tableB tableC

1      0          1          0

12    1          0          1

3      1          1          1

 

Why are only IDs 1,12 and 3 in the output table?  Also, what does 0 mean in the tableA column where ID=1?

 

 

Contributor
Posts: 66

Re: Creating an indicator variable

Posted in reply to santosh_pat69
hi SuperUser,

Thank you for your comments, let me follow your suggestion, the indicator need not be a char it can be either of the num or char.

Will keep you posted on this
Thanks a million
Contributor
Posts: 66

Re: Creating an indicator variable

Posted in reply to santosh_pat69
Hi Jessica

Please ignore my previous comments.

The data was made up at the last minute. The 0 and 1 are used as an indicator 1 to state that the variable is present on tableA if it's not present on the table B or TableC then it should be 0 or vice versa.

I have already achevied this using if then else. Was looking to build this efficiently.
Ask a Question
Discussion stats
  • 9 replies
  • 334 views
  • 0 likes
  • 3 in conversation