BookmarkSubscribeRSS Feed
santosh_pat69
Quartz | Level 8

Hi,

 

I am trying to create an indicator based on the value contributed from the tables while merging

i need to merge the 3 tables in order to generate an output.

 

i need to create an indicator column with the values 1 , 0 in it

 

the output should look as below from the three tables merged.

 

Data TableA;

Input A B C;

1 1 2 3

 2 4 5 6

7 8 9

;

 

Data TableB;

Input A B C;

12 11 2 3

34 4 5 6

1 71 8 9

 

Data Tablec;

Input ID A B C;

1 10 2 3

2 41 5 6

3 7 8 9

ID   tableA  tableB  TABLEC

1      0          1           0

12   1           0           1

3   1            1            1

 

 

I hope i am clear on this

 

9 REPLIES 9
Pamela_JSRCC
Quartz | Level 8

I'm sorry but I do not understand what you are trying to do.  In the first tables, you list only 3 variables but the data shows 4.  Should they all have ID, A, B and C variables like tableC?  What is the meaning of the "indicator" in the results?  How are you merging the tables, on what variables?

ballardw
Super User

You can use the DATASET option IN to build indicator variables of the sets that contribute to any given record.

 

data want;

   merge a (in=InA)

              b (in=InB)

              c (in= InC)

  ;

  /* the varaibles InA InB and InC have values of 1/0 depending on whether the associated dataset is contributing to result.

HOWEVER these are temporary variables. So you need to store them*/

   TableA=InA;

    TableB=InB;

    TableC=InC;

run;

 

 

santosh_pat69
Quartz | Level 8

Hi Super User,

 

 

Thank you !!!

 

I am trying to create a indicator on the dataset

if the value of the By variable is present on the input tables.

 

it would be something like this

if the variable A from tableA has the values of 100, and the Variable A from the tableB also has the value 100 but the table C has a different variable value 104.

i am trying to create the new variables A1 B1 and C1 based on the value of A fro tables A,B and C

the ouput after the merge should similar to

   tableA  |   tableB  |  tableC  | A1 |  B1  | C1|

A |100        |100          |104        |  1  |  1    |   0 |

 

Hope i am clear this time.

 

 

ballardw
Super User

Since you don't show anything with a BY statement that bit needs to be shown. Post your merge code.

 

If you are using A as a BY variable with those values then the result you show will not happen as a Variable A in table C with value 104 will not match 100 for Variable A in sets A or B. But the IN variables I mentioned would have the A1 B1 and C1 values you indicate as desired.

(And it may be a better idea to have VAR1 instead of A as there is potential for confusing your variable names with your data set names).

santosh_pat69
Quartz | Level 8

Hi Super User,

I have tried this and i am close to the desred output.

 

After sorting all the tables by ID, and making the required manipulations on the datasets.

 

 

data M4;*(drop=XXXX_ID);

merge Dept.A_TRN(in=a) Dept.B_TRN(in=b) Dept.c_TRN(in=c) f_TRN(IN=d) ;

By ID;

if a then

do;

 A_TRN='1';

End;

Else

do;

 A_TRN='0';

end;

if b then

do;

 B_TRN = '1';

End;

Else

Do;

 B_TRN = '0';

End;

if c then

do;

 C_TRN = '1';

End;

Else

Do;

 C_TRN = '0';

End;

if d then

do;

 F_TRN = '1';

End;

Else

Do;

 F_TRN = '0';

End;

Run;

 

ballardw
Super User

Is there some particular reason you want your indicator variables to be character? They are more cumbersome to work with.

 

You could use to make character values of '0' and '1':

A_TRN= put(a,f1.);

B_TRN= put(b,f1.);

C_TRN= put(c,f1.);

F_TRN= put(d,f1.); <sub-optimal choice using d, in=F would make much more sense

 

Instead of all the If-Then-Else statements

Pamela_JSRCC
Quartz | Level 8

Is this a fair representation of the sample data where the values for ID in any of the 3 tables are in the set {1,2,3,12,34}?

 

Data TableA;

Input ID A B C;

datalines;

1  1  2  3

2  4  5  6

3  7  8  9

;

Data TableB;

Input ID A B C;

datalines;

12  11 2  3

34   4  5  6

1   71  8  9

;

Data TableC;

Input ID A B C;

datalines;

1  10  2  3

2  41  5  6

3   7   8  9

;

 

If so, what do the values mean in the output:

 

ID tableA tableB tableC

1      0          1          0

12    1          0          1

3      1          1          1

 

Why are only IDs 1,12 and 3 in the output table?  Also, what does 0 mean in the tableA column where ID=1?

 

 

santosh_pat69
Quartz | Level 8
hi SuperUser,

Thank you for your comments, let me follow your suggestion, the indicator need not be a char it can be either of the num or char.

Will keep you posted on this
Thanks a million
santosh_pat69
Quartz | Level 8
Hi Jessica

Please ignore my previous comments.

The data was made up at the last minute. The 0 and 1 are used as an indicator 1 to state that the variable is present on tableA if it's not present on the table B or TableC then it should be 0 or vice versa.

I have already achevied this using if then else. Was looking to build this efficiently.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1504 views
  • 0 likes
  • 3 in conversation