Help using Base SAS procedures

Match table 1 and 2

Reply
Contributor
Posts: 46

Match table 1 and 2

Anyone can help me with this problem?

Table 1
IDProductColor_idColorPrice
111Camera1Black_E50
111Camera1Black_M50
111Camera1Black_T50
111Camera2Brown_A45
111Camera2Brown_B45
111Camera3Gray_C40
111Camera3Gray_K40
111Camera3Gray_V40
113Printer1Black_E70
113Printer1Black_M70
113Printer1Black_T70
113Printer2Brown_A65
113Printer2Brown_B65
113Printer3Gray_C75
113Printer3Gray_K75
113Printer3Gray_V75

Table2(Template)
Color_idColorPrice
1Black_E50
1Black_M50
2Brown_A45
2Brown_C30
3Gray_C40
3Gray_K40
3Gray_V40
1Black_E70
1Black_M70
1Black_T70
2Brown_A65
2Brown_B65
3Gray_C75
3Gray_K75
3Gray_V75
3Gray_Q80

i am expecting in my test using Table 1 and 2 to have this below :

Table Expected
IDProductColor_idColorPriceTest
111Camera1Black_E50valid
111Camera1Black_M50valid
111Camera1Black_T50Invalid
111Camera2Brown_C30missing
111Camera2Brown_A45valid
111Camera2Brown_B45Invalid
111Camera3Gray_C40valid
111Camera3Gray_K40valid
111Camera3Gray_V40valid
113Printer1Black_E70valid
113Printer1Black_M70valid
113Printer1Black_T70valid
113Printer2Brown_A65valid
113Printer2Brown_B65Valid
113Printer3Gray_C75valid
113Printer3Gray_K75valid
113Printer3Gray_V75valid
113Printer3Gray_Q80missing

Table 1 is what i usually receive from another department and it should follow Table 2 instruction. That is why i need a test on the Table 1.

I hope my problem is clear to understand.

Thank you in advance

Contributor
Posts: 46

Re: Match table 1 and 2

it is almost similar to my previous post "HOW TO MERGE TWO TABLES" but i am having difficulty to include a line of code in the provided code by Xia.

Super User
Posts: 11,343

Re: Match table 1 and 2

What makes a result of "invalid" or "missing"?

Contributor
Posts: 46

Re: Match table 1 and 2

'Invalid' if ID=111( Camera) has anything other than

color id      Color            Price

1                Black_E          50

1               Black_M           50

For the same ID I am looking for in the received report Table1

Color id           Color               Price

2                        Brown_C       30

2                        Brown_A        45

any other color in the received report Table1 is Invalid and Brown_C is not in, is considered 'missing'

the Table I am looking as a reference is the Table2, my report has to match it. I am having problem since the Product and their IDs are not in the reference table (Table2).

Is this help?

Super User
Posts: 10,023

Re: Match table 1 and 2

Not easy to catch you . really make me headache.



 
data Table1;
input ID     Product : $ 20.     Color_id     Color : $ 20.     Price ;
cards; 
111     Camera     1     Black_E     50
111     Camera     1     Black_M     50
111     Camera     1     Black_T     50
111     Camera     2     Brown_A     45
111     Camera     2     Brown_B     45
111     Camera     3     Gray_C     40
111     Camera     3     Gray_K     40
111     Camera     3     Gray_V     40
113     Printer     1     Black_E     70
113     Printer     1     Black_M     70
113     Printer     1     Black_T     70
113     Printer     2     Brown_A     65
113     Printer     2     Brown_B     65
113     Printer     3     Gray_C     75
113     Printer     3     Gray_K     75
113     Printer     3     Gray_V     75
;
run;



data Table2;
input Color_id     Color : $ 20.     Price ;
cards;
1     Black_E     50
1     Black_M     50
2     Brown_A     45
2     Brown_C     30
3     Gray_C     40
3     Gray_K     40
3     Gray_V     40
1     Black_E     70
1     Black_M     70
1     Black_T     70
2     Brown_A     65
2     Brown_B     65
3     Gray_C     75
3     Gray_K     75
3     Gray_V     75
3     Gray_Q     80
;
run;

data table1;
 set table1;
 by ID Product Color_id notsorted;
 if first.Color_id and Color_id=1 then n+1;
run;
data table2;
 set table2;
 by Color_id notsorted;
 if first.Color_id and Color_id=1 then n+1;
run;
proc sort data=table1;by  n  Color_id     Color;run;
proc sort data=table2;by  n  Color_id     Color;run;
data want;
 merge table1(in=ina) table2(in=inb rename=(price=_price));
 by n  Color_id     Color;
 length flag $ 20;
 if not ina then do;flag='missing';price=_price;end;
 if not inb     then flag='invalid';
 if ina and inb then do;
  if price=_price then flag='valid';
   else flag='invalid';
 end;
 drop n _price;
run;



Xia Keshan

Contributor
Posts: 46

Re: Match table 1 and 2

Xia

thank you for your reply. I am having some misalignement in the Expected table like this:

IDProductColor_idColorPriceflag
..1Black_E50missing
..1Black_M50missing
111Camera1Black_T50Invalid
111Camera2Brown_C30missing
111Camera2Brown_A45valid
111Camera2Brown_B45Invalid
111Camera3Gray_C40valid
111Camera3Gray_K40valid
111Camera3Gray_V40valid
111Camera1Black_E70valid
113Printer1Black_M70valid
113Printer1Black_T70valid
113Printer2Brown_A65valid
113Printer2Brown_B65Invalid
113Printer3Gray_C75valid
113Printer3Gray_K75valid
113Printer3Gray_V75valid
113Printer3Gray_Q80invalid
.
.
.

Any idea?

Super User
Posts: 10,023

Re: Match table 1 and 2

You have missing value for Color_id     Color, My code did not consider about it . Do you really need them ?

And What is not expected ?

Contributor
Posts: 46

Re: Match table 1 and 2

Xia

i am going to send you the original data to take a look. what i have in post, is just a sample to show the case. thank you

Contributor
Posts: 46

Re: Match table 1 and 2

Based on my original data, i have these fields:

Table1
IDVar1Var2_idVar2Var3_idVar3Var4_idVar4
111Prod 1
111Prod 1
111Prod 1
111Prod 1
111Prod 1
111Prod 1
111Prod 1
111Prod 1
111Prod 1
111Prod 1
111Prod 1
111Prod 1
112Prod 2
112Prod 2
112Prod 2
112Prod 2
112Prod 2
112Prod 2
112Prod 2
112Prod 2
112Prod 2
112Prod 2
112Prod 2
112Prod 2
112Prod 2
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
113Prod 3
.
.
.
.

Table2
Var2_idVar2Var3_idVar3Var4_idVar4
......
......
......
......
......
......
......
......
......
......
......

i would like to test for any ID and Var1

t1.var2_id=t2.var2_id  ,  test=valid

t1.var3_id=t2.var3_id, test=valid

t1.var3_id=t2.var3_id, test=valid

or

t1.var2_id^=t2.var2_id, test=invalid

t1.var3_id^=t2.var3_id, test=invalid

t1.var3_id^=t2.var3_id, test=invalid

sometimes for the same ID and Var1 you might not find  t2.Var2_id or t2.Var3_id or t2.Var4_id

in table1 (t1)  then test =missing

i hope this will help to get the picture of what i am facing. Exactly the same number of variables as my work data.

thanks

Super User
Posts: 10,023

Re: Match table 1 and 2

No . can't understand you .

IF

t1.var2_id=t2.var2_id  , 

OR t1.var3_id=t2.var3_id,

OR t1.var3_id=t2.var3_id,   THEN  test=valid   ?

By that logic why the number of obs in output is not the same as in Table 1 ?

Contributor
Posts: 46

Re: Match table 1 and 2

we have some report where it is not equal, that is the reason why i am doing this test to detect which Product does not comply so we can return it back to be rate correctly. we have found some instance where

Product1 , ID 111, has Var4 like this (in table1)

A

B

C

D

while it should be ( based on the Table2):

A

B

C

E

in this case i would say that in Var4 for this product

A is valid

B is valid

C is valid

D is invalid

E is missing

it is complex in term of how to translate it into coding, that is why i am struggling. What make it hard, is the list of the product is very huge, so i need a code to get it done.

Thank you for looking into that

Super User
Posts: 10,023

Re: Match table 1 and 2

Well. I think my code is on the right way .

If you think ID Product should not be missing value , that would be easy to fix .

Otherwise, post the sample data , and the wrong result you get , and point where is not right .

Xia Keshan

Contributor
Posts: 46

Re: Match table 1 and 2

i hpe this sample works because it has the same number of variables i am using:

Table1
IDVar1Var2_idVar2Var3_idVar3Var4_idVar4
111Prod 11Printing1Small_12duration
111Prod 11Printing1Small_223support
111Prod 11Printing1Small_322addlt time
111Prod 11Printing2Med_v12managers
111Prod 11Printing2Med_m3average # of
111Prod 11Printing2Med_k5properties
111Prod 11Printing3Larg_b115strong
111Prod 11Printing3larg_b220competitive
111Prod 11Printing3larg_b324occurrence
111Prod 11Printing4Huge_c111# of cyle
111Prod 11Printing4Huge_c213tag
111Prod 11Printing4Huge_c325Tools
112Prod 22Computing....
112Prod 22Computing....
112Prod 22Computing....
112Prod 22Computing....
112Prod 22Computing....
112Prod 22Computing....
112Prod 22Computing....
112Prod 22Computing....
112Prod 22Computing....
112Prod 22Computing...
112Prod 22Computing....
112Prod 22Computing...
112Prod 22Computing....
113Prod 34.....
113Prod 34....
113Prod 34......
113Prod 34.....
113Prod 34....
113Prod 34......
113Prod 34.....
113Prod 34.....
113Prod 34.....
113Prod 34.....
113Prod 34.....
113Prod 34.....
113Prod 34.....
113Prod 34.....
113Prod 34.....
113Prod 34......
.
.
.
.

Table2
Var2_idVar2Var3_idVar3Var4_idVar4
1Printing1Small_12duration
1Printing1Small_223support
1Printing1Small_322addlt time
1Printing2Med_v12managers
1Printing2Med_m3average # of
1Printing2Med_k5properties
1Printing3Larg_b115strong
1Printing3larg_b220competitive
1Printing3larg_b328exam
1Printing4Huge_c111# of cyle
1Printing4Huge_c213tag
1Printing4Huge_c325Tools

looking at these two tables,  i will say that i receive Table 1 and want to compare it with Table2

taking  Var4 from Table1

-11 obs match with t2, so "valid"

-Var4_id =24, Var4 =occurrence do not match so "invalid"

-Var_id =28 , Var4 =exam is not in t1 so it is "missing"

i know it is too many example of data. Please consider this one only, as it describes better what i am dealing with.

Super User
Posts: 10,023

Re: Match table 1 and 2

OK. Hope this time could work.

 
data Table1;
infile cards expandtabs truncover;
input ID     Var1 & $20.     Var2_id     Var2 : $20.     Var3_id       Var3 : $20.     Var4_id     Var4 & $20. ;
cards; 
111     Prod 1     1     Printing     1     Small_1     2     duration
111     Prod 1     1     Printing     1     Small_2     23     support
111     Prod 1     1     Printing     1     Small_3     22     addlt time
111     Prod 1     1     Printing     2     Med_v     12     managers
111     Prod 1     1     Printing     2     Med_m     3     average # of
111     Prod 1     1     Printing     2     Med_k     5     properties
111     Prod 1     1     Printing     3     Larg_b1     15     strong
111     Prod 1     1     Printing     3     larg_b2     20     competitive
111     Prod 1     1     Printing     3     larg_b3     24     occurrence
111     Prod 1     1     Printing     4     Huge_c1     11     # of cyle
111     Prod 1     1     Printing     4     Huge_c2     13     tag
111     Prod 1     1     Printing     4     Huge_c3     25     Tools
;
run;



data Table2;
infile cards expandtabs truncover;
input Var2_id     Var2 & $20.     Var3_id     Var3 : $20.     Var4_id       Var4 & $20.;
cards;
1     Printing     1     Small_1     2     duration
1     Printing     1     Small_2     23     support
1     Printing     1     Small_3     22     addlt time
1     Printing     2     Med_v     12     managers
1     Printing     2     Med_m     3     average # of
1     Printing     2     Med_k     5     properties
1     Printing     3     Larg_b1     15     strong
1     Printing     3     larg_b2     20     competitive
1     Printing     3     larg_b3     28     exam
1     Printing     4     Huge_c1     11     # of cyle
1     Printing     4     Huge_c2     13     tag
1     Printing     4     Huge_c3     25     Tools
;
run;


proc sort data=table1;by Var2_id     Var2     Var3_id       Var3      Var4_id     Var4;run;
proc sort data=table2;by Var2_id     Var2     Var3_id       Var3      Var4_id      Var4;run;
data want;
 merge table1(in=ina) table2(in=inb );
 by Var2_id     Var2 Var3_id       Var3      Var4_id     Var4;
 length flag $ 20;
 if not ina then flag='missing';
 if not inb then flag='invalid';
 if ina and inb then flag='valid';
run;


Xia Keshan

Ask a Question
Discussion stats
  • 13 replies
  • 415 views
  • 0 likes
  • 3 in conversation