Anyone can help me with this problem?
Table 1 | ||||
ID | Product | Color_id | Color | Price |
111 | Camera | 1 | Black_E | 50 |
111 | Camera | 1 | Black_M | 50 |
111 | Camera | 1 | Black_T | 50 |
111 | Camera | 2 | Brown_A | 45 |
111 | Camera | 2 | Brown_B | 45 |
111 | Camera | 3 | Gray_C | 40 |
111 | Camera | 3 | Gray_K | 40 |
111 | Camera | 3 | Gray_V | 40 |
113 | Printer | 1 | Black_E | 70 |
113 | Printer | 1 | Black_M | 70 |
113 | Printer | 1 | Black_T | 70 |
113 | Printer | 2 | Brown_A | 65 |
113 | Printer | 2 | Brown_B | 65 |
113 | Printer | 3 | Gray_C | 75 |
113 | Printer | 3 | Gray_K | 75 |
113 | Printer | 3 | Gray_V | 75 |
Table2 | (Template) | |
Color_id | Color | Price |
1 | Black_E | 50 |
1 | Black_M | 50 |
2 | Brown_A | 45 |
2 | Brown_C | 30 |
3 | Gray_C | 40 |
3 | Gray_K | 40 |
3 | Gray_V | 40 |
1 | Black_E | 70 |
1 | Black_M | 70 |
1 | Black_T | 70 |
2 | Brown_A | 65 |
2 | Brown_B | 65 |
3 | Gray_C | 75 |
3 | Gray_K | 75 |
3 | Gray_V | 75 |
3 | Gray_Q | 80 |
i am expecting in my test using Table 1 and 2 to have this below :
Table Expected | |||||
ID | Product | Color_id | Color | Price | Test |
111 | Camera | 1 | Black_E | 50 | valid |
111 | Camera | 1 | Black_M | 50 | valid |
111 | Camera | 1 | Black_T | 50 | Invalid |
111 | Camera | 2 | Brown_C | 30 | missing |
111 | Camera | 2 | Brown_A | 45 | valid |
111 | Camera | 2 | Brown_B | 45 | Invalid |
111 | Camera | 3 | Gray_C | 40 | valid |
111 | Camera | 3 | Gray_K | 40 | valid |
111 | Camera | 3 | Gray_V | 40 | valid |
113 | Printer | 1 | Black_E | 70 | valid |
113 | Printer | 1 | Black_M | 70 | valid |
113 | Printer | 1 | Black_T | 70 | valid |
113 | Printer | 2 | Brown_A | 65 | valid |
113 | Printer | 2 | Brown_B | 65 | Valid |
113 | Printer | 3 | Gray_C | 75 | valid |
113 | Printer | 3 | Gray_K | 75 | valid |
113 | Printer | 3 | Gray_V | 75 | valid |
113 | Printer | 3 | Gray_Q | 80 | missing |
Table 1 is what i usually receive from another department and it should follow Table 2 instruction. That is why i need a test on the Table 1.
I hope my problem is clear to understand.
Thank you in advance
it is almost similar to my previous post "HOW TO MERGE TWO TABLES" but i am having difficulty to include a line of code in the provided code by Xia.
What makes a result of "invalid" or "missing"?
'Invalid' if ID=111( Camera) has anything other than
color id Color Price
1 Black_E 50
1 Black_M 50
For the same ID I am looking for in the received report Table1
Color id Color Price
2 Brown_C 30
2 Brown_A 45
any other color in the received report Table1 is Invalid and Brown_C is not in, is considered 'missing'
the Table I am looking as a reference is the Table2, my report has to match it. I am having problem since the Product and their IDs are not in the reference table (Table2).
Is this help?
Not easy to catch you . really make me headache.
data Table1; input ID Product : $ 20. Color_id Color : $ 20. Price ; cards; 111 Camera 1 Black_E 50 111 Camera 1 Black_M 50 111 Camera 1 Black_T 50 111 Camera 2 Brown_A 45 111 Camera 2 Brown_B 45 111 Camera 3 Gray_C 40 111 Camera 3 Gray_K 40 111 Camera 3 Gray_V 40 113 Printer 1 Black_E 70 113 Printer 1 Black_M 70 113 Printer 1 Black_T 70 113 Printer 2 Brown_A 65 113 Printer 2 Brown_B 65 113 Printer 3 Gray_C 75 113 Printer 3 Gray_K 75 113 Printer 3 Gray_V 75 ; run; data Table2; input Color_id Color : $ 20. Price ; cards; 1 Black_E 50 1 Black_M 50 2 Brown_A 45 2 Brown_C 30 3 Gray_C 40 3 Gray_K 40 3 Gray_V 40 1 Black_E 70 1 Black_M 70 1 Black_T 70 2 Brown_A 65 2 Brown_B 65 3 Gray_C 75 3 Gray_K 75 3 Gray_V 75 3 Gray_Q 80 ; run; data table1; set table1; by ID Product Color_id notsorted; if first.Color_id and Color_id=1 then n+1; run; data table2; set table2; by Color_id notsorted; if first.Color_id and Color_id=1 then n+1; run; proc sort data=table1;by n Color_id Color;run; proc sort data=table2;by n Color_id Color;run; data want; merge table1(in=ina) table2(in=inb rename=(price=_price)); by n Color_id Color; length flag $ 20; if not ina then do;flag='missing';price=_price;end; if not inb then flag='invalid'; if ina and inb then do; if price=_price then flag='valid'; else flag='invalid'; end; drop n _price; run;
Xia Keshan
Xia
thank you for your reply. I am having some misalignement in the Expected table like this:
ID | Product | Color_id | Color | Price | flag |
. | . | 1 | Black_E | 50 | missing |
. | . | 1 | Black_M | 50 | missing |
111 | Camera | 1 | Black_T | 50 | Invalid |
111 | Camera | 2 | Brown_C | 30 | missing |
111 | Camera | 2 | Brown_A | 45 | valid |
111 | Camera | 2 | Brown_B | 45 | Invalid |
111 | Camera | 3 | Gray_C | 40 | valid |
111 | Camera | 3 | Gray_K | 40 | valid |
111 | Camera | 3 | Gray_V | 40 | valid |
111 | Camera | 1 | Black_E | 70 | valid |
113 | Printer | 1 | Black_M | 70 | valid |
113 | Printer | 1 | Black_T | 70 | valid |
113 | Printer | 2 | Brown_A | 65 | valid |
113 | Printer | 2 | Brown_B | 65 | Invalid |
113 | Printer | 3 | Gray_C | 75 | valid |
113 | Printer | 3 | Gray_K | 75 | valid |
113 | Printer | 3 | Gray_V | 75 | valid |
113 | Printer | 3 | Gray_Q | 80 | invalid |
. | |||||
. | |||||
. |
Any idea?
You have missing value for Color_id Color, My code did not consider about it . Do you really need them ?
And What is not expected ?
Xia
i am going to send you the original data to take a look. what i have in post, is just a sample to show the case. thank you
Based on my original data, i have these fields:
Table1 | |||||||
ID | Var1 | Var2_id | Var2 | Var3_id | Var3 | Var4_id | Var4 |
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
. | |||||||
. | |||||||
. | |||||||
. |
Table2 | |||||
Var2_id | Var2 | Var3_id | Var3 | Var4_id | Var4 |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
i would like to test for any ID and Var1
t1.var2_id=t2.var2_id , test=valid
t1.var3_id=t2.var3_id, test=valid
t1.var3_id=t2.var3_id, test=valid
or
t1.var2_id^=t2.var2_id, test=invalid
t1.var3_id^=t2.var3_id, test=invalid
t1.var3_id^=t2.var3_id, test=invalid
sometimes for the same ID and Var1 you might not find t2.Var2_id or t2.Var3_id or t2.Var4_id
in table1 (t1) then test =missing
i hope this will help to get the picture of what i am facing. Exactly the same number of variables as my work data.
thanks
No . can't understand you .
IF
t1.var2_id=t2.var2_id ,
OR t1.var3_id=t2.var3_id,
OR t1.var3_id=t2.var3_id, THEN test=valid ?
By that logic why the number of obs in output is not the same as in Table 1 ?
we have some report where it is not equal, that is the reason why i am doing this test to detect which Product does not comply so we can return it back to be rate correctly. we have found some instance where
Product1 , ID 111, has Var4 like this (in table1)
A
B
C
D
while it should be ( based on the Table2):
A
B
C
E
in this case i would say that in Var4 for this product
A is valid
B is valid
C is valid
D is invalid
E is missing
it is complex in term of how to translate it into coding, that is why i am struggling. What make it hard, is the list of the product is very huge, so i need a code to get it done.
Thank you for looking into that
Well. I think my code is on the right way .
If you think ID Product should not be missing value , that would be easy to fix .
Otherwise, post the sample data , and the wrong result you get , and point where is not right .
Xia Keshan
i hpe this sample works because it has the same number of variables i am using:
Table1 | |||||||
ID | Var1 | Var2_id | Var2 | Var3_id | Var3 | Var4_id | Var4 |
111 | Prod 1 | 1 | Printing | 1 | Small_1 | 2 | duration |
111 | Prod 1 | 1 | Printing | 1 | Small_2 | 23 | support |
111 | Prod 1 | 1 | Printing | 1 | Small_3 | 22 | addlt time |
111 | Prod 1 | 1 | Printing | 2 | Med_v | 12 | managers |
111 | Prod 1 | 1 | Printing | 2 | Med_m | 3 | average # of |
111 | Prod 1 | 1 | Printing | 2 | Med_k | 5 | properties |
111 | Prod 1 | 1 | Printing | 3 | Larg_b1 | 15 | strong |
111 | Prod 1 | 1 | Printing | 3 | larg_b2 | 20 | competitive |
111 | Prod 1 | 1 | Printing | 3 | larg_b3 | 24 | occurrence |
111 | Prod 1 | 1 | Printing | 4 | Huge_c1 | 11 | # of cyle |
111 | Prod 1 | 1 | Printing | 4 | Huge_c2 | 13 | tag |
111 | Prod 1 | 1 | Printing | 4 | Huge_c3 | 25 | Tools |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | |
112 | Prod 2 | 2 | Computing | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | |
113 | Prod 3 | 4 | . | . | . | .. | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | |
113 | Prod 3 | 4 | . | . | . | . | .. |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | .. |
. | |||||||
. | |||||||
. | |||||||
. |
Table2 | |||||
Var2_id | Var2 | Var3_id | Var3 | Var4_id | Var4 |
1 | Printing | 1 | Small_1 | 2 | duration |
1 | Printing | 1 | Small_2 | 23 | support |
1 | Printing | 1 | Small_3 | 22 | addlt time |
1 | Printing | 2 | Med_v | 12 | managers |
1 | Printing | 2 | Med_m | 3 | average # of |
1 | Printing | 2 | Med_k | 5 | properties |
1 | Printing | 3 | Larg_b1 | 15 | strong |
1 | Printing | 3 | larg_b2 | 20 | competitive |
1 | Printing | 3 | larg_b3 | 28 | exam |
1 | Printing | 4 | Huge_c1 | 11 | # of cyle |
1 | Printing | 4 | Huge_c2 | 13 | tag |
1 | Printing | 4 | Huge_c3 | 25 | Tools |
looking at these two tables, i will say that i receive Table 1 and want to compare it with Table2
taking Var4 from Table1
-11 obs match with t2, so "valid"
-Var4_id =24, Var4 =occurrence do not match so "invalid"
-Var_id =28 , Var4 =exam is not in t1 so it is "missing"
i know it is too many example of data. Please consider this one only, as it describes better what i am dealing with.
OK. Hope this time could work.
data Table1; infile cards expandtabs truncover; input ID Var1 & $20. Var2_id Var2 : $20. Var3_id Var3 : $20. Var4_id Var4 & $20. ; cards; 111 Prod 1 1 Printing 1 Small_1 2 duration 111 Prod 1 1 Printing 1 Small_2 23 support 111 Prod 1 1 Printing 1 Small_3 22 addlt time 111 Prod 1 1 Printing 2 Med_v 12 managers 111 Prod 1 1 Printing 2 Med_m 3 average # of 111 Prod 1 1 Printing 2 Med_k 5 properties 111 Prod 1 1 Printing 3 Larg_b1 15 strong 111 Prod 1 1 Printing 3 larg_b2 20 competitive 111 Prod 1 1 Printing 3 larg_b3 24 occurrence 111 Prod 1 1 Printing 4 Huge_c1 11 # of cyle 111 Prod 1 1 Printing 4 Huge_c2 13 tag 111 Prod 1 1 Printing 4 Huge_c3 25 Tools ; run; data Table2; infile cards expandtabs truncover; input Var2_id Var2 & $20. Var3_id Var3 : $20. Var4_id Var4 & $20.; cards; 1 Printing 1 Small_1 2 duration 1 Printing 1 Small_2 23 support 1 Printing 1 Small_3 22 addlt time 1 Printing 2 Med_v 12 managers 1 Printing 2 Med_m 3 average # of 1 Printing 2 Med_k 5 properties 1 Printing 3 Larg_b1 15 strong 1 Printing 3 larg_b2 20 competitive 1 Printing 3 larg_b3 28 exam 1 Printing 4 Huge_c1 11 # of cyle 1 Printing 4 Huge_c2 13 tag 1 Printing 4 Huge_c3 25 Tools ; run; proc sort data=table1;by Var2_id Var2 Var3_id Var3 Var4_id Var4;run; proc sort data=table2;by Var2_id Var2 Var3_id Var3 Var4_id Var4;run; data want; merge table1(in=ina) table2(in=inb ); by Var2_id Var2 Var3_id Var3 Var4_id Var4; length flag $ 20; if not ina then flag='missing'; if not inb then flag='invalid'; if ina and inb then flag='valid'; run;
Xia Keshan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.