Anyone can help me with this problem?
Table 1 | ||||
ID | Product | Color_id | Color | Price |
111 | Camera | 1 | Black_E | 50 |
111 | Camera | 1 | Black_M | 50 |
111 | Camera | 1 | Black_T | 50 |
111 | Camera | 2 | Brown_A | 45 |
111 | Camera | 2 | Brown_B | 45 |
111 | Camera | 3 | Gray_C | 40 |
111 | Camera | 3 | Gray_K | 40 |
111 | Camera | 3 | Gray_V | 40 |
113 | Printer | 1 | Black_E | 70 |
113 | Printer | 1 | Black_M | 70 |
113 | Printer | 1 | Black_T | 70 |
113 | Printer | 2 | Brown_A | 65 |
113 | Printer | 2 | Brown_B | 65 |
113 | Printer | 3 | Gray_C | 75 |
113 | Printer | 3 | Gray_K | 75 |
113 | Printer | 3 | Gray_V | 75 |
Table2 | (Template) | |
Color_id | Color | Price |
1 | Black_E | 50 |
1 | Black_M | 50 |
2 | Brown_A | 45 |
2 | Brown_C | 30 |
3 | Gray_C | 40 |
3 | Gray_K | 40 |
3 | Gray_V | 40 |
1 | Black_E | 70 |
1 | Black_M | 70 |
1 | Black_T | 70 |
2 | Brown_A | 65 |
2 | Brown_B | 65 |
3 | Gray_C | 75 |
3 | Gray_K | 75 |
3 | Gray_V | 75 |
3 | Gray_Q | 80 |
i am expecting in my test using Table 1 and 2 to have this below :
Table Expected | |||||
ID | Product | Color_id | Color | Price | Test |
111 | Camera | 1 | Black_E | 50 | valid |
111 | Camera | 1 | Black_M | 50 | valid |
111 | Camera | 1 | Black_T | 50 | Invalid |
111 | Camera | 2 | Brown_C | 30 | missing |
111 | Camera | 2 | Brown_A | 45 | valid |
111 | Camera | 2 | Brown_B | 45 | Invalid |
111 | Camera | 3 | Gray_C | 40 | valid |
111 | Camera | 3 | Gray_K | 40 | valid |
111 | Camera | 3 | Gray_V | 40 | valid |
113 | Printer | 1 | Black_E | 70 | valid |
113 | Printer | 1 | Black_M | 70 | valid |
113 | Printer | 1 | Black_T | 70 | valid |
113 | Printer | 2 | Brown_A | 65 | valid |
113 | Printer | 2 | Brown_B | 65 | Valid |
113 | Printer | 3 | Gray_C | 75 | valid |
113 | Printer | 3 | Gray_K | 75 | valid |
113 | Printer | 3 | Gray_V | 75 | valid |
113 | Printer | 3 | Gray_Q | 80 | missing |
Table 1 is what i usually receive from another department and it should follow Table 2 instruction. That is why i need a test on the Table 1.
I hope my problem is clear to understand.
Thank you in advance
it is almost similar to my previous post "HOW TO MERGE TWO TABLES" but i am having difficulty to include a line of code in the provided code by Xia.
What makes a result of "invalid" or "missing"?
'Invalid' if ID=111( Camera) has anything other than
color id Color Price
1 Black_E 50
1 Black_M 50
For the same ID I am looking for in the received report Table1
Color id Color Price
2 Brown_C 30
2 Brown_A 45
any other color in the received report Table1 is Invalid and Brown_C is not in, is considered 'missing'
the Table I am looking as a reference is the Table2, my report has to match it. I am having problem since the Product and their IDs are not in the reference table (Table2).
Is this help?
Not easy to catch you . really make me headache.
data Table1; input ID Product : $ 20. Color_id Color : $ 20. Price ; cards; 111 Camera 1 Black_E 50 111 Camera 1 Black_M 50 111 Camera 1 Black_T 50 111 Camera 2 Brown_A 45 111 Camera 2 Brown_B 45 111 Camera 3 Gray_C 40 111 Camera 3 Gray_K 40 111 Camera 3 Gray_V 40 113 Printer 1 Black_E 70 113 Printer 1 Black_M 70 113 Printer 1 Black_T 70 113 Printer 2 Brown_A 65 113 Printer 2 Brown_B 65 113 Printer 3 Gray_C 75 113 Printer 3 Gray_K 75 113 Printer 3 Gray_V 75 ; run; data Table2; input Color_id Color : $ 20. Price ; cards; 1 Black_E 50 1 Black_M 50 2 Brown_A 45 2 Brown_C 30 3 Gray_C 40 3 Gray_K 40 3 Gray_V 40 1 Black_E 70 1 Black_M 70 1 Black_T 70 2 Brown_A 65 2 Brown_B 65 3 Gray_C 75 3 Gray_K 75 3 Gray_V 75 3 Gray_Q 80 ; run; data table1; set table1; by ID Product Color_id notsorted; if first.Color_id and Color_id=1 then n+1; run; data table2; set table2; by Color_id notsorted; if first.Color_id and Color_id=1 then n+1; run; proc sort data=table1;by n Color_id Color;run; proc sort data=table2;by n Color_id Color;run; data want; merge table1(in=ina) table2(in=inb rename=(price=_price)); by n Color_id Color; length flag $ 20; if not ina then do;flag='missing';price=_price;end; if not inb then flag='invalid'; if ina and inb then do; if price=_price then flag='valid'; else flag='invalid'; end; drop n _price; run;
Xia Keshan
Xia
thank you for your reply. I am having some misalignement in the Expected table like this:
ID | Product | Color_id | Color | Price | flag |
. | . | 1 | Black_E | 50 | missing |
. | . | 1 | Black_M | 50 | missing |
111 | Camera | 1 | Black_T | 50 | Invalid |
111 | Camera | 2 | Brown_C | 30 | missing |
111 | Camera | 2 | Brown_A | 45 | valid |
111 | Camera | 2 | Brown_B | 45 | Invalid |
111 | Camera | 3 | Gray_C | 40 | valid |
111 | Camera | 3 | Gray_K | 40 | valid |
111 | Camera | 3 | Gray_V | 40 | valid |
111 | Camera | 1 | Black_E | 70 | valid |
113 | Printer | 1 | Black_M | 70 | valid |
113 | Printer | 1 | Black_T | 70 | valid |
113 | Printer | 2 | Brown_A | 65 | valid |
113 | Printer | 2 | Brown_B | 65 | Invalid |
113 | Printer | 3 | Gray_C | 75 | valid |
113 | Printer | 3 | Gray_K | 75 | valid |
113 | Printer | 3 | Gray_V | 75 | valid |
113 | Printer | 3 | Gray_Q | 80 | invalid |
. | |||||
. | |||||
. |
Any idea?
You have missing value for Color_id Color, My code did not consider about it . Do you really need them ?
And What is not expected ?
Xia
i am going to send you the original data to take a look. what i have in post, is just a sample to show the case. thank you
Based on my original data, i have these fields:
Table1 | |||||||
ID | Var1 | Var2_id | Var2 | Var3_id | Var3 | Var4_id | Var4 |
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
111 | Prod 1 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
112 | Prod 2 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
113 | Prod 3 | ||||||
. | |||||||
. | |||||||
. | |||||||
. |
Table2 | |||||
Var2_id | Var2 | Var3_id | Var3 | Var4_id | Var4 |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
. | . | . | . | . | . |
i would like to test for any ID and Var1
t1.var2_id=t2.var2_id , test=valid
t1.var3_id=t2.var3_id, test=valid
t1.var3_id=t2.var3_id, test=valid
or
t1.var2_id^=t2.var2_id, test=invalid
t1.var3_id^=t2.var3_id, test=invalid
t1.var3_id^=t2.var3_id, test=invalid
sometimes for the same ID and Var1 you might not find t2.Var2_id or t2.Var3_id or t2.Var4_id
in table1 (t1) then test =missing
i hope this will help to get the picture of what i am facing. Exactly the same number of variables as my work data.
thanks
No . can't understand you .
IF
t1.var2_id=t2.var2_id ,
OR t1.var3_id=t2.var3_id,
OR t1.var3_id=t2.var3_id, THEN test=valid ?
By that logic why the number of obs in output is not the same as in Table 1 ?
we have some report where it is not equal, that is the reason why i am doing this test to detect which Product does not comply so we can return it back to be rate correctly. we have found some instance where
Product1 , ID 111, has Var4 like this (in table1)
A
B
C
D
while it should be ( based on the Table2):
A
B
C
E
in this case i would say that in Var4 for this product
A is valid
B is valid
C is valid
D is invalid
E is missing
it is complex in term of how to translate it into coding, that is why i am struggling. What make it hard, is the list of the product is very huge, so i need a code to get it done.
Thank you for looking into that
Well. I think my code is on the right way .
If you think ID Product should not be missing value , that would be easy to fix .
Otherwise, post the sample data , and the wrong result you get , and point where is not right .
Xia Keshan
i hpe this sample works because it has the same number of variables i am using:
Table1 | |||||||
ID | Var1 | Var2_id | Var2 | Var3_id | Var3 | Var4_id | Var4 |
111 | Prod 1 | 1 | Printing | 1 | Small_1 | 2 | duration |
111 | Prod 1 | 1 | Printing | 1 | Small_2 | 23 | support |
111 | Prod 1 | 1 | Printing | 1 | Small_3 | 22 | addlt time |
111 | Prod 1 | 1 | Printing | 2 | Med_v | 12 | managers |
111 | Prod 1 | 1 | Printing | 2 | Med_m | 3 | average # of |
111 | Prod 1 | 1 | Printing | 2 | Med_k | 5 | properties |
111 | Prod 1 | 1 | Printing | 3 | Larg_b1 | 15 | strong |
111 | Prod 1 | 1 | Printing | 3 | larg_b2 | 20 | competitive |
111 | Prod 1 | 1 | Printing | 3 | larg_b3 | 24 | occurrence |
111 | Prod 1 | 1 | Printing | 4 | Huge_c1 | 11 | # of cyle |
111 | Prod 1 | 1 | Printing | 4 | Huge_c2 | 13 | tag |
111 | Prod 1 | 1 | Printing | 4 | Huge_c3 | 25 | Tools |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | |
112 | Prod 2 | 2 | Computing | . | . | . | . |
112 | Prod 2 | 2 | Computing | . | . | . | |
112 | Prod 2 | 2 | Computing | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | |
113 | Prod 3 | 4 | . | . | . | .. | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | |
113 | Prod 3 | 4 | . | . | . | . | .. |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | . |
113 | Prod 3 | 4 | . | . | . | . | .. |
. | |||||||
. | |||||||
. | |||||||
. |
Table2 | |||||
Var2_id | Var2 | Var3_id | Var3 | Var4_id | Var4 |
1 | Printing | 1 | Small_1 | 2 | duration |
1 | Printing | 1 | Small_2 | 23 | support |
1 | Printing | 1 | Small_3 | 22 | addlt time |
1 | Printing | 2 | Med_v | 12 | managers |
1 | Printing | 2 | Med_m | 3 | average # of |
1 | Printing | 2 | Med_k | 5 | properties |
1 | Printing | 3 | Larg_b1 | 15 | strong |
1 | Printing | 3 | larg_b2 | 20 | competitive |
1 | Printing | 3 | larg_b3 | 28 | exam |
1 | Printing | 4 | Huge_c1 | 11 | # of cyle |
1 | Printing | 4 | Huge_c2 | 13 | tag |
1 | Printing | 4 | Huge_c3 | 25 | Tools |
looking at these two tables, i will say that i receive Table 1 and want to compare it with Table2
taking Var4 from Table1
-11 obs match with t2, so "valid"
-Var4_id =24, Var4 =occurrence do not match so "invalid"
-Var_id =28 , Var4 =exam is not in t1 so it is "missing"
i know it is too many example of data. Please consider this one only, as it describes better what i am dealing with.
OK. Hope this time could work.
data Table1; infile cards expandtabs truncover; input ID Var1 & $20. Var2_id Var2 : $20. Var3_id Var3 : $20. Var4_id Var4 & $20. ; cards; 111 Prod 1 1 Printing 1 Small_1 2 duration 111 Prod 1 1 Printing 1 Small_2 23 support 111 Prod 1 1 Printing 1 Small_3 22 addlt time 111 Prod 1 1 Printing 2 Med_v 12 managers 111 Prod 1 1 Printing 2 Med_m 3 average # of 111 Prod 1 1 Printing 2 Med_k 5 properties 111 Prod 1 1 Printing 3 Larg_b1 15 strong 111 Prod 1 1 Printing 3 larg_b2 20 competitive 111 Prod 1 1 Printing 3 larg_b3 24 occurrence 111 Prod 1 1 Printing 4 Huge_c1 11 # of cyle 111 Prod 1 1 Printing 4 Huge_c2 13 tag 111 Prod 1 1 Printing 4 Huge_c3 25 Tools ; run; data Table2; infile cards expandtabs truncover; input Var2_id Var2 & $20. Var3_id Var3 : $20. Var4_id Var4 & $20.; cards; 1 Printing 1 Small_1 2 duration 1 Printing 1 Small_2 23 support 1 Printing 1 Small_3 22 addlt time 1 Printing 2 Med_v 12 managers 1 Printing 2 Med_m 3 average # of 1 Printing 2 Med_k 5 properties 1 Printing 3 Larg_b1 15 strong 1 Printing 3 larg_b2 20 competitive 1 Printing 3 larg_b3 28 exam 1 Printing 4 Huge_c1 11 # of cyle 1 Printing 4 Huge_c2 13 tag 1 Printing 4 Huge_c3 25 Tools ; run; proc sort data=table1;by Var2_id Var2 Var3_id Var3 Var4_id Var4;run; proc sort data=table2;by Var2_id Var2 Var3_id Var3 Var4_id Var4;run; data want; merge table1(in=ina) table2(in=inb ); by Var2_id Var2 Var3_id Var3 Var4_id Var4; length flag $ 20; if not ina then flag='missing'; if not inb then flag='invalid'; if ina and inb then flag='valid'; run;
Xia Keshan
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.