Hello
i need help to merge or test one table.Here is what I am trying to do:
Table1 is the report i have periodically and i want to make sure that all the Variables are line up according to the report guideline.
Table2 has what the variables should include for every single product (report guideline).
| Table 1 | ||||
| ID | Product | Color_id | Color | Price | 
| 111 | Camera | 1 | Black_E | 50 | 
| 111 | Camera | 1 | Black_M | 50 | 
| 111 | Camera | 1 | Black_T | 50 | 
| 111 | Camera | 2 | Brown_A | 45 | 
| 111 | Camera | 2 | Brown_B | 45 | 
| 111 | Camera | 3 | Gray_C | 40 | 
| 111 | Camera | 3 | Gray_K | 40 | 
| 111 | Camera | 3 | Gray_V | 40 | 
| 113 | Printer | 1 | Black_E | 70 | 
| 113 | Printer | 1 | Black_M | 70 | 
| 113 | Printer | 1 | Black_T | 70 | 
| 113 | Printer | 2 | Brown_A | 65 | 
| 113 | Printer | 2 | Brown_B | 65 | 
| 113 | Printer | 3 | Gray_C | 75 | 
| 113 | Printer | 3 | Gray_K | 75 | 
| 113 | Printer | 3 | Gray_V | 75 | 
| Table2 | ||||
| Color_id | Color | Price | ||
| 1 | Black_E | 50 | ||
| 1 | Black_M | 50 | ||
| 2 | Brown_A | 45 | ||
| 3 | Gray_C | 40 | ||
| 3 | Gray_K | 40 | ||
| 3 | Gray_V | 40 | ||
| 1 | Black_E | 70 | ||
| 1 | Black_M | 70 | ||
| 1 | Black_T | 70 | ||
| 2 | Brown_A | 65 | ||
| 2 | Brown_B | 65 | ||
| 3 | Gray_C | 75 | ||
| 3 | Gray_K | 75 | ||
| 3 | Gray_V | 75 | 
So i need to get a report of the Products not in the line with the guideline. Below is what I am looking for:
| Table 1 | |||||
| ID | Product | Color_id | Color | Price | Test | 
| 111 | Camera | 1 | Black_E | 50 | valid | 
| 111 | Camera | 1 | Black_M | 50 | valid | 
| 111 | Camera | 1 | Black_T | 50 | invalid | 
| 111 | Camera | 2 | Brown_A | 45 | valid | 
| 111 | Camera | 2 | Brown_B | 45 | invalid | 
| 111 | Camera | 3 | Gray_C | 40 | valid | 
| 111 | Camera | 3 | Gray_K | 40 | valid | 
| 111 | Camera | 3 | Gray_V | 40 | valid | 
| 113 | Printer | 1 | Black_E | 70 | valid | 
| 113 | Printer | 1 | Black_M | 70 | valid | 
| 113 | Printer | 1 | Black_T | 70 | invalid | 
| 113 | Printer | 2 | Brown_A | 65 | valid | 
| 113 | Printer | 2 | Brown_B | 65 | invalid | 
| 113 | Printer | 3 | Gray_C | 75 | valid | 
| 113 | Printer | 3 | Gray_K | 75 | valid | 
| 113 | Printer | 3 | Gray_V | 75 | valid | 
I used Proc sql; or
Data Merge_Table1_table2;
merge table1 (in=a) table2 (in=b);
by Color_id Color;
if b;
run;
Any idea to proceed with this type of report?
Thank you in advance
So you only care the firstly appeared 'Color_id' 'Color' in Table2 and not care 'Price' ?
data table1;
input ID     Product $     Color_id     Color $     Price ;
cards;
111     Camera     1     Black_E     50
111     Camera     1     Black_M     50
111     Camera     1     Black_T     50
111     Camera     2     Brown_A     45
111     Camera     2     Brown_B     45
111     Camera     3     Gray_C     40
111     Camera     3     Gray_K     40
111     Camera     3     Gray_V     40
113     Printer     1     Black_E     70
113     Printer     1     Black_M     70
113     Printer     1     Black_T     70
113     Printer     2     Brown_A     65
113     Printer     2     Brown_B     65
113     Printer     3     Gray_C     75
113     Printer     3     Gray_K     75
113     Printer     3     Gray_V     75
;
run;
data table2;
input      Color_id     Color $     Price ;
cards;
1     Black_E     50
1     Black_M     50
2     Brown_A     45
3     Gray_C     40
3     Gray_K     40
3     Gray_V     40
1     Black_E     70
1     Black_M     70
1     Black_T     70
2     Brown_A     65
2     Brown_B     65
3     Gray_C     75
3     Gray_K     75
3     Gray_V     75
;
run;
data key;
 set table2;
 by Color_id notsorted;
 retain first 0;
 array x{10000} _temporary_;
 if first.Color_id then do;
   if Color_id not in x then do;n+1;x{n}=Color_id;first=1;end;
    else first=0;
 end;
 if first;
run;
data want;
 if _n_ eq 1 then do;
  declare hash h(dataset:'key');
  h.definekey('Color_id','Color');
  h.definedone();
 end;
set table1;
length test $ 20;
if h.check()=0 then test='valid';
 else test='invalid';
run;
Xia Keshan
Which do you want to report as "invalid": Color in Table 1 doesn't match Color in Table2 for the color_id, Price in Table 1 doesn't match Price in Table2 for the color_id, both don't match, or any of the two don't match?
From my original post above I stated this:
Table1 is the report i have periodically and i want to make sure that all the Variables are line up according to the report guideline.
Table2 has what the variables should include for every single product (report guideline)
So when you look at both tables you will see some relations. The 3rd table Table1 is the expected result.
hope this help.
| 13 | Printer | 1 | Black_M | 70 | valid | 
| 113 | Printer | 1 | Black_T | 70 | invalid | 
| 113 | Printer | 2 | Brown_A | 65 | valid | 
| 113 | Printer | 2 | Brown_B | 65 | invalid | 
Why would be invalid, since they are included in Table2 ?
'Invalid' because when we receive the report (Table1) , it has to follow Table2 parrain. So Black_T should not be there, so on in the table. I am trying to create a code to give the result looking like the third Table which i named Table1 again. I should name it Test_table something lke that.
So you only care the firstly appeared 'Color_id' 'Color' in Table2 and not care 'Price' ?
data table1;
input ID     Product $     Color_id     Color $     Price ;
cards;
111     Camera     1     Black_E     50
111     Camera     1     Black_M     50
111     Camera     1     Black_T     50
111     Camera     2     Brown_A     45
111     Camera     2     Brown_B     45
111     Camera     3     Gray_C     40
111     Camera     3     Gray_K     40
111     Camera     3     Gray_V     40
113     Printer     1     Black_E     70
113     Printer     1     Black_M     70
113     Printer     1     Black_T     70
113     Printer     2     Brown_A     65
113     Printer     2     Brown_B     65
113     Printer     3     Gray_C     75
113     Printer     3     Gray_K     75
113     Printer     3     Gray_V     75
;
run;
data table2;
input      Color_id     Color $     Price ;
cards;
1     Black_E     50
1     Black_M     50
2     Brown_A     45
3     Gray_C     40
3     Gray_K     40
3     Gray_V     40
1     Black_E     70
1     Black_M     70
1     Black_T     70
2     Brown_A     65
2     Brown_B     65
3     Gray_C     75
3     Gray_K     75
3     Gray_V     75
;
run;
data key;
 set table2;
 by Color_id notsorted;
 retain first 0;
 array x{10000} _temporary_;
 if first.Color_id then do;
   if Color_id not in x then do;n+1;x{n}=Color_id;first=1;end;
    else first=0;
 end;
 if first;
run;
data want;
 if _n_ eq 1 then do;
  declare hash h(dataset:'key');
  h.definekey('Color_id','Color');
  h.definedone();
 end;
set table1;
length test $ 20;
if h.check()=0 then test='valid';
 else test='invalid';
run;
Xia Keshan
hi Xia
below is the error i am getting when i run the code:
349 data Result;
350 if _n_eq 1 then do;
-
388
200
ERROR 388-185: Expecting an arithmetic operator.
ERROR 200-322: The symbol is not recognized and will be ignored.
351 declare hash h(dataset:'key');
ERROR: DATA STEP Component Object failure. Aborted during the COMPILATION phase
Any idea?
Hi,
Put a space before the operator eq e.g:
if _n_ eq 1 then do;
^
Thank you a lot Xia, it works perfectly. You made my day!
That is weird . I can run this code successfully . Did you change my code ?
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
