Hello:
I have two dataset, test1 and test2. I would like to find the match based on the following situation:
(Test1 "Analysis" = Test2 "Analysis") and [ Test1 "Product" = any of the Test2 ("Class" / "Component" / "Product") ]
I have the inner join code below, I found out that the "Cleft Lip +/- Palate" and "ATENOLOL" are shown twice in the result. Any idea why? How to fix the problem? Thanks.
data test1;
infile datalines dlm='|';
input name : $20. Analysis : $100. Product : $100.;
datalines;
Ali| Cleft Lip +/- Palate| LABETOLOL|
Ali| Cleft Lip +/- Palate| ATENOLOL|
Ali| VSDs (Excludes VSDavc)| ATENOLOL|
Ali| VSDs (Excludes VSDavc)| FLUCONAZOLE|
Carol| Choanal Atresia| LEVOTHYROXINE SODIUM|
Carol| Cleft Lip +/- Palate| ATENOLOL|
Carol| Intestinal Atresia/Stenosis| EPHEDRA|
Malyn| TGA +/- DORV| ALBUTEROL SULFATE|
Malyn| TOF +/- PA, DORV -TGA| BUTALBITAL|
Malyn| Esoph Atresia +/- TEF| FLONASE|
;
proc sort data=test1; by analysis; run;
data test2;
infile datalines dlm='|';
input Analysis : $100. Class : $100. Component : $100. Product : $100.;
datalines;
Cleft Lip +/- Palate| LEVOTHYROXINE SODIUM| | |
VSDs (Excludes VSDavc)| ANTI/VITA| | |
Choanal Atresia| LEVOTHYROXINE SODIUM| | |
TGA +/- DORV| FLONASE| | |
Cleft Lip +/- Palate| | ATENOLOL| |
Intestinal Atresia/Stenosis| | EPHEDRA| |
VSDs (Excludes VSDavc)| | BUTALBITAL| |
Esoph Atresia +/- TEF| | FLONASE| |
Cleft Lip +/- Palate| | | LABETOLOL|
Esoph Atresia +/- TEF| | | ATENOLOL|
VSDs (Excludes VSDavc)| | | ALBUTEROL SULFATE|
TOF +/- PA, DORV -TGA| | | BUTALBITAL|
Choanal Atresia| | | LEVOTHYROXINE SODIUM|
;
proc sort data=test2; by analysis component; run;
proc sql;
create table test2Matchtotal as
select a.*
from test2 as a inner join
test1 as b on a.Analysis=b.Analysis and (a.class=b.product or a.component=b.product or a.product=b.product);
quit;
The values are duplicated because this:
(a.class=b.product or a.component=b.product or a.product=b.product);
creates a match for potentially 3 conditions but the requested output (a.*) doesn't change the result selected
If you don't want duplicated results one approach is the distinct predicate.
Select distinct a.*
Post your code into the code editor.
And format and comment it.
The values are duplicated because this:
(a.class=b.product or a.component=b.product or a.product=b.product);
creates a match for potentially 3 conditions but the requested output (a.*) doesn't change the result selected
If you don't want duplicated results one approach is the distinct predicate.
Select distinct a.*
Cool !!!
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.