BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sauda
Fluorite | Level 6

Hi Expert,

I am not getting correct output when testing this code in huge data. The programming is running with zero output. 

 

I want all records from child sheet when country, product and dosage matches between two datasets. 

            or country-product matches between two dataset only when dosage values is null for child dataset

 

Product data column should be considered as pattern matching.

Can you please help me where I have done the mistake or is there any altenative approach to retrieve data in a faster way.

 

data Child;
input pid country $ product & $25. dosage $;
cards;
101 France Atorvastatin          Oral
102 France PaRacetamol            .
103 Russia Nimusulide injection  sc
104 China DIovan                 Tablet
105 Japan Gelusil                Tablet
106 India Codeine                Tablet 
;
run;

data Parent;
input country $ product & $25. dosage $;
cards;
JapaN Atorvastatin calcium 		Oral
China Atorvastatin 				Oral   
France Atorvastatin calcium 	Oral   
Russia Paracetamol 				Oral   
India Paracetamol 				Oral   
France Paracetamol 				Oral   
RusSia Nimusulide 				sc   
China diovan Syrup   			.
China diovan 					.
China Gelusil 					.
india Codeine 					.
Japan Gelusil Tablet			.
;
run;

proc sql;
create table match_country_prod_dos as
			select c.* 
				from child c, parent p
				where upper(compress(c.country)) in (select upper(compress(p.country)) from parent)
				and (upper(compress(c.product)) like '%' || upper(compress(p.product)) || '%'  or upper(compress(p.product)) like '%' ||upper(compress(c.product))|| '%')
				and upper(compress(c.dosage)) in (select upper(compress(p.dosage)) from parent);
quit;
 

proc sql;
create table match_country_prod_dosagenull as
			select c.* 
				from child c, parent p
				where c.dosage is null
				and upper(compress(c.country)) in (select upper(compress(p.country)) from parent)
				and (upper(compress(c.product)) like '%' || upper(compress(p.product)) || '%'  or upper(compress(p.product)) like '%' ||upper(compress(c.product))|| '%')
				;
quit;
 

data final;
set  match_country_prod_dos match_country_prod_dosagenull;
run;

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Add in the upcase(strip()) part and it should work:

proc sql;
  create table match_country_prod_dos as
  select A.*
  from   CHILD A
  where  exists(select distinct B.COUNTRY from PARENT B where (upcase(strip(B.COUNTRY))=upcase(strip(A.COUNTRY)) 
                  and upcase(strip(B.DOSAGE))=upcase(strip(A.DOSAGE)) and index(upcase(strip(B.PRODUCT)),upcase(strip(A.PRODUCT)))>0));
quit;

However, I still advise you, if you are doing anything with this data, to first process it into a usable, standardised format - this will help you so much later on.  I.e. make sure the data is all formatted correctly - upcase country in both, apply standardised codes to the countries so different spellings no longer factor in.  Then merge on WHODrug or a drug library to get the various group levels to provide an acurate mergable variable.

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Compress is not the function you want to use.  You want to remove blanks from start and end, compress will trim out all spaces.  If you modify your code to:

proc sql;
  create table match_country_prod_dos as
select c.* from child c, parent p where upper(strip(c.country)) in (select upper(strip(p.country)) from parent)
and (upper(strip(c.product)) like cats('%',upper(p.product),'%') or upper(strip(p.product)) like cats('%',upper(c.product),'%')) and upper(strip(c.dosage)) in (select upper(strip(p.dosage)) from parent); quit;

Then you get results.

Sauda
Fluorite | Level 6

Thanks in looking into my concern.

I would like to know whether the below code is correct. it means whether I should use alias p.country or only country to get correct result.

 

upper(strip(c.country)) in (select upper(strip(p.country)) from parent)

But I am not getting output while testing the same in my project. 

 

Is there any alternative approach to retreive the data in  a more faster way.

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

You could try:

proc sql;
  create table match_country_prod_dos as
select * from CHILD where exists(select B.COUNTRY from PARENT B where upcase(strip(B.COUNTRY))=upcase(strip(COUNTRY)) and index(B.PRODUCT,PRODUCT)>0 and upcase(strip(B.DOSAGE))=upcase(strip(DOSAGE))); quit;

I would question why you have drug terms, which are not coded.  I do not see how you will merge these correctly.  Drugnames should be code with the appropriate coding dictionary WHODrug for instance.  This will give you the various coded levels for you to properly merge on.

Sauda
Fluorite | Level 6

The exists operator is not working and I am getting all rows from child dataset.

Actually the drug code is not available for this raw data for which I am comparing the product name.

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Add in the upcase(strip()) part and it should work:

proc sql;
  create table match_country_prod_dos as
  select A.*
  from   CHILD A
  where  exists(select distinct B.COUNTRY from PARENT B where (upcase(strip(B.COUNTRY))=upcase(strip(A.COUNTRY)) 
                  and upcase(strip(B.DOSAGE))=upcase(strip(A.DOSAGE)) and index(upcase(strip(B.PRODUCT)),upcase(strip(A.PRODUCT)))>0));
quit;

However, I still advise you, if you are doing anything with this data, to first process it into a usable, standardised format - this will help you so much later on.  I.e. make sure the data is all formatted correctly - upcase country in both, apply standardised codes to the countries so different spellings no longer factor in.  Then merge on WHODrug or a drug library to get the various group levels to provide an acurate mergable variable.

Sauda
Fluorite | Level 6
Any help please

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 802 views
  • 2 likes
  • 2 in conversation