DATA Step, Macro, Functions and more

getting desired result in many-to-many join Proc Sql

Reply
Super Contributor
Posts: 271

getting desired result in many-to-many join Proc Sql

Hello,

 

While using data step merge I get the warning " "NOTE: MERGE statement has more than one data set with repeats of BY values."  I used Proc Sql to get rid of this warning. This , however, gives all the comibinations of multiple by values. I want to keep only some observations from them. Is there a technique to get rid of undesired records without manually?

 

Thanks !

Super User
Posts: 19,815

Re: getting desired result in many-to-many join Proc Sql

Posted in reply to SAS_inquisitive

You need to restrict it somehow using a where clause. 

Super Contributor
Posts: 298

Re: getting desired result in many-to-many join Proc Sql

Posted in reply to SAS_inquisitive

Many to Many Join can be done by Data Step much more conveniently than Proc SQL. Show your data sets through datalines.

Super User
Posts: 5,509

Re: getting desired result in many-to-many join Proc Sql

Posted in reply to SAS_inquisitive

There's a way to do anything.  But you would have to show an example of the result you want ... an example of the many-to-many situation before the merge, and the matching example showing which observations you want after the merge.

Super Contributor
Posts: 271

Re: getting desired result in many-to-many join Proc Sql

Posted in reply to Astounding

@Astounding@datasp@Reeza

 

Here is the program and datasets.  The problem is I don't know how records in dat1 are releated to records in dat2.

data dat1;
  input patient_id $  code $ 10. code_startdate $ 12.;
  cards;
  A dermatitis 2015-10-05
  A dermatitis 2015-10-09
  ;
  run;

data dat2;
  input patient_id $  code $ 10. treatment_startdate $ 12.;
  cards;
  A dermatitis 2014-10-08
  A dermatitis 2015-10-07
  ;
  run;

 proc sql noprint;
    create table t1 as 
       select a.patient_id,
	          a.code,
			  a.code_startdate,
			  b.treatment_startdate
	  
		 from dat1 as a 
                 full join dat2 as b
		               on a.patient_id = b.patient_id and        
			              a.code       = b.code;
/*				where b.treatment_startdate > a.code_startdate;*/
quit;

 

 

Super User
Posts: 5,509

Re: getting desired result in many-to-many join Proc Sql

Posted in reply to SAS_inquisitive

Now comes the key question.  For these sample data sets you provided, what would you like the result to look like after they are combined?

Super Contributor
Posts: 271

Re: getting desired result in many-to-many join Proc Sql

Posted in reply to Astounding

@Astounding . The  final data should look this - earlier code_startdate relates to earlier tretment_startdate. 

 

Patient_id    code            code_startdate    treatment_startdate

A                  dermatitis     2015-05-05          2014-10-08       

A                  dermatitis      2015-05-09         2015-10-07

Super User
Posts: 5,430

Re: getting desired result in many-to-many join Proc Sql

Posted in reply to SAS_inquisitive
Ok. To be able to direct you we also need to know if your real data have any deviations from this pattern.
Or does a code date always get followed by a treatment, and vice versa?
Data never sleeps
Super User
Posts: 5,509

Re: getting desired result in many-to-many join Proc Sql

Posted in reply to SAS_inquisitive

Based on everything to date, I would suggest modifying the join.  Unfortunately, my SQL isn't strong enough to give you the exact syntax, but here's the idea.

 

group by patient_id code code_startdate

 

And embellish the WHERE clause:

 

where treatment_startdate < code_startdate and (code_startdate - treatment_startdate) = min(code_startdate - treatment_startdate)

 

I'm not sure if a full join is still appropriate in that case.

 

You also might need to experiment with a GROUP BY clause that uses treatment_startdate instead of code_startdate.

 

I can do the equivalent in a DATA step if that would be an acceptable approach. But it becomes cumbersome if there are additional variables involved in the "code" data set.

Super Contributor
Posts: 271

Re: getting desired result in many-to-many join Proc Sql

Posted in reply to Astounding

@LinusH , @Astounding Thank you. I am trying to get more information about data pattern.

Ask a Question
Discussion stats
  • 9 replies
  • 417 views
  • 3 likes
  • 5 in conversation