BookmarkSubscribeRSS Feed
manya92
Fluorite | Level 6

Hi all, 

 

I am trying to add a variable called height to my cohort dataset by left joining. I have 92 observations in my cohort so I am expecting the final number to be 92 after the join since I am only left joining (adding variables to the existing conditions). I am including both scenarios where height is missing and not missing. I am only getting 64 observations. I only want to see the most recent record of height 30 days prior to the index date. Attached is my code. 

 

Please let me know how can I solve this. 


*height most recent height within 30 days prior to the index date; proc sql ; create table _02a_ht as select a.*, b.labcomponent, b.TestResultCleaned,b.TestDate, case when (a.index_date-30) le b.TestDate le a.index_date and b.labcomponent ='Body Height' and b.TestResultCleaned ne . then b.TestResultCleaned else . end as height from _02a_ecog2 as a inner join dfi_panc.vitals as b on a.PatientID=b.PatientID and b.labcomponent ='Body Height' order by a.PatientID,b.TestDate; quit ;/*3463 */ *making sure the testdate is 30 days prior the index date; data _02a_testdate; set _02a_ht ; by PatientID TestDate ; if last.PatientID; where height = . or height ne .; run ;/*64 */

 

 There's no error

6 REPLIES 6
Reeza
Super User

I am trying to add a variable called height to my cohort dataset by left joining.

 



*height most recent height within 30 days prior to the index date;
proc sql ;
	create table _02a_ht as 
	select a.*, b.labcomponent, b.TestResultCleaned,b.TestDate,
	case 
		when (a.index_date-30) le b.TestDate le a.index_date  and b.labcomponent ='Body Height' and b.TestResultCleaned ne .
		then b.TestResultCleaned 
		else .
	end as height	
	from _02a_ecog2 as a
	inner join dfi_panc.vitals as b 
	on a.PatientID=b.PatientID and b.labcomponent ='Body Height'
	order by a.PatientID,b.TestDate;
quit ;/*3463 */

*making sure the testdate is 30 days prior the index date;
data _02a_testdate;	
	set _02a_ht ;
	by PatientID TestDate ;
	if last.PatientID;
	where height = . or height ne .;
run ;/*64 */

Try changing inner join to left join?

ballardw
Super User

While you may start with 92 in your cohort this bit of code:

 

 inner join dfi_panc.vitals as b
on a.PatientID=b.PatientID and b.labcomponent ='Body Height'

 

has the potential to remove patientid values either because it does not appear in both data sets OR the only match in set b does not have "Body Height" as the value for labcomponent. Possibly due to spelling differences.

By definition from the online documentation:

An inner join returns only the subset of rows from the first table that matches rows from the second table.

manya92
Fluorite | Level 6

I changed that to a left join but now i have 89 obsv still not 92

Reeza
Super User

WHERE filters out the data. If you remove that do you get the records you want?

Look at the missing records and determine what condition is excluding them....with less than a 100 records you can manually verify it. 

 


@manya92 wrote:

I changed that to a left join but now i have 89 obsv still not 92


 

Reeza
Super User
Sorry, not the WHERE but this likely filters out the rest: and b.labcomponent ='Body Height'
manya92
Fluorite | Level 6

Okay, now i have 89 obvs , i want to see which obsv from the prev dataset haev been deleted ? how do i see that?

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 931 views
  • 0 likes
  • 3 in conversation