BookmarkSubscribeRSS Feed
joebacon
Pyrite | Level 9

Hi all,

I have data that looks like this:

 

 

data work.test ;
  infile datalines dsd dlm='|' truncover;
  input redcap_event_name :$23. BL_wt_clinic_ave BL_wt_home_ave
    FU_wt_clinic_ave FU_wt_home_ave wt_change_source ID
  ;
  format redcap_event_name $23. BL_wt_clinic_ave best.
    BL_wt_home_ave best. FU_wt_clinic_ave best. FU_wt_home_ave best.
    wt_change_source best.
  ;
  informat redcap_event_name $23. ;
  label redcap_event_name='redcap_event_name'
    BL_wt_clinic_ave='BL_wt_clinic_ave' BL_wt_home_ave='BL_wt_home_ave'
    FU_wt_clinic_ave='FU_wt_clinic_ave' FU_wt_home_ave='FU_wt_home_ave'
    wt_change_source='wt_change_source'
3                                                          The SAS System                            23:38 Thursday, August 27, 2020

  ;
datalines4;
screener_arm_1|||||1|1
baseline_arm_1|259.3||||1|1
post_appointment_arm_1|||||1|1
12_week_follow_up_arm_1|||251.3||1|1
screener_arm_1|||||1|2
baseline_arm_1|245.7||||1|2
post_appointment_arm_1|||||1|2
12_week_follow_up_arm_1|||242.4||1|2
screener_arm_1|||||1|4
baseline_arm_1|166.7||||1|4
post_appointment_arm_1|||||1|4
12_week_follow_up_arm_1|||166||1|4
screener_arm_1|||||1|6
baseline_arm_1|280.933333333333||||1|6
post_appointment_arm_1|||||1|6
12_week_follow_up_arm_1|||282.8||1|6
screener_arm_1||||||8
baseline_arm_1|434.5|||||8
post_appointment_arm_1||||||8
12_week_follow_up_arm_1||||442.66|8
;;;;

 

What I need is to be able to find the baseline weight by ID. There are two different variables which could be baseline weight: BL_wt_clinic_ave BL_wt_home_ave

 

Each ID will only have a value under one of these. I want to create a new variable called BL_wt. I want to do the same thing with follow up weight which only has two variables: FU_wt_clinic_ave FU_wt_home_ave. 

I will call this one FU_wt.

 

Afterward, I am trying to subtract the baseline from followup (FU_Wt - BL_wt) and call it Wt_change.

 

I am having trouble populating each of the fields for the BL_wt and Fu_wt for all of the observations under each ID. I have tried taking the max for each of these by variable, but then I cant subtract them as they are not on the same observation. This seems quite simple and I might be missing something rudimentary, but any help would be appreciated.

 

5 REPLIES 5
joebacon
Pyrite | Level 9

I wasn't able to edit it, but the dataset had a small error toward the bottom. It should look like this:

 

data work.test ;
  infile datalines dsd dlm='|' truncover;
  input redcap_event_name :$23. BL_wt_clinic_ave BL_wt_home_ave
    FU_wt_clinic_ave FU_wt_home_ave wt_change_source ID
  ;
  format redcap_event_name $23. BL_wt_clinic_ave best.
    BL_wt_home_ave best. FU_wt_clinic_ave best. FU_wt_home_ave best.
    wt_change_source best.
  ;
  informat redcap_event_name $23. ;
  label redcap_event_name='redcap_event_name'
    BL_wt_clinic_ave='BL_wt_clinic_ave' BL_wt_home_ave='BL_wt_home_ave'
    FU_wt_clinic_ave='FU_wt_clinic_ave' FU_wt_home_ave='FU_wt_home_ave'
    wt_change_source='wt_change_source'
3                                                          The SAS System                            23:38 Thursday, August 27, 2020

  ;
datalines4;
screener_arm_1|||||1|1
baseline_arm_1|259.3||||1|1
post_appointment_arm_1|||||1|1
12_week_follow_up_arm_1|||251.3||1|1
screener_arm_1|||||1|2
baseline_arm_1|245.7||||1|2
post_appointment_arm_1|||||1|2
12_week_follow_up_arm_1|||242.4||1|2
screener_arm_1|||||1|4
baseline_arm_1|166.7||||1|4
post_appointment_arm_1|||||1|4
12_week_follow_up_arm_1|||166||1|4
screener_arm_1|||||1|6
baseline_arm_1|280.933333333333||||1|6
post_appointment_arm_1|||||1|6
12_week_follow_up_arm_1|||282.8||1|6
screener_arm_1|||||2|8
baseline_arm_1|434.5||||2|8
post_appointment_arm_1|||||2|8
12_week_follow_up_arm_1|||442.66|2|8
;;;;

I created the source group to differentiate between the different nature of the data (self report or clinic weighed)

jimbarbour
Meteorite | Level 14

There's probably a more elegant way to do this with SQL, but I'm tired, so I just wrote a quick Data step:

DATA	WORK.Test_Final;
	SET	WORK.Test;
		BY	ID;

		DROP	Save_:	redcap_event_name;

	RETAIN	Save_BL_wt_clinic_ave Save_BL_wt_home_ave Save_FU_wt_clinic_ave Save_FU_wt_home_ave;

	IF	NOT	MISSING(BL_wt_clinic_ave)				THEN
		Save_BL_wt_clinic_ave	=	BL_wt_clinic_ave;

	IF	NOT	MISSING(BL_wt_home_ave)					THEN
		Save_BL_wt_home_ave		=	BL_wt_home_ave;

	IF	NOT	MISSING(FU_wt_clinic_ave)				THEN
		Save_FU_wt_clinic_ave	=	FU_wt_clinic_ave;

	IF	NOT	MISSING(FU_wt_home_ave)					THEN
		Save_FU_wt_home_ave		=	FU_wt_home_ave;

	IF	LAST.ID										THEN
		DO;
			BL_wt_clinic_ave	=	Save_BL_wt_clinic_ave;
			BL_wt_home_ave		=	Save_BL_wt_home_ave;
			FU_wt_clinic_ave	=	Save_FU_wt_clinic_ave;
			FU_wt_home_ave		=	Save_FU_wt_home_ave;
			
			IF	NOT	MISSING(BL_wt_clinic_ave)		THEN
				BL_Wt			=	BL_wt_clinic_ave;
			ELSE
			IF	NOT	MISSING(BL_wt_home_ave)			THEN
				BL_Wt			=	BL_wt_home_ave;

			IF	NOT	MISSING(FU_wt_clinic_ave)		THEN
				FU_Wt			=	FU_wt_clinic_ave;
			ELSE
			IF	NOT	MISSING(FU_wt_home_ave)			THEN
				FU_Wt			=	FU_wt_home_ave;

			Wt_change			=	FU_Wt - BL_wt;
			OUTPUT;
			CALL	MISSING(Save_BL_wt_clinic_ave, Save_BL_wt_home_ave, Save_FU_wt_clinic_ave, Save_FU_wt_home_ave);
		END;
RUN;

The above Data step follows immediately after the Data step that you coded and uses the output of your Data step in this Data step's SET statement.

 

Results:

Last.Example_2020-09-03_23-00-51.jpg

Is this what you wanted in terms of results?  It's basically one line per ID with all of the data consolidated from all of the raw input data lines.

 

Jim

joebacon
Pyrite | Level 9

It is really close! I am trying to do the same thing, but not consolidate the lines. Each of the observations in the IDs is a different time point. I was hoping that I could have the BL_Wt and the Fu_Wt replicate for each of the observations within ID. However, all the rest is spot on!

jimbarbour
Meteorite | Level 14

Oh, OK.  That's not too difficult. 

 

Do the following:

  1. Take that Data step that I shared with you previously and change the dataset name from WORK.Test_Final to WORK.Test_Summary.
  2. Join WORK.Test_Summary back with the original WORK.Test dataset with SQL like so:
PROC	SQL;
	CREATE TABLE	WORK.Test_Final
		AS	SELECT	Detail.*
			,Summary.BL_Wt
			,Summary.FU_Wt
			,Summary.Wt_Change
			FROM	WORK.Test					Detail
			INNER	JOIN	WORK.Test_Summary	Summary
				ON	Detail.ID	=	Summary.ID
				;
QUIT;

You should get the following results:

Last.Example_Summary_2020-09-03_23-00-51.jpg

 

Jim

PaigeMiller
Diamond | Level 26

In the data you provided, there is always exactly one baseline measuremeant (never 2 or more) and exactly one follow-up measurement (never two or more), so this will work:

 

proc summary data=test nway;
	class id;
	var bl_wt_clinic_ave--fu_wt_home_ave;
	output out=stats(drop=_:) max=max1-max4;
run;
data want;
	merge test stats;
	by id;
	bl_wt=max(max1,max2);
	fu_wt=max(max3,max4);
	wt_change=bl_wt-fu_wt;
	drop max1-max4;
run;

  

--
Paige Miller

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 819 views
  • 0 likes
  • 3 in conversation