BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
mmovahed
Fluorite | Level 6

Hi everyone

 

I like to create dataset C from two dataset A (time varying exposure) and B (time-varying outcome) (please see attached). 

 

Question: Which SAS codes (program) can I use to have dataset C from two dataset A and B

 

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
SASJedi
SAS Super FREQ

Something like this, maybe?

data a;
	infile datalines dsd;
	INPUT PatID Days Treatment;
	label patid='Patient ID' Days= 'Days from diagnosis' ;
datalines;
1,0,0
1,100,1
1,150,1
1,200,1
1,250,0
;

title "A";
proc print; run;

data b;
	infile datalines dsd;
	INPUT PatID Days Hospitalized;
	label patid='Patient ID' Days= 'Days from diagnosis' ;
datalines;
1,0,0
1,100,0
1,180,1
1,220,1
;

title "B";
proc print; run;

data C;
	merge a b;
	by PatID days;
        /* If new Hospitalized between Treatment dates, use previous treatment */
	Treatment=coalesce(Treatment,lag1(Treatment));
        /* If new Treatment between Hospitalized dates, use 0 */
	Hospitalized=coalesce(Hospitalized,0);
run;

title "C";
proc print; run;
Check out my Jedi SAS Tricks for SAS Users

View solution in original post

12 REPLIES 12
SASJedi
SAS Super FREQ

Something like this, maybe?

data a;
	infile datalines dsd;
	INPUT PatID Days Treatment;
	label patid='Patient ID' Days= 'Days from diagnosis' ;
datalines;
1,0,0
1,100,1
1,150,1
1,200,1
1,250,0
;

title "A";
proc print; run;

data b;
	infile datalines dsd;
	INPUT PatID Days Hospitalized;
	label patid='Patient ID' Days= 'Days from diagnosis' ;
datalines;
1,0,0
1,100,0
1,180,1
1,220,1
;

title "B";
proc print; run;

data C;
	merge a b;
	by PatID days;
        /* If new Hospitalized between Treatment dates, use previous treatment */
	Treatment=coalesce(Treatment,lag1(Treatment));
        /* If new Treatment between Hospitalized dates, use 0 */
	Hospitalized=coalesce(Hospitalized,0);
run;

title "C";
proc print; run;
Check out my Jedi SAS Tricks for SAS Users
mmovahed
Fluorite | Level 6
Thanks Jedi SAS

It seems working. Now I have all actual values for treatment and hospitalization after merging.

Regards

mmovahed
Fluorite | Level 6

Hi everyone

 

I have two medication files with start and stop in each. I want to merge them somehow to cover time interval that patient have taken both medA and medB. Please see attached file: dataset A and B and dataset I want to create using SAS code.

ballardw
Super User

Many users here don't want to download Office files because of virus potential, others have such things blocked by security software. Also if you give us office files we have to create a SAS data set and due to the non-existent constraints on  data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

Please provide example in the form of working data step code. You can convert your SAS data set using Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

 

Best is include a similar example of the expected output given the example input data sets.

mmovahed
Fluorite | Level 6

Thanks for the note.

 

Dataset A

studyid

Start

stop

medA

1

83

948

1

1

1042

1322

1

1

2792

5457

1

2

0

200

1

 

Dataset B

studyid

Start

stop

MedB

1

0

101

1

3

50

150

1

 

 

 

 

 

Dataset I want to create

studyid

Start

stop

MedB

MedA

1

0

83

1

1

1

83

101

1

1

1

101

948

0

1

1

948

1042

0

1

1

1042

1322

0

1

1

2792

5457

0

1

2

0

200

0

1

3

50

150

1

0

ballardw
Super User

Not working data steps...

 

I am very sure I do not understand why you want

Dataset I want to create

 

studyid

Start

stop

MedB

MedA

1

0

82

1

0

1

83

101

1

1

1

101

948

0

1

1

948

1042

0

1

1

1042

1322

0

1

1

2792

5457

0

1

2

0

200

0

1

3

50

150

1

0

 

You said "cover time interval that patient have taken both medA and medB". Assuming 1 for MedA or MedB means "taking med", then I would think it makes more sense to have where the endpoints do not imply something taken for the entire interval what not indicated in the starting data sets. Like this

Placing a 1 in MedA for the first line would be saying that MedA was taken for the entire interval.

 

studyid

Start

stop

MedB

MedA

1

0

82

1

0

1

83

101

1

1

1

102

948

0

1

1

949

1042

0

0

1

1043

1322

0

1

1

2792

5457

0

1

2

0

200

0

1

3

50

150

1

0

 

And possibly even include an interval:

1

1323

2791

0

0

 

One might also wonder if there should be actual dates somewhere if this is going to be compared with other data that has dates.

mmovahed
Fluorite | Level 6
Thanks again

I want to create a time-varying variable of multiple medication when in some inetrval patient take med A , some inteval take med A & B , and some inteval just take medB.

If you just help me to have the related SAS code for doing this it would be appreciated. in STATA there is a specific code called tvc merge but I want to use it data in SAS.

M
mmovahed
Fluorite | Level 6

Thanks Ballardw

 

for your reply. I agree with your modification and explanation. having considered your modification, I wonder which SAS codes I can use to create this new dataset

 

Regards

Ksharp
Super User

As Balladaw said there are some questions you need to answer.

data A;
input
studyid

Start

stop

medA
;
cards;
1

83

948

1

1

1042

1322

1

1

2792

5457

1

2

0

200

1
;
 

data B;
input
studyid

Start

stop

MedB
;
cards;
1

0

101

1

3

50

150

1
;

data temp_a(index=(x=(studyid date)));
 set a;
 n_a+1;
 do date=start to stop;
   output;
 end;
 drop start stop;
run;

data temp_b(index=(x=(studyid date)));
 set b;
 n_b+1;
 do date=start to stop;
   output;
 end;
 drop start stop;
run;

data temp;
 merge temp_a temp_b;
 by studyid date;
run;
data temp2;
 set temp;
 if studyid ne lag(studyid) or n_a ne lag(n_a) or n_b ne lag(n_b) then group+1;
run;
proc sql;
create table want as
select group,studyid,min(date) as start,max(date) as stop,max(meda) as meda,max(medb) as medb
 from temp2
  group by group,studyid;
quit;
mmovahed
Fluorite | Level 6

Thanks for the code. It works 

 

Regards

M

ballardw
Super User

The basic part of your question looks like this should be part of the solution:

 

data c;
   set a
         b
  ;
run;

proc sort data=c;
by patid daysfromdiagnosis;
run;

Some items to address:

For future questions if you have SAS data sets use the names of the variables. "Days from diagnosis" by default is not going to be a valid variable name.

 

Second: are you sure that you want values that are not in the original data to be 0 or 1 and not missing? We have no context for what the values of Treatment or Hospitalized actually mean in real world terms and I do not like "guessing" what the underlying rules for assigning such values might be.

There could very well be another step after the Proc Sort of the combined data sets.

 

Note: Merge in SAS terms means a combination from side by side and has some restrictions for most usage, such as sorted by one or more common variables. We do not know that is the real case.

 

Please provide data in the form of working data sets so we at least know variable types and actual names. If you can't do that provide text pasted into a text box opened with the </> icon above the main message window. We can't code against values buried in pictures.

mmovahed
Fluorite | Level 6
Thanks for your reply.

I have already tried this code. the problem is your second concern about the real value for treatment after merging. I know the approach in the STATA ( tvc_merge) to deal with this actual values . but I am looking for an appropriate code in SAS in this regard.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1110 views
  • 2 likes
  • 4 in conversation