BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
daszlosek
Quartz | Level 8

Hello all, I have data set of lab tests where some of the tests were performed in house and some were sent out to a different lab to get tested for a patient. Those lab test that get sent out have a date that is 1-7 days post the date of the patients visit/ time of sample collection (thus added to a different row). I would like to add this information to the same row as all the other patient information collection for that visit. For example I have an example of two variables (lab_test_1 and lab_test_2) shown below for a single id where those lab tests order dates were 1 day after. I will have to do this for every id

 

ID var1 var2 var3 var4 var5 labtest_1 labtest_2 test_order_date Time Between Tests
1 24 0.33 Y 7 23 .   11/3/2015 0
1 45 0.54 Y 9 22 .   12/17/2015 44
1 . . . . . 1.28 50-75 12/18/2015 1
1 354 0.36 N 8 21 .   6/22/2016 187
1 6 0.45 N 13 22 1.31 0-2 8/18/2016 57
1 . . . . . 1.31 0-2 8/18/2016 0
1 4 0.32 N 12 28 .   11/15/2016 89

 

Want dataset:

ID var1 var2 var3 var4 var5 labtest_1 labtest_2 test_order_date Time Between Tests alternative_lab_test__date
1 24 0.33 Y 7 23 .   11/3/2015 0  
1 45 0.54 Y 9 22 1.28 50-75 12/17/2015 44    12/18/2015
1 354 0.36 N 8 21 .   6/22/2016 187  
1 6 0.45 N 13 22 1.31 0-2 8/18/2016 57  
1 4 0.32 N 12 28 .   11/15/2016 89  

 

 

I think I would need to create some sort of threshold to look at time between test dates (shown above) that is between 1-7 days post collection date and somehow match onto the row that has the most non-missing information. I am mainly stuck on how to create a theshold for the dates. 

 

Best

 

EDIT: I played around a little bit with the code that PG created and I think I made my example data a little too simple. If I only have one id, this would work well, but I have cases with multiple IDs and the lag function does not seem to work well when you have multple subject ids (it will carry over the date from the last row from the prevous id to the first row of the current id, even though they are two different subjects. I have added a second subject as an example:

data have;
infile datalines truncover;
input ID labID var1 var2 var3 $ var4 var5 labtest_1 labtest_2 $ test_order_date :mmddyy10. ;
format test_order_date yymmdd10.;
datalines;
1 1 24 0.33 Y 7 23 . . 11/3/2015
1 1 45 0.54 Y 9 22 . . 12/17/2015
1 2 . . . . . 1.28 50-75 12/18/2015
1 1 354 0.36 N 8 21 . . 6/22/2016
1 1 6 0.45 N 13 22 1.31 0-2 8/18/2016
1 2 . . . . . 1.31 0-2 8/18/2016
1 1 4 0.32 N 12 28 . . 11/15/2016
2 1 14 2.2 Y 5 23 . . 8/16/2015
2 2 . . Y 5 23 . . 8/16/2015
2 1 . . . . . 0-2 1.013 8/21/2015
2 2	. . . . . 0-2 1.013	8/21/2015
2 1	22 2.1 N 5 25 .	. 9/12/2015
2 1	23 2.5 N 4 21 .	. 10/28/2015
2 1	18 2.1 N 2 22 .	. 4/8/2016
2 2	26 1.7 . . . 0-2 1.014 10/26/2016
;
1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Had to fix your data a bit. Added variable labId to identify the results origin and fixed the number of columns on some obs. Didn't require variable days_between_tests.

 

data have;
infile datalines truncover;
input ID labID var1 var2 var3 $ var4 var5 labtest_1 labtest_2 $ test_order_date :mmddyy10. Days_Between_Tests;
format test_order_date yymmdd10.;
datalines;
1 1 24 0.33 Y 7 23 . . 11/3/2015 0
1 1 45 0.54 Y 9 22 . . 12/17/2015 44
1 2 . . . . . 1.28 50-75 12/18/2015 1
1 1 354 0.36 N 8 21 . . 6/22/2016 187
1 1 6 0.45 N 13 22 1.31 0-2 8/18/2016 57
1 2 . . . . . 1.31 0-2 8/18/2016 0
1 1 4 0.32 N 12 28 . . 11/15/2016 89
;

proc sort data=have; by id test_order_date; run;

data temp;
set have; by id;
prevDate = lag(test_order_date);
prevLabId = lag(labId);
if first.id then test + 1;
else if labId = prevLabId or intck("day", prevDate, test_order_date) > 7 then test + 1;
drop prev:;
run;

data want;
update temp(where=(labId=1)) temp(where=(labId ne 1)); by test;
drop labId test days_between_tests;
run;

proc print data=want; run;
PG

View solution in original post

3 REPLIES 3
ballardw
Super User

Please post data in the form of a datastep.

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

Anything else we have to attempt to create data to test code with and we may make choices on variable types that will not work with your data.

 

And in the case of some browsers and the form you pasted the data many of the cells are appearing over the marginal information in the forum and just plain hard to read.

PGStats
Opal | Level 21

Had to fix your data a bit. Added variable labId to identify the results origin and fixed the number of columns on some obs. Didn't require variable days_between_tests.

 

data have;
infile datalines truncover;
input ID labID var1 var2 var3 $ var4 var5 labtest_1 labtest_2 $ test_order_date :mmddyy10. Days_Between_Tests;
format test_order_date yymmdd10.;
datalines;
1 1 24 0.33 Y 7 23 . . 11/3/2015 0
1 1 45 0.54 Y 9 22 . . 12/17/2015 44
1 2 . . . . . 1.28 50-75 12/18/2015 1
1 1 354 0.36 N 8 21 . . 6/22/2016 187
1 1 6 0.45 N 13 22 1.31 0-2 8/18/2016 57
1 2 . . . . . 1.31 0-2 8/18/2016 0
1 1 4 0.32 N 12 28 . . 11/15/2016 89
;

proc sort data=have; by id test_order_date; run;

data temp;
set have; by id;
prevDate = lag(test_order_date);
prevLabId = lag(labId);
if first.id then test + 1;
else if labId = prevLabId or intck("day", prevDate, test_order_date) > 7 then test + 1;
drop prev:;
run;

data want;
update temp(where=(labId=1)) temp(where=(labId ne 1)); by test;
drop labId test days_between_tests;
run;

proc print data=want; run;
PG
daszlosek
Quartz | Level 8

I just wanted to give an update and say that the code PGStats has pointed me in the correct direcion. I have a much better grasp on how I want to code the issue. Once it is complete, I will post it. 

 

Thank you very much!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 786 views
  • 1 like
  • 3 in conversation