BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DathanMD
Obsidian | Level 7

Hello Colleagues, I am new to SAS and I would like help with some data manipulation from someone here.

 

Question: I have data as shown below, but I would like to collapse some rows with unreasonable time interval (job durations). I would like to assume that durations of less than 5 days are unlikely to be true and hwere such exists, the next job with longer duration should be assumed to have been began on the start date of the overridden job.

Data I have:

id

Start_date

End-date

JOB

1

12JUN2015

14JUN2016

CHEF

1

14JUN2016

15JUN2016

MANAGER

1

15JUN2016

17JAN2018

DRIVER

1

18JAN2015

19JAN2018

ADMIN

1

19JAN2018

19OCT2019

FIRE OFFICER

2

 

 

 

2

 

 

 

2

 

 

 

 

Data I want

id

Start_date

End-date

JOB

1

12JUN2016

19JAN2018

DRIVER

1

19JAN2018

19OCT2019

FIRE OFFICER

2

 

 

 

2

 

 

 

2

 

 

 

 

 

Thanks

 

Dathan Byonanebye

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @DathanMD  I read this post of yours 3 times and still not sure of the logic. So I am gonna take a guess

 


data have;
input id (Start_date End_date)(:date9.) JOB & $20.;
format Start_date End_date date9.;
cards;
1
12JUN2015
14JUN2016
CHEF
1
14JUN2016
15JUN2016
MANAGER
1
15JUN2016
17JAN2018
DRIVER
1
18JAN2015
19JAN2018
ADMIN
1
19JAN2018
19OCT2019
FIRE OFFICER
;

data want;
 do until(last.id);
  set have;
  by id;
  if t=start_date and intck('days',start_date,end_date)>5 then output;
  t=end_date;
 end;
 drop t;
run;
id Start_date End_date JOB
1 15JUN2016 17JAN2018 DRIVER
1 19JAN2018 19OCT2019 FIRE OFFICER

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

Hi @DathanMD  I read this post of yours 3 times and still not sure of the logic. So I am gonna take a guess

 


data have;
input id (Start_date End_date)(:date9.) JOB & $20.;
format Start_date End_date date9.;
cards;
1
12JUN2015
14JUN2016
CHEF
1
14JUN2016
15JUN2016
MANAGER
1
15JUN2016
17JAN2018
DRIVER
1
18JAN2015
19JAN2018
ADMIN
1
19JAN2018
19OCT2019
FIRE OFFICER
;

data want;
 do until(last.id);
  set have;
  by id;
  if t=start_date and intck('days',start_date,end_date)>5 then output;
  t=end_date;
 end;
 drop t;
run;
id Start_date End_date JOB
1 15JUN2016 17JAN2018 DRIVER
1 19JAN2018 19OCT2019 FIRE OFFICER
DathanMD
Obsidian | Level 7

Thank you Novinosrin.

 

My issue  main issue is to override dates (rows) where the difference between start and end_dates is less than five days. 

I will explain a little better (I hope I do!). Let’s assume I have data below regarding patient treatment with variables patient, start_date, end_date and antibiotic.

 

Patient    start_date     end_date                          antibiotic

101         26DEC1994          31JAN1995                          Ceftriaxone

101         25FEB1997           06OCT1997                          ciprofloxacin

101         07OCT1997          28JAN1998                          doxycycline

101         28JAN1998          09DEC1998                          amoxacillin

101         09DEC1998          14DEC1999                          cefixime

101         15DEC1999          14AUG2000                        azithromycin

101         23APR2001          25APR2001                          cefamor

101         25APR2001          31JUL2002                           cefaclor

101         31JUL2002           01AUG2002                        zedex

101         01AUG2002        13JAN2003                          bromoxin

101         14JAN2003          14MAY2005                        cedrol

101         14MAY2005        30MAY2005                        kabryl

101         27NOV2007        02NOV2010                        benedryl

 

What this data shows is that patient 101 too ceftriaxone for 37 days then ciprofloxacin from feb-oct1997 . But took cefamor for only 2 days. In that case I would like the data to look like the patient took cefaclor from 23APR2001, there by eliminating Cefamor from the dataset. In the same way the end_date for kabryl should become 02NOV2010, thereby eliminating the row here treatment-benedryl.

 

This procedure is aimed at eliminating rows where (end_date-startdate)<5 days but restore continuity in dates.

 

Patrick
Opal | Level 21

@DathanMD 

Is a patient only ever going to use one antibiotic at a time? If not then how to treat overlapping dates or a case where patient 101 start taking a 2nd antibiotic on 25APR2001 (or overlapping on 24APR2001? or uses another med from 01APR2001 to 31APR2001?, or if there is a gap and the new med only starts at 27APR2001? ...or...).

DathanMD
Obsidian | Level 7

hI patrick. I would like a patient taking one antibiotic at a time (although I know in reeality patients may take more than one). But, in this case, i would like to retain only the antibiotiv with the longest durationa nd its dates whould overlap the antibiotics whose duration<5 days.

 

Dathan 

Patrick
Opal | Level 21

@DathanMD wrote:

hI patrick. I would like a patient taking one antibiotic at a time (although I know in reeality patients may take more than one). But, in this case, i would like to retain only the antibiotiv with the longest durationa nd its dates whould overlap the antibiotics whose duration<5 days.

 

Dathan 


The question is: Does your actual data support the simple logic you describe or can there be multiple medications with overlapping date ranges or gaps? If your actual data is not as simple as the sample data you've posted then you need to define the transformation logic which is suitable for your actual data - and you should post sample data which is representative for your actual data.

 

Sooo.... what is it?

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 834 views
  • 1 like
  • 3 in conversation