BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
reemm225
Calcite | Level 5

I have a dataset where I have multiple rows per id, covariates for each id, year of prescription (yearrx), and study start (ss) and study end (se). I want to calculate the total person time donated be each member. What I eventually want to have is 2 rows per id, so that in the first row, I will sum the person time from the study start (ss) until the end of that year, and a second row that will sum the person time one day after the end of the study start year, until the end of the study. 

 

For example patient 2, entered the study In may,2,2001, had 2 prescription types in 2019, and left the study at June,30,2019. I want to create for this patient 2 rows, the first row with the time between SS and end of ss(12,31,year(SS)), and a second row calculating the time between (01,01,year(SS)+1), and the end of the study (SE).  I previously tried to calculate the person time in each row but it didn't work, any thoughts? This codes creates columns, but I want two rows per patient at the end. 

data x; set try2;
d1= intck('day', ss, mdy(12,31, year(ss)));
d2= intck('day', mdy(1,1,(year(ss)+1)), se);
run;

 

idcov1cov2RxYeartypeNRxSSSE
1xz20122406-Aug-1230-Jun-19
1xz20132806-Aug-1230-Jun-19
1xz20142806-Aug-1230-Jun-19
1xz20152606-Aug-1230-Jun-19
1xz20162506-Aug-1230-Jun-19
2yv20191102-May-0130-Jun-19
2yv20192102-May-0130-Jun-19
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

I don't understand why you have the other variable and the multiple observations.

So let's ignore them.

 

data have;
  input id cov1 $ cov2 $ RxYear type NRx SS :date. SE :date. ;
  format ss se date9.;
cards;
1 x z 2012 2 4 06-Aug-12 30-Jun-19
1 x z 2013 2 8 06-Aug-12 30-Jun-19
1 x z 2014 2 8 06-Aug-12 30-Jun-19
1 x z 2015 2 6 06-Aug-12 30-Jun-19
1 x z 2016 2 5 06-Aug-12 30-Jun-19
2 y v 2019 1 1 02-May-01 30-Jun-19
2 y v 2019 2 1 02-May-01 30-Jun-19
;

proc sort data=have(keep=id ss se)  out=subjects nodupkey;
  by id ;
run;

data want;
  set subjects;
  if year(se)=year(ss) then do;
    year1=1;
    rest=1;
    duration=se-ss;
    output;
  end;
  else do;
    year1=1;
    rest=0;
    duration = mdy(1,1,year(ss)+1)-ss;
    output;
    year1=0;
    rest=1;
    duration = se - mdy(1,1,year(ss)+1);
    output;
  end;
run;

proc print;
run;

Results

Obs    id           SS           SE    year1    rest    duration

 1      1    06AUG2012    30JUN2019      1        0        148
 2      1    06AUG2012    30JUN2019      0        1       2371
 3      2    02MAY2001    30JUN2019      1        0        244
 4      2    02MAY2001    30JUN2019      0        1       6389

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26
data x; 
    set try2;
    d1= intck('day', ss, mdy(12,31, year(ss)));
    output;
    d1= intck('day', mdy(1,1,(year(ss)+1)), se);
    output;
run;
--
Paige Miller
reemm225
Calcite | Level 5
Thank you. This code now multiplies each id data, is there a way to keep just the number of rows per id per year?
PaigeMiller
Diamond | Level 26

SHow us the desired output.

--
Paige Miller
Tom
Super User Tom
Super User

Which is it?

 

Screenshot 2022-02-07 114226.jpg

If you do not want to change the number of observations use something like your original code where you have one variable for the first year and another for the "rest".  Otherwise if you want to output separate observations for the first year and the rest then by definition you are going to have multiple observations.

reemm225
Calcite | Level 5
That makes sense. Thanks!
Tom
Super User Tom
Super User

I don't understand why you have the other variable and the multiple observations.

So let's ignore them.

 

data have;
  input id cov1 $ cov2 $ RxYear type NRx SS :date. SE :date. ;
  format ss se date9.;
cards;
1 x z 2012 2 4 06-Aug-12 30-Jun-19
1 x z 2013 2 8 06-Aug-12 30-Jun-19
1 x z 2014 2 8 06-Aug-12 30-Jun-19
1 x z 2015 2 6 06-Aug-12 30-Jun-19
1 x z 2016 2 5 06-Aug-12 30-Jun-19
2 y v 2019 1 1 02-May-01 30-Jun-19
2 y v 2019 2 1 02-May-01 30-Jun-19
;

proc sort data=have(keep=id ss se)  out=subjects nodupkey;
  by id ;
run;

data want;
  set subjects;
  if year(se)=year(ss) then do;
    year1=1;
    rest=1;
    duration=se-ss;
    output;
  end;
  else do;
    year1=1;
    rest=0;
    duration = mdy(1,1,year(ss)+1)-ss;
    output;
    year1=0;
    rest=1;
    duration = se - mdy(1,1,year(ss)+1);
    output;
  end;
run;

proc print;
run;

Results

Obs    id           SS           SE    year1    rest    duration

 1      1    06AUG2012    30JUN2019      1        0        148
 2      1    06AUG2012    30JUN2019      0        1       2371
 3      2    02MAY2001    30JUN2019      1        0        244
 4      2    02MAY2001    30JUN2019      0        1       6389
Tom
Super User Tom
Super User

Perhaps you meant to generate a separate observation per year?

data want;
  set subjects;
  do year=year(ss) to year(se);
    duration = min(se,mdy(12,31,year)) - max(ss,mdy(1,1,year)) + 1;
    output;
  end;
run;

Results

Obs    id           SS           SE    year    duration

  1     1    06AUG2012    30JUN2019    2012       148
  2     1    06AUG2012    30JUN2019    2013       365
  3     1    06AUG2012    30JUN2019    2014       365
  4     1    06AUG2012    30JUN2019    2015       365
  5     1    06AUG2012    30JUN2019    2016       366
  6     1    06AUG2012    30JUN2019    2017       365
  7     1    06AUG2012    30JUN2019    2018       365
  8     1    06AUG2012    30JUN2019    2019       181
  9     2    02MAY2001    30JUN2019    2001       244
 10     2    02MAY2001    30JUN2019    2002       365
 11     2    02MAY2001    30JUN2019    2003       365
 12     2    02MAY2001    30JUN2019    2004       366
 13     2    02MAY2001    30JUN2019    2005       365
 14     2    02MAY2001    30JUN2019    2006       365
 15     2    02MAY2001    30JUN2019    2007       365
 16     2    02MAY2001    30JUN2019    2008       366
 17     2    02MAY2001    30JUN2019    2009       365
 18     2    02MAY2001    30JUN2019    2010       365
 19     2    02MAY2001    30JUN2019    2011       365
 20     2    02MAY2001    30JUN2019    2012       366
 21     2    02MAY2001    30JUN2019    2013       365
 22     2    02MAY2001    30JUN2019    2014       365
 23     2    02MAY2001    30JUN2019    2015       365
 24     2    02MAY2001    30JUN2019    2016       366
 25     2    02MAY2001    30JUN2019    2017       365
 26     2    02MAY2001    30JUN2019    2018       365
 27     2    02MAY2001    30JUN2019    2019       181
reemm225
Calcite | Level 5
Yes that is what I wanted, but I though to just create 2 rows per patient. fist year and then lump the rest. Thank you!!
reemm225
Calcite | Level 5
This worked this is very helpful!!!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1271 views
  • 0 likes
  • 3 in conversation