BookmarkSubscribeRSS Feed
cyoung
Fluorite | Level 6

Hi everyone!

 

Conceptually, I've run into an issue with my code and I'm wondering if anyone's done something similar/has an idea how to do this task?  

 

Essentially I have a variable for the first non-fatal event in a month (i.e. Event1_date1) followed by two variables for a second non-fatal event within the month (i.e. Event1_date2) or a fatal event after the first (i.e. FEvent_date1).  My data is organized by person-time for each month of the study.  A sample of the data is below:

 

data test1;
  infile cards dlm=',';
  input ID MonthYear  Event1_date1  Event1_date2  FEvent_date1;
  cards;
1, 1, 3000, , 
1, 2, , ,
1, 3, 3060, , 3100
1, 4, , , 
2, 8, 2090, 3040, 
2, 9, ,
2, 10, , , 3190 3, 1, 2935, 2945, 2965 3, 2, , , ;

I am trying to figure out a way to calculate the time to an event (either non-fatal or fatal) after the initial non-fatal event.  Originally I was thinking of running a datastep that subtracted the first in either Event1_date2 or FEvent_date1, but then realized this would ignore if there's an earlier nonfatal event in the Event1_date1 column (see case 1).

 

Does anyone have an idea how to find the second largest number across 3 variables, then both subtract this from the initial value and count the months between this event and the initial event?

 

This is what I've been working with so far, with little success.

 

data test2;
do until (last.id); 
set test1
2nd_largest=whichn(largest(2,of col(Event1_date1, Event1_date2, FEvent_date1)),of col(Event1_date1, Event1_date2, FEvent_date2)); event2=vname(col(2nd_largest));
end; run;

data test3;
do until (last.id);
set test2;
time2event2=.;
IF first.Event1_date THEN time2event2= Event1_date1 - event2;

time2event2_mth=.;
IF first.id THEN counter=0;
IF counter=0 THEN do;
counter=1;
ELSE if Event1_date1<event2 THEN DO;
counter+1;
END;
end;

 

8 REPLIES 8
RW9
Diamond | Level 26 RW9
Diamond | Level 26

You can simply put them into an array then call sortn() to sort them, hence the second element would be the seoncond smallest if existing:

data test1;
  infile cards dlm=',';
  input ID MonthYear  Event1_date1  Event1_date2  FEvent_date1;
  cards;
1, 1, 3000, , 
1, 2, , ,
1, 3, 3060, , 3100
1, 4, , , 
2, 8, 2090, 3040, 
2, 9, ,
2, 10, , , 3190
3, 1, 2935, 2945, 2965
3, 2, , ,
;
run;
data want;
  set test1;
  array vals{*} event1_date1 event1_date2 fevent_date1;
  call sortn(of vals{*});
  second_large=ifn(vals{2}=.,vals{3},vals{2});
run;

However, that being said, you will likely find your task a lot easier if you normalise your data (have the dates going down the page, rather than across).

cyoung
Fluorite | Level 6

Unfortunately, I can't normalize that way since we're trying to keep the observations in Person-Month (also I can't combine my variables since some represent different types of events).

 

Using this code, would I follow by using the "second_large" variable for the data step to identify time since first event and months since first event? (I'm still stuck on how to get the months since first event to work.)

Reeza
Super User

For number of months subtract and divide by 365.25/12. 

 

Nmonths = ( Date1- date2) /(365.25/12);

 

'month' is not a standard measure of time as hour or days, so this at least standardizes the definition. 

BrunoMueller
SAS Super FREQ

hi

 

I might not have fully understand the task, but there are some handy functions that will help you with this.

 

The example below uses the LARGEST function to the return the largest as well as the second largest values.

Using the WHICHN function will return the position of the value within the array i this matter in any way.

 

data test1;
  infile cards dlm=',';
  input ID MonthYear  Event1_date1  Event1_date2  FEvent_date1;

  array xevents{*} Event1_date1  Event1_date2  FEvent_date1;

  nEvents = n(of xevents{*});
  largestEvent = largest(1, of xevents{*});
  largestEvent2 = largest(2, of xevents{*});
  iLargestEvent = WHICHN(largestEvent, OF xevents{*});
  iLargestEvent2 = WHICHN(largestEvent2, OF xevents{*});

  format event: fevent: largest: date9.;

  cards;
1, 1, 3000, , 
1, 2, , ,
1, 3, 3060, , 3100
1, 4, , , 
2, 8, 2090, 3040, 
2, 9, ,
2, 10, , , 3190
3, 1, 2935, 2945, 2965
3, 2, , ,
;

 

ballardw
Super User

You should provide an example of the desired output given the exampe input data set. I'm not sure whether you are actually looking across observations or note from the words you are using.

cyoung
Fluorite | Level 6
data want;
  infile cards dlm=',';
  input ID MonthYear  Event1_date1  Event1_date2  FEvent_date1 time2event2 time2event2_mth;
  cards;
1, 1, 3000, , , 60, 2
1, 2, , , , 60, 2
1, 3, 3060, , 3100, 60, 2
1, 4, , , , 60, 2
2, 8, 2090, 3040, , 50, 0
2, 9, , , 50, 0
2, 10, , , 3190, 50, 0 3, 1, 2935, 2945, 2965, 10, 0 3, 2, , , , 10, 0 ;

 

so I would be hoping for something like this- where the variables for time2event2 and time2event2_mth are added.  Time2event2 would calculate the difference between the first date value and the second largest date value for that id and the time2event2_mth would calculate the change in monthyear_r2 variable between the two dates.

 

sorry for the confusion- does that make sense? / is this possible?

cyoung
Fluorite | Level 6

Correction: (changed a couple values in the original dummy dataset)

data want;
  infile cards dlm=',';
  input ID MonthYear  Event1_date1  Event1_date2  FEvent_date1 time2event2 time2event2_mth;
  cards;
1, 1, 3100, , , 60, 2
1, 2, , , , 60, 2
1, 3, 3040, , 3000, 60, 2
1, 4, , , , 60, 2
2, 8, 3040, 3010, , 30, 0
2, 9, , , 50, 0
2, 10, , , 2090, 50, 0 3, 1, 2965, 2955, 2935, 10, 0 3, 2, , , , 10, 0 ;
ChrisNZ
Tourmaline | Level 20

Your needs are still very confusing and it looks like  your data is wrong.

If I understood, just merge this back:

 

data HAVE;
  infile cards dlm=',';
  input ID MonthYear  Event1_date1  Event1_date2  FEvent_date1;
  cards;
1, 1, 3000,     , 
1, 2,     ,     ,
1, 3, 3060,     , 3100
1, 4,     ,     , 
2, 8, 3010, 3040, 
2, 9,     ,     ,
2, 10,    ,     , 3190
3, 1, 2935, 2945, 2965
3, 2,     ,     ,
run;

* I consider you only ever get 3 events;
data WANT;
  set HAVE;
  by ID;
  retain FIRST_DTE FIRST_MTH;
  if first.ID then do;
    call missing(time2event2,time2event2_mth);
    FIRST_DTE=Event1_date1;
    FIRST_MTH=MonthYear;
  end;
  if Event1_date2 
   | (Event1_date1 & Event1_date1 ne FIRST_DTE) then do;
    time2event2=coalesce(Event1_date2,Event1_date1)-FIRST_DTE;
    time2event2_mth=MonthYear-FIRST_MTH;
    output;
  end;
  keep time2event2 time2event2_MTH;
run;

proc print noobs; run;



time2event2 time2event2_mth
60 2
30 0
10 0

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2542 views
  • 0 likes
  • 6 in conversation