About pdick2

pdick2 · ‎04-21-2024

1st set of codes worked! Thanks so much!

pdick2 · ‎04-21-2024

Let's adjust that. Here is something that might be easier to work with:

data have;
input Hour HR Temp SepsisLabel Patient_ID time_difference;
datalines;
0 88 36.11 0 34 -12
1 88 36.17 0 34 -11
2 88 . 0 34 -10
3 83.5 . 0 34 -9
4 80 . 0 34 -8
5 88 36.5 0 34 -7
6 91 . 0 34 -6
7 88 . 0 34 -5
8 80 . 0 34 -4
9 80 . 0 34 -3
10 80 . 0 34 -2
11 82 . 0 34 -1
12 77 . 0 34 0
0 88 36 0 40 -4
1 88 36 0 40 -3
2 88 36 0 40 -2
3 88 36 0 40 -1
4 88 36 0 40 0
;

What I want is to label each row with the hours relative to the last observation. For example, Patient_ID=40 has their last observed hour = 4. This will be time_difference = 0. Then the previous row will have time_difference = -1 (coming from 3-4 = -1). So ideally it would look something like this (for each unique Patient_ID):

time_difference = (hour of interest for that unique Patient_ID) - (last observed hour for that unique Patient_ID)

which gives a negative time difference on every row except the last one, where it is 0.

I think this will be easier if I do it in the format above, since I have another dataset in a similar format. Let me know if you need anything else, and thanks for helping! I'm slowly learning my way around SAS.
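One way to get this column, sketched here as an illustration rather than as the answer that was accepted in the thread, is a "double DOW loop": the first pass over each Patient_ID finds the last observed hour, and the second pass re-reads the same rows and subtracts it. The names want and last_hour are made up for this sketch, and it assumes the data are already sorted by Patient_ID and Hour.

```sas
/* Example input without the hand-typed time_difference column */
data have;
  input Hour HR Temp SepsisLabel Patient_ID;
  datalines;
0 88 36 0 40
1 88 36 0 40
2 88 36 0 40
3 88 36 0 40
4 88 36 0 40
;

/* Assumes have is sorted by Patient_ID (and Hour within patient) */
data want;
  /* Pass 1: read one patient's rows and track the largest Hour.      */
  /* last_hour is not retained, so it starts missing for each patient */
  do until (last.Patient_ID);
    set have;
    by Patient_ID;
    last_hour = max(last_hour, Hour);
  end;

  /* Pass 2: re-read the same rows and compute the difference */
  do until (last.Patient_ID);
    set have;
    by Patient_ID;
    time_difference = Hour - last_hour;  /* 0 on the last row, negative before it */
    output;
  end;

  drop last_hour;
run;
```

For Patient_ID=40 this should reproduce the -4 to 0 pattern shown in the example above. A PROC SQL query with max(Hour) grouped by Patient_ID would be another way to do the same thing.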

pdick2 · ‎04-21-2024

data have;
input Hour HR Temp SepsisLabel Patient_ID time_difference;
datalines;
0 88 36.11 0 34 -13
1 88 36.17 0 34 -12
2 88 . 0 34 -11
3 83.5 . 0 34 -10
4 80 . 0 34 -9
5 88 36.5 0 34 -8
6 91 . 0 34 -7
7 88 . 0 34 -6
8 80 . 0 34 -5
9 80 . 0 34 -4
10 80 . 0 34 -3
11 82 . 0 34 -2
12 77 . 0 34 -1
0 88 36 0 40 -5
1 88 36 0 40 -4
2 88 36 0 40 -3
3 88 36 0 40 -2
4 88 36 0 40 -1
;

Oops. I left "onset_time" in the input statement from a template for a previous dataset. This version of the code should make more sense. Apologies!

pdick2 · ‎04-21-2024

Hello! I have a dataset where I would like to get the time difference between each observation and the last observation, by Patient_ID. Here is an example of what I want:

data have;
input Hour HR Temp SepsisLabel Patient_ID onset_time time_difference;
datalines;
0 88 36.11 0 34 -13
1 88 36.17 0 34 -12
2 88 . 0 34 -11
3 83.5 . 0 34 -10
4 80 . 0 34 -9
5 88 36.5 0 34 -8
6 91 . 0 34 -7
7 88 . 0 34 -6
8 80 . 0 34 -5
9 80 . 0 34 -4
10 80 . 0 34 -3
11 82 . 0 34 -2
12 77 . 0 34 -1
0 88 36 0 40 -5
1 88 36 0 40 -4
2 88 36 0 40 -3
3 88 36 0 40 -2
4 88 36 0 40 -1
;

Please let me know what's the easiest way to get this! ChatGPT is struggling to understand how lag functions work, and so am I. Thanks!

pdick2 · ‎04-19-2024

After looking at my data some more, it turns out I might need to include individuals who are SepsisLabel=0 for their whole time in the hospital, as well as those who start at SepsisLabel=0 and then transition to SepsisLabel=1. How would you do that? (I.e., include Patient_ID=3206, who is only ever SepsisLabel=0, and 3205, who goes from SepsisLabel=0 to 1, in my dataset.) This will exclude people who are only SepsisLabel=1. I'm trying to do a Cox model and need a good comparison group of those who do not develop sepsis. If you have any more advice, please let me know!
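A minimal sketch of one possible approach, assuming (as noted in an earlier post) that SepsisLabel never goes from 1 back to 0: a patient who is always 0 or who transitions from 0 to 1 has a minimum SepsisLabel of 0, while a patient who is only ever 1 has a minimum of 1. The rows below are made-up illustration data, Patient_ID=3207 is a hypothetical "only SepsisLabel=1" patient, and want is just a placeholder name.

```sas
/* Made-up illustration rows, not real challenge data */
data have;
  input Hour SepsisLabel Patient_ID;
  datalines;
0 0 3205
1 0 3205
2 1 3205
0 0 3206
1 0 3206
0 1 3207
1 1 3207
;

proc sql;
  create table want as
  select *
  from have
  group by Patient_ID
  /* keep every row of any patient with at least one SepsisLabel=0 row */
  having min(SepsisLabel) = 0
  order by Patient_ID, Hour;
quit;
```

PROC SQL writes a note to the log about remerging summary statistics with the original data here, which is expected for this pattern. With the rows above, 3205 and 3206 should be kept and 3207 dropped.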

pdick2 · ‎04-16-2024

This worked, thank you so much!

pdick2 · ‎04-16-2024

I'll try to be more clear. Here's what I would like to see, in a datalines format:

data have;
input Hour HR Temp SepsisLabel Patient_ID TimeDifference;
datalines;
0 88 36.11 0 34 -3
1 88 36.17 0 34 -2
2 88 . 0 34 -1
3 83.5 . 1 34 0
4 80 . 1 34 1
5 88 36.5 1 34 2
6 91 . 1 34 3
7 88 . 1 34 4
8 80 . 1 34 5
9 80 . 1 34 6
10 80 . 1 34 7
11 82 . 1 34 8
12 77 . 1 34 9
;

And I didn't think about putting them in one data set. I see what you mean: it'll be easier to use one "calculated values" data set (with the vars I want) instead of 6 data sets. Do you recommend making one new data set with all of the calculated values, or adding them to the original "have" data set? This isn't the source data set, just a copy of it (so I don't overwrite the original data).

What I would like to do is, for example, take the mean of a column variable (let's use HR) for each unique Patient_ID between the hours of -4 and 0. This will give me the mean value of HR over the 4 hours before SepsisLabel changes from 0 to 1. I would like to look at other variables besides HR, and other time intervals as well (such as -6, -12, etc. before SepsisLabel changes from 0 to 1).

The code you gave didn't work for me. Maybe the specifics above will help. Thank you so much for helping me! First-time SAS user here. Also, I can send the data set that only has the patients whose SepsisLabel changed from 0 to 1, if you want to experiment. Here is a screenshot of what I got from the code you gave:
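The windowed summary described above could be sketched roughly as follows, assuming TimeDifference has already been computed as in the example. This is only an illustration: window_stats is a made-up data set name, and the window bounds would change for the -6 and -12 hour versions. PROC MEANS puts all of the requested statistics in one output data set with one row per Patient_ID.

```sas
/* A few rows from the example above */
data have;
  input Hour HR Temp SepsisLabel Patient_ID TimeDifference;
  datalines;
0 88 36.11 0 34 -3
1 88 36.17 0 34 -2
2 88 . 0 34 -1
3 83.5 . 1 34 0
4 80 . 1 34 1
;

proc means data=have noprint nway;
  where -4 <= TimeDifference <= 0;   /* rows from 4 hours before onset through onset */
  class Patient_ID;                  /* one output row per patient */
  var HR Temp;
  /* autoname builds names such as HR_Mean, HR_Median, Temp_Q1, ... */
  output out=window_stats mean= median= q1= q3= min= max= / autoname;
run;
```

For Patient_ID=34 the window covers hours 0 to 3, so HR_Mean should come out as (88 + 88 + 88 + 83.5) / 4 = 86.875. Missing Temp values are skipped when the statistics are computed.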

pdick2 · ‎04-16-2024

Hello again! I have a long dataset from the "Early Prediction of Sepsis from Clinical Data: the PhysioNet/Computing in Cardiology Challenge 2019". This has over 1.5 million rows of hourly data, with over 40,000 unique Patient_IDs. There are many variables (such as Hour, HR, Resp, O2Sat, various lab values, etc.) but one outcome variable (SepsisLabel: 0 for no sepsis, 1 for sepsis). I now have the dataset where SepsisLabel changes from 0 to 1, excluding those who are only SepsisLabel=0 and those who are only SepsisLabel=1.

Now I want to do some other things to the data. I want to compute, for each row, the time difference relative to the hour when SepsisLabel changes from 0 to 1, including the rows before onset. I've been using ChatGPT to help, but it doesn't seem to understand what I want. I kinda got the code it gave me to work for the time difference AFTER SepsisLabel changes from 0 to 1, but not before. Before, it gives me missing values. Code below, and an example dataset for Patient_ID=34.

data have;
input Hour HR Temp SepsisLabel Patient_ID onset_time TimeDifference;
datalines;
0 88 36.11 0 34 . .
1 88 36.17 0 34 . .
2 88 . 0 34 . .
3 83.5 . 1 34 3 0
4 80 . 1 34 3 1
5 88 36.5 1 34 3 2
6 91 . 1 34 3 3
7 88 . 1 34 3 4
8 80 . 1 34 3 5
9 80 . 1 34 3 6
10 80 . 1 34 3 7
11 82 . 1 34 3 8
12 77 . 1 34 3 9
;

data biosp.sepsis_0_to_1_time_diff;
set biosp.sepsis_0_to_1;
by Patient_ID;
retain onset_time;
if first.Patient_ID then onset_time =.; *Initialize the onset time for each unique patient_ID;
if sepsislabel=1 and onset_time=. then onset_time=Hour; *Records when sepsislabel changes to 1;
if not missing(onset_time) then TimeDifference = Hour - onset_time; /* Calculate time difference */
else TimeDifference=.;
run;

Please help me with this.

Something else I want to do is get the mean, median, mode, q1, q3, min, and max for certain variables (such as HR, Resp, Temp, etc.) at certain time intervals before SepsisLabel changes from 0 to 1. I was going to look at t=-4 hours, t=-6 hours, and t=-12 hours. This will collapse the long data into one row per Patient_ID (instead of 11 rows, or 100 rows in some instances, for one patient), and each row will have the mean, median, etc. for the variables of interest (HR, Temp, etc.). This will create many datasets (one for mean values, one for median, one for q1, etc.), but I think this will be more usable in a logistic analysis. Can anyone help me with this please? Thank you!!!
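The missing values before onset come from the single top-to-bottom pass: on the rows where SepsisLabel is still 0, the data step has not yet reached the onset row, so onset_time is still missing. One way around this, sketched here under the assumption that onset is the first hour with SepsisLabel=1, is to find each patient's onset hour first and then join it back onto every row. The names want, a, and b are placeholders for this sketch.

```sas
/* Example rows without the precomputed columns */
data have;
  input Hour HR Temp SepsisLabel Patient_ID;
  datalines;
0 88 36.11 0 34
1 88 36.17 0 34
2 88 . 0 34
3 83.5 . 1 34
4 80 . 1 34
5 88 36.5 1 34
;

proc sql;
  create table want as
  select a.*,
         b.onset_time,
         a.Hour - b.onset_time as TimeDifference  /* negative before onset */
  from have as a
       inner join
       /* first hour with SepsisLabel=1 for each patient */
       (select Patient_ID, min(Hour) as onset_time
        from have
        where SepsisLabel = 1
        group by Patient_ID) as b
       on a.Patient_ID = b.Patient_ID
  order by a.Patient_ID, a.Hour;
quit;
```

For Patient_ID=34 this should give onset_time = 3 on every row and TimeDifference running from -3 at hour 0 up to 2 at hour 5. Because of the inner join, patients with no SepsisLabel=1 row would be dropped, which matches a dataset that only contains patients who change from 0 to 1.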

pdick2 · ‎04-15-2024

This worked! It took some tinkering (I had to sort my data by Patient_ID and Hour first), but the result matches the original dataset while keeping only those who change from 0 to 1. There aren't any instances of it going from 1 to 0, but thank you for covering that case! The one SASJedi gave was a little memory intensive and I'm not sure why (I'm new to SAS).

pdick2 · ‎04-14-2024

Hello! I have a long dataset from the "Early Prediction of Sepsis from Clinical Data: the PhysioNet/Computing in Cardiology Challenge 2019". This has over 1.5 million rows of hourly data, with over 40,000 unique Patient_IDs. There are many variables (such as Hour, HR, Resp, O2Sat, various lab values, etc.) but one outcome variable (SepsisLabel: 0 for no sepsis, 1 for sepsis). I'm trying to keep the hourly rows for patients whose SepsisLabel changes from 0 to 1, while excluding those who are only SepsisLabel=0 and those who are only SepsisLabel=1. I've included a truncated example of one unique Patient_ID below, with only Hour, HR, Temp, Resp, SepsisLabel and Patient_ID:

Hour HR Temp Resp SepsisLabel Patient_ID
0 . . . 0 3205
1 76 . 20 0 3205
2 78 . 20 0 3205
3 81 . 16 0 3205
4 79 37.89 12 0 3205
5 81 . 15.25 0 3205
6 78 . 12 0 3205
7 75 . 13.5 1 3205
8 76 37.5 18.5 1 3205
9 84 . 14.5 1 3205
10 82 . 33 1 3205
11 95 . 28 1 3205
12 99 37.67 26 1 3205
13 96 . 21 1 3205
14 92 . 22 1 3205
15 85 . 26 1 3205
16 92 37.33 22 1 3205

Please help me with this!
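As a rough sketch of one way to express this filter (not necessarily the solution that was accepted in the thread): a patient whose SepsisLabel changes from 0 to 1 has both a 0 and a 1 somewhere in their rows, so their minimum is 0 and their maximum is 1. The rows below are made-up illustration data, with hypothetical Patient_IDs 3206 (only 0) and 3207 (only 1), and want is a placeholder name.

```sas
/* Made-up illustration rows, not real challenge data */
data have;
  input Hour SepsisLabel Patient_ID;
  datalines;
0 0 3205
1 0 3205
2 1 3205
3 1 3205
0 0 3206
1 0 3206
0 1 3207
1 1 3207
;

proc sql;
  create table want as
  select *
  from have
  group by Patient_ID
  /* keep patients who have both a 0 row and a 1 row */
  having min(SepsisLabel) = 0 and max(SepsisLabel) = 1
  order by Patient_ID, Hour;
quit;
```

With these rows only Patient_ID=3205 should remain. Note that this test would also keep a patient who goes from 1 back to 0, so it relies on that not happening in the data.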

Online Status	Offline
Date Last Visited	‎04-24-2024 01:20 PM

Re: Time since last observation

Re: Time since last observation

Re: Time since last observation

Time since last observation

Re: Keep rows of data when one var changes from 0 to 1, but excluding ...

Re: Taking time difference from before when one variable changes from ...

Re: Taking time difference from before when one variable changes from ...

Taking time difference from before when one variable changes from 0 to...

Re: Keep rows of data when one var changes from 0 to 1, but excluding ...

Keep rows of data when one var changes from 0 to 1, but excluding when...
