About jspoend

jspoend · ‎10-17-2023

Thank you so much, this worked well. I totally agree that understanding dates in SAS is worthwhile. I

jspoend · ‎10-17-2023

Hi there, I have a dataset with 1 line per ID with one relevant date per ID. I have transformed the dates into numeric format using 'date_as_num = putn(date,'yymmddn8.');'. The date range is 1 January 2018 (i.e. 20180101) to 31 December 2020 (20201231). I now need to convert these dates into consecutive numbers starting at 1 with 1 January 2018 being 1, 2 January being 2 and so forth. I'm using SAS version 9.4. I don't know how to do this. Can anyone help? Thank you very much! Julia
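One way to approach the question above is date arithmetic: a SAS date is already a count of days, so subtracting the start date and adding 1 gives the consecutive number. The sketch below is only an illustration; the dataset names `have` and `want` and the variables `day_num` and `day_num_from_num` are assumptions, and the second assignment assumes `date_as_num` is a numeric yyyymmdd value (if it is character, the inner `put` can be dropped).

```sas
/* Hypothetical sample data: DATE is a SAS date, DATE_AS_NUM the yyyymmdd number */
data have;
    input date :date9.;
    format date date9.;
    date_as_num = input(put(date, yymmddn8.), 8.);  /* e.g. 20180101 */
    datalines;
01JAN2018
02JAN2018
31DEC2020
;

data want;
    set have;
    /* Case 1: start from the real SAS date */
    day_num = date - '01JAN2018'd + 1;
    /* Case 2: only the yyyymmdd number is left, so read it back as a date first */
    day_num_from_num = input(put(date_as_num, 8.), yymmdd8.) - '01JAN2018'd + 1;
run;
```

With this numbering, 1 January 2018 becomes 1, 2 January 2018 becomes 2, and 31 December 2020 becomes 1096.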

jspoend · ‎05-02-2022

Thanks Garnet, for the later added line, that did in fact solve a problem I found in my data!

jspoend · ‎04-19-2022

Too simple, thank you very much!

jspoend · ‎04-19-2022

Hi there,

I have 2 datafiles.

Claims_file: all claims during pregnancy of women insured with a certain insurer

    pregID  date_claim  ATC
    1       mmddyy      xy
    1       mmddyy      xy
    1       mmddyy      xy
    2       mmddyy      xy
    2       mmddyy      xy
    3       mmddyy      xy
    3       mmddyy      xy
    3       mmddyy      xy
    4       mmddyy      xy
    4       mmddyy      xy

abortions_file: IDs of pregnancies which had an abortion shortly before the start of pregnancy

    PregID  date_abort
    1       mmddyy
    3       mmddyy

What I need to do is exclude from the claims_file all pregnancies (so all claims from any pregnancy) that are listed in the abortions_file. I'm not sure how to go about this. I usually run this code to ONLY include patients defined in another file, but I'm not sure how to change it in order to EXCLUDE patients defined in another file.

    data claims_no_abort;
        set claims_file;
        if _n_ = 1 then do;
            declare hash a (dataset:"abortions_file");
            a.definekey("pregID");
            a.definedone();
        end;
        if a.check() = 0;
    run;

Many thanks in advance,
Julia
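A minimal sketch of the exclusion, assuming the same hash approach as in the post: `check()` returns 0 when the key is found, so keeping rows where it is not 0 drops every claim of a listed pregnancy. The sample rows below are made up for illustration and omit the date columns.

```sas
/* Hypothetical sample data, reduced to the columns needed here */
data claims_file;
    input pregID ATC $;
    datalines;
1 xy
1 xy
2 xy
3 xy
4 xy
;

data abortions_file;
    input pregID;
    datalines;
1
3
;

data claims_no_abort;
    set claims_file;
    if _n_ = 1 then do;
        declare hash a (dataset: "abortions_file");
        a.definekey("pregID");
        a.definedone();
    end;
    /* check() is non-zero when pregID is NOT in abortions_file: keep those rows */
    if a.check() ne 0;
run;
```

In this example only the claims of pregnancies 2 and 4 remain in `claims_no_abort`.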

jspoend · ‎04-13-2022

Thanks a lot, I'll implement this tomorrow to check if it works in my dataset! Always great to learn something new!

jspoend · ‎04-13-2022

Hi, Thanks a lot for this, I just implemented the code and it seems to do what I want it to do. Many thanks! That made my life a lot easier:)

jspoend · ‎04-13-2022

Hi Kurt, Thanks a lot for the fast reply. I'm struggling to understand: how would I combine the two if I only create the sequence number in the pregnancy file? Then I could still not identify which claims from the claims file belong to this pregnancy, or do I misunderstand?

jspoend · ‎04-13-2022

Hi there,

I have two datasets:
1) all health care claims (drugs) of women who delivered during the study period (2015-2021)
2) pregnancy file

The two files are structured as follows:

Claims file:

    ID  date_claim  claim_code
    1   1JAN2015    ATC1
    1   20FEB2015   ATC2
    1   15JUL2016   ATC3
    1   20SEP2017   ATC2
    2   3JAN2017    ATC7
    2   5FEB2018    ATC1
    2   8MAR2019    ATC11
    2   15AUG2020   ATC12
    2   20DEC2021   ATC11

Pregnancy file:

    ID  pregnancy_start  pregnancy_end
    1   29JAN2017        28OCT2017
    2   30JAN2018        15OCT2018
    2   25JUN2020        15MAR2021

My goal is to
1) combine the two datasets,
2) exclude all claims that were not billed during a pregnancy, and
3) create a new ID per pregnancy (because one patient may have contributed several pregnancies, so the analytic unit will be pregnancy, not ID).

I'm not sure how to go about this - I'm mainly struggling with step 2. My first attempt was to combine the two datasets with a set statement and then to impute pregnancy_start and pregnancy_end for all lines in the dataset using the retain function by ID. However, given that one patient can contribute several pregnancies, that does not work. I'm a bit stuck right now, so any input is appreciated.

Thanks a lot in advance!
Julia
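The three steps in the post could be sketched as follows. This is one possible approach, not necessarily the one suggested in the thread: number the pregnancies first, then join the claims to them on ID and on the claim date falling inside the pregnancy interval. The names `pregnancies_id`, `preg_id` and `claims_in_preg` are assumptions introduced for the example.

```sas
/* Sample data taken from the post */
data claims;
    input ID date_claim :date9. claim_code $;
    format date_claim date9.;
    datalines;
1 01JAN2015 ATC1
1 20FEB2015 ATC2
1 15JUL2016 ATC3
1 20SEP2017 ATC2
2 03JAN2017 ATC7
2 05FEB2018 ATC1
2 08MAR2019 ATC11
2 15AUG2020 ATC12
2 20DEC2021 ATC11
;

data pregnancies;
    input ID pregnancy_start :date9. pregnancy_end :date9.;
    format pregnancy_start pregnancy_end date9.;
    datalines;
1 29JAN2017 28OCT2017
2 30JAN2018 15OCT2018
2 25JUN2020 15MAR2021
;

/* Step 3: one new ID per pregnancy (one row per pregnancy in this file) */
data pregnancies_id;
    set pregnancies;
    preg_id = _n_;
run;

/* Steps 1 and 2: combine and keep only claims billed during a pregnancy */
proc sql;
    create table claims_in_preg as
    select p.preg_id, c.ID, c.date_claim, c.claim_code,
           p.pregnancy_start, p.pregnancy_end
    from claims as c
         inner join pregnancies_id as p
         on c.ID = p.ID
            and c.date_claim between p.pregnancy_start and p.pregnancy_end
    order by p.preg_id, c.date_claim;
quit;
```

For the sample data this keeps three claims: 20SEP2017 (pregnancy 1), 05FEB2018 (pregnancy 2) and 15AUG2020 (pregnancy 3). Because the sequence number is carried along by the join, each retained claim is tied to exactly one pregnancy.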

jspoend · ‎04-04-2022

Hi StatDave, Thanks a lot for your answer. I have thought about that too. However, I've read that using linear models to quantify absolute risk differences is an option, e.g. here: https://www.statalist.org/forums/forum/general-stata-discussion/general/1408193-binary-dependent-variable-in-difference-in-difference-method But I am definitely no expert. The problem is that the output is harder to interpret if I use logistic regression, as I want to quantify simple absolute risk differences. As far as I understand, I can't get those with proc logistic? Thanks a lot.

jspoend · ‎04-01-2022

Hi there,

I'm conducting a difference-in-difference analysis for the first time. My aim is to compare the proportion of preterm deliveries (in a dataset of deliveries, 1 line per delivery) before and after a policy change. The control group consists of deliveries during the same time period in a year during which no policy change happened. I do not expect any confounders, so propensity scores or adjustment are not planned. I want to fit a linear probability model to quantify the difference of the difference in the probability of preterm delivery between the two years, with a robust 95% CI (because one mum could potentially contribute several deliveries to the dataset).

My dataset is structured as follows:
Exposed=1 if delivery in year of policy change, 0 if delivery in year without policy change
Post=1 if delivery after date of policy change, 0 if delivery before date of policy change

    MumID  Preterm  Exposed  Post
    1      0        1        0
    2      1        1        0
    3      1        0        1
    4      0        0        1
    5      0        1        1

I've found the following code, which seems to run. However, given that I have not done this analysis before, I'm unsure if I've implemented it correctly.

    proc surveyreg data=dataset;
        cluster mumid; *I assume this calculates robust 95% CI by accounting for same Mumid;
        class post exposed;
        model preterm = post exposed post*exposed / CLPARM solution vadjust=none;
        estimate "Diff in Diff" post*exposed 1 -1 -1 1;
        lsmeans post*exposed;
    run;

Does 'cluster mumid' indicate to calculate robust 95% CI?

I got the following preliminary output (sorry, in German) and I'm wondering if I'm interpreting this right. I interpret it such that the unexposed group (year without intervention) had 6.5% preterms prior to the policy change. The exposed group (year of policy change) had 0.1% more preterms prior to the policy change. I'm not sure how to interpret post 0 = -0.0063864. Is this the average change between pre and post policy change? I interpreted the interaction term as my main result, i.e. that the difference of the difference in the proportion of preterm deliveries between the two years is 0.4/100 deliveries, which is not statistically significant (p=0.203). So the policy change did not significantly change the proportion of preterm deliveries.

Any insight into whether I'm reading this correctly or how to improve my code is appreciated, as I have not done this before.

Many thanks,
Julia
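For reference, the quantity the post is after can be written out explicitly. The notation is an assumption introduced here, not taken from the thread: $p_{e,t}$ is the proportion of preterm deliveries with Exposed $= e$ and Post $= t$.

```latex
% Difference-in-difference of the four cell proportions
\mathrm{DiD} = (p_{1,1} - p_{1,0}) - (p_{0,1} - p_{0,0})
             = p_{0,0} - p_{1,0} - p_{0,1} + p_{1,1}

% Linear probability model with 0/1 indicators (no CLASS statement)
\Pr(\text{Preterm}=1) = \beta_0 + \beta_1\,\text{Post} + \beta_2\,\text{Exposed}
                        + \beta_3\,(\text{Post}\times\text{Exposed}),
\qquad \beta_3 = \mathrm{DiD}
```

The second line shows why the ESTIMATE coefficients `1 -1 -1 1` on `post*exposed` target this quantity: applied to the four cells in the order (Post=0, Exposed=0), (0,1), (1,0), (1,1), the intercept and main effects cancel and only the difference of the differences is left.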

jspoend · ‎02-17-2022

Dear Peter, Both of your answers seem to work perfectly, thanks a lot! Julia

jspoend · ‎02-17-2022

Hi there,

I am working with a dataset of midwife billing codes. The dataset has one line per midwife visit, so several lines per patient. The datafile is structured as follows:

    PatID  VisitID  location  purpose
    1      1        hospital  postpartum
    1      2        home      postpartum
    1      3        home      postpartum
    2      1        home      postpartum
    2      2        home      postpartum
    3      1        hospital  postpartum
    4      1        home      postpartum
    4      2        home      postpartum
    4      3        home      postpartum
    4      4        home      postpartum

I need to identify and exclude those patients who exclusively had visits at location 'hospital' and none at home. So I need some code that checks whether they had any visit at home and flags them if not. If they had a first visit at hospital with subsequent visits at home, they stay in; i.e. patient 3 would have to be excluded. I'm not sure how best to go about this, so any input is appreciated.

Many thanks in advance,
Julia
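One possible way to express the rule in the post is a subquery that lists every patient with at least one home visit and keeps only those patients. This is a sketch under assumptions, not the answer given in the thread; the dataset names `visits` and `visits_keep` are made up for the example.

```sas
/* Sample data taken from the post */
data visits;
    input PatID VisitID location $ purpose :$10.;
    datalines;
1 1 hospital postpartum
1 2 home postpartum
1 3 home postpartum
2 1 home postpartum
2 2 home postpartum
3 1 hospital postpartum
4 1 home postpartum
4 2 home postpartum
4 3 home postpartum
4 4 home postpartum
;

/* Keep all visits of patients who had at least one visit at home */
proc sql;
    create table visits_keep as
    select *
    from visits
    where PatID in (select PatID from visits where location = 'home');
quit;
```

Patient 3, whose only visit was at the hospital, is dropped; patient 1 stays in because the hospital visit was followed by home visits.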

jspoend · ‎02-01-2022

Hi Peter, Thanks a lot for sending this, I've tried with merge before, but I was worried it would overwrite anything. But I guess it should be fine if there are no overlapping variables. Many thanks, Julia

jspoend · ‎02-01-2022

Dear Kurt, Thank you so much for the second time today, your command worked perfectly again, and I will know how to adjust it now as well. Have a great day, Julia

Online Status	Offline
Date Last Visited	‎11-08-2023 12:52 PM

Re: How do I conevert numeric date variable into consecutive numbers s...

How do I conevert numeric date variable into consecutive numbers start...

Re: Claims file with several lines per ID: identify claims during preg...

Re: Exclude patients from a file, based on patid defined in another fi...

Exclude patients from a file, based on patid defined in another file

Re: Claims file with several lines per ID: identify claims during preg...

Re: Claims file with several lines per ID: identify claims during preg...

Re: Claims file with several lines per ID: identify claims during preg...

Claims file with several lines per ID: identify claims during pregnanc...

Re: Difference-in-Difference analysis binary outcome

Re: How do I conevert numeric date variable into consecutive numbers s...

Re: Difference-in-Difference analysis binary outcome

Re: Identify patients across several lines based on lack of a specific...

Re: Identify patients across several lines based on lack of a specific...

Re: Identify patients across several lines based on lack of a specific...

Difference-in-Difference analysis binary outcome

Re: Identify patients across several lines based on lack of a specific...

Identify patients across several lines based on lack of a specific val...

Re: How to add variables from a different dataset by ID

Re: How to add variables from a different dataset by ID