BookmarkSubscribeRSS Feed
bsriv
Calcite | Level 5

Hi All:  Relatively new to SAS.  I'm having difficulties with a dataset, and I apologizes I must explain in generalities because I don't know where to begin.  I'll try to include some of the codes, etc I have but I apologize in advance for how vague this may seem... I'm lost!

 

So I have a large-ish dataset of hospital emergency department (ED) visits (approx 40,000 visits), and I am looking at opioid overdoses and suicide with a number of other predictors.   The way the data was delivered to me, the diagnoses I requested are listed as categories with "yes" "no" has level of variables.   

 

Later I am going to create regression models with just the sample of patients with a diagnosis of "Opioid poisoning" (n~2000) with a number of predictors for suicidality as an outcome category (binary outcome "yes" "no").  I would like one of those predictors to be previous presentation for a different psychiatric reason (meaning, the patient was previously in the ED NOT for an opioid overdose but FOR a psychiatric diagnosis- again these are categories with "yes" and "no" as levels).   

 

I do have number of admissions in total as a separate predictor (I grouped these based on number of times MRN was entered).  I also have dates in SAS format DDMONYR 00:00:00

 

I'm trying to categorizes individual patients (identified by MRN) who have overdosed at least once based on a category AND who have had PREVIOUS presentations for non opioid OD reasons).  The n of this total sample should be less than 2000, so not a high number, though I will be drawing from a much larger sample

 

Any tips on how/where to begin??   If I didn't explain things clearly or if there is any other info I can provide please let me know!!!  Unfortunately I cannot post any data because of HIPAA reasons 

 

 

20 REPLIES 20
HB
Barite | Level 11 HB
Barite | Level 11

Is each record an admission with a patient identifier and admission characteristics, or is each record a patient with some nonnormalized data structure?

 

Let us start someplace. Assume data where each record is an admit with a patient and a diagnosis.

 

data admits;
	input patientid admitdate:mmddyy10. opiod:$1. psych:$1. else:$1.;
datalines;
1 1/1/2017 n y n
1 2/1/2017 n n y
1 3/1/2017 y n n
2 1/1/2017 n n y
2 2/1/2017 n n y
2 3/1/2017 y n n
3 1/1/2017 n y n
3 2/1/2017 n n y
3 3/1/2017 n y n
4 4/1/2017 n y n
4 5/1/2017 n n y
4 6/1/2017 y n n
4 7/1/2017 y n n
4 8/1/2017 y n n
5 6/1/2017 y n n
5 7/1/2017 n n n
5 8/1/2017 n y n
;
run;

 

Then let's select anybody who is a y on opiod and also a y on pysch with a date before their y on opiod. In this data 1 and 4.  5 has a psych after and 2 doesn't have a psych.

 

proc sql;
	create table opiod_with_psych as
	select distinct a.patientid 
	from admits a inner join admits b on a.patientid = b.patientid
	where a.opiod = 'y' and b.psych = 'y' and b.admitdate < a.admitdate;
quit;

That gives us

 

The SAS System
 
patientid
1

4

 

Start with that and then let's modify.

bsriv
Calcite | Level 5

Thanks!

 

So I have a list of patient MRNs which fit that category.   The next step is to create a variable in my data set where I only have opioid overdoses (the n~2000) and these patients are assigned a "Y".   Can i merge this?  If so, how?

 

Thanks!

bsriv
Calcite | Level 5

I guess more sxplicitly- I have a list of MRNs that fit this critieria.  In my larger data set how do I create a variable to which I can assign a "y" or "n" to these MRNs based on whether or not they are on this new table?  Thanks!!!

HB
Barite | Level 11 HB
Barite | Level 11
I guess more sxplicitly- I have a list of MRNs that fit this critieria.  In my larger data set how do I create a variable to which I can assign a "y" or "n" to these MRNs based on whether or not they are on this new table?  Thanks!!!

 

 

I have no idea.  We haven't seen even any made-up data.  I have no idea what you want to do. You said you already had things in Y/N format. 

 

You'll have to provide more information and explanation for further help. 

bsriv
Calcite | Level 5

Thanks for the suggestions/tips- was out of town, sorry for the delay.   Here is fake data simplified from my original set with only the pertinent variables (modified slightly for HIPAA purposes):

 

data WORK.OPIOID11;
infile datalines dsd truncover;
input MedRec:BEST12. Enct:BEST12. EnteredDT:DATETIME. opioidx:$1. psychdx:$1.;
format MedRec BEST12. Enct BEST12. EnteredDT DATETIME.;
datalines4;
123456,654387982,04JUN15:16:10:00,2,1
123456,846397542,08JUL15:17:10:00,1,2
123456,846397541,09AUG15:18:10:00,1,1

 

654321,987654321,09JAN15:13:51:00,1,2

654321,123456789,08MAR15:12:30:00,1,1
654321,867530921,07MAY15:14:20:00,2,1

987654,987654321,02FEB15:14:40:00,1,2

987654,123456789, 08MAR15:15:00:00,2,1

987654,867530921, 04APR15:16:00:00,1,1

 

MedRec is the patient's medical record number (again, made up).  This is assigned to an individual patient and sticks with the patient.  Enct is the encounter number, assigned to the patient at an individual visit.  opioidx is if the visit is for an opioid overdose (OD).  psychdx is for other psychiatric complaint.  For both of these, 1=yes, 2=no.  For the sake of simplicity each patient will have 3 encounterss.

 

So I want to identify the patients who have overdosed but PRIOR to any overdose attempt they have had at least one encounterWITHOUT an opioid overdose but WITH a psychiatric diagnosis.   

 

Using the examples above, I would want to select for patient "123456" because there is an encounter where psychdx=1 (yes for psych dx) AND opioidx=2 (no opioid OD) PRIOR to 2 encounters WITH opioid overdoses (and whether or not a psychdx is present on these encounters doesn't matter)

 

I would NOT want to identify patient "654321" because the opioid ODs preceded the psych diagnosis visits.  I WOULD want to identify 987654 because an overdose was preceded by at least one psych dx encounter.

 

I have slightly modified the code above:

proc sql;
create table opioidpsych as
select distinct a.MedRec
from work.opioid11 a inner join work.opioid11 b on a.MedRec = b.MedRec
where a.opioidx = "1" and b.psychdx="1" and b.EnteredDT < a.EnteredDT;
quit;

 

 

And as far as I can tell it creates a table for the MedRecs as specified above (meaning the individual patient has at least one opioid OD with a previous visit for psychdx.

 

Once I am done I get a table as follows (c/w above)

data WORK.OPIOIDPSYCH;
infile datalines dsd truncover;
input BJH_MedRec:BEST12.;
format BJH_MedRec BEST12.;
datalines4;
123456

987654

 

So my next question is how to note these IDs (individual patient med rec#s) on my larger data set (work.opioid11), but with the caveat being that each individual line is designated by encounter.  For example, I would want to flag patient "123456" twice, the 2nd and third encounters because they were both opioid overdoses with a prior psychiatric diagnosis.

 

I hope I am being clear- please let me know if something isn't making sense!


Thanks so much!

 

 

 

 

Reeza
Super User

What do you want as output from this data set?

bsriv
Calcite | Level 5

Hi Reeza:

 

ultimately I would like to create another variable that notes if an encounter that was an opioid overdose ((opioidx=1) was preceded by a prior encounter for THAT SAME PATIENT (meaning, med recs are the same) where there was not an opioid overdose (opioidx=2) but there was a psychiatric diagnosis given (psychdx=1)

Reeza
Super User

Make it easy for me -post the exact data set as above, but with the output. 

bsriv
Calcite | Level 5

Gotcha:  made it even simpler:

data WORK.opioidfake;
input MedRec:BEST12. Enct:BEST12. EnteredDT DATE. opioidx:$1. psychdx:$1.;
format MedRec BEST12. Enct BEST12. EnteredDT DATE.;
datalines;
123456 654387982 04JUN15 2 1
123456 846397542 08JUL15 1 2
123456 846397541 09AUG15 1 1
654321 987654321 09JAN15 1 2
654321 123456789 08MAR15 1 1
654321 867530921 07MAY15 2 1
987654 987654421 02FEB15 1 2
987654 123456689 08MAR15 2 1
987654 867530121 04APR15 1 1

;
run;

 

Output (wouldn't allow complete tables but descending labels go across from L to R)

 MedRec
 
Enct
 
EnteredDT
 
opioidx
 
psychdx
 
112345665438798204JUN1521
212345684639754208JUL1512
312345684639754109AUG1511
465432198765432109JAN1512
565432112345678908MAR1511
665432186753092107MAY1521
798765498765442102FEB1512
898765412345668908MAR1521
998765486753012104APR1511
 
 

Then:  

proc sql;
create table opioidpsych as
select distinct a.MedRec
from work.opioidfake a inner join work.opioidfake b on a.MedRec = b.MedRec
where a.opioidx = "1" and b.psychdx="1" and b.EnteredDT < a.EnteredDT;
quit;

 

 
 
MedRec
 
 
1123456 
2987654
 
 

Thanks again!  Please let me know if you need more info!

Reeza
Super User

I think this gets you pretty close, if not exactly what you want:

 

data have;
input MedRec:BEST12. Enct:BEST12. EnteredDT DATE. opioidx:$1. psychdx:$1.;
format MedRec BEST12. Enct BEST12. EnteredDT DATE.;
datalines;
123456 654387982 04JUN15 2 1
123456 846397542 08JUL15 1 2
123456 846397541 09AUG15 1 1
654321 987654321 09JAN15 1 2
654321 123456789 08MAR15 1 1
654321 867530921 07MAY15 2 1
987654 987654421 02FEB15 1 2
987654 123456689 08MAR15 2 1
987654 867530121 04APR15 1 1
;
run;

proc sort data=have; by medrec entereddt;
run;

data want;
set have;
by medrec;

retain psychflag;

if first.medrec then psychflag=0;
if opioidx='1' and psychflag=1 then output;
if psychdx='1' then psychflag=1;

run;


bsriv
Calcite | Level 5

Thanks- right now when I do this I am only getting the psychflag=1 lines (about 400 out of my total sample)- how do I get this variable (psychflag) to show in my larger dataset (about 40,000)?

 

Thanks!

Reeza
Super User

It matches the output you specified, the two rows. This is why I asked explicitly what you wanted as output. 

 

Try removing the OUTPUT statement and just creating the flags if that's what you're looking to do. 

bsriv
Calcite | Level 5

So in the PROC SQL code above, it looks as if that produced the list of med rec numbers that I specifically wanted (these are the patients with an opioid OD with prior psych dx WITHOUT OD.   I want my ultimate output to be another variable in my main dataset (n=40,000) that matches these med recs with the appropriate encounters (Opioid ODs that for which the individual MedRec had previously been in the ED for a psych dx).

 

So basically, how do I create another variable in my big table (with 40,000) with the Med recs IDd from the proc SQL, assigned to the appropriate Encounters?

 

Please let me know if I'm not making sense!

Reeza
Super User

Try removing the OUTPUT statement and just creating the flags if that's what you're looking to do. 

 

This is still the answer. If you can't get it working, post your modified code and I can help you modify it correctly. 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 20 replies
  • 1393 views
  • 0 likes
  • 3 in conversation