Re: Complicated IF/then issues

bsriv · Posted 11-02-2017 02:46 PM

Hi All: Relatively new to SAS. I'm having difficulties with a dataset, and I apologizes I must explain in generalities because I don't know where to begin. I'll try to include some of the codes, etc I have but I apologize in advance for how vague this may seem... I'm lost!

So I have a large-ish dataset of hospital emergency department (ED) visits (approx 40,000 visits), and I am looking at opioid overdoses and suicide with a number of other predictors. The way the data was delivered to me, the diagnoses I requested are listed as categories with "yes" "no" has level of variables.

Later I am going to create regression models with just the sample of patients with a diagnosis of "Opioid poisoning" (n~2000) with a number of predictors for suicidality as an outcome category (binary outcome "yes" "no"). I would like one of those predictors to be previous presentation for a different psychiatric reason (meaning, the patient was previously in the ED NOT for an opioid overdose but FOR a psychiatric diagnosis- again these are categories with "yes" and "no" as levels).

I do have number of admissions in total as a separate predictor (I grouped these based on number of times MRN was entered). I also have dates in SAS format DDMONYR 00:00:00

I'm trying to categorizes individual patients (identified by MRN) who have overdosed at least once based on a category AND who have had PREVIOUS presentations for non opioid OD reasons). The n of this total sample should be less than 2000, so not a high number, though I will be drawing from a much larger sample

Any tips on how/where to begin?? If I didn't explain things clearly or if there is any other info I can provide please let me know!!! Unfortunately I cannot post any data because of HIPAA reasons

HB · Posted 11-02-2017 03:26 PM

Is each record an admission with a patient identifier and admission characteristics, or is each record a patient with some nonnormalized data structure?

Let us start someplace. Assume data where each record is an admit with a patient and a diagnosis.

data admits;
	input patientid admitdate:mmddyy10. opiod:$1. psych:$1. else:$1.;
datalines;
1 1/1/2017 n y n
1 2/1/2017 n n y
1 3/1/2017 y n n
2 1/1/2017 n n y
2 2/1/2017 n n y
2 3/1/2017 y n n
3 1/1/2017 n y n
3 2/1/2017 n n y
3 3/1/2017 n y n
4 4/1/2017 n y n
4 5/1/2017 n n y
4 6/1/2017 y n n
4 7/1/2017 y n n
4 8/1/2017 y n n
5 6/1/2017 y n n
5 7/1/2017 n n n
5 8/1/2017 n y n
;
run;

Then let's select anybody who is a y on opiod and also a y on pysch with a date before their y on opiod. In this data 1 and 4. 5 has a psych after and 2 doesn't have a psych.

proc sql;
	create table opiod_with_psych as
	select distinct a.patientid 
	from admits a inner join admits b on a.patientid = b.patientid
	where a.opiod = 'y' and b.psych = 'y' and b.admitdate < a.admitdate;
quit;

That gives us

The SAS System

patientid

1

4

Start with that and then let's modify.

bsriv · Posted 11-03-2017 11:01 AM

Thanks!

So I have a list of patient MRNs which fit that category. The next step is to create a variable in my data set where I only have opioid overdoses (the n~2000) and these patients are assigned a "Y". Can i merge this? If so, how?

Thanks!

bsriv · Posted 11-03-2017 12:06 PM

I guess more sxplicitly- I have a list of MRNs that fit this critieria. In my larger data set how do I create a variable to which I can assign a "y" or "n" to these MRNs based on whether or not they are on this new table? Thanks!!!

HB · Posted 11-03-2017 12:43 PM

I guess more sxplicitly- I have a list of MRNs that fit this critieria. In my larger data set how do I create a variable to which I can assign a "y" or "n" to these MRNs based on whether or not they are on this new table? Thanks!!!

I have no idea. We haven't seen even any made-up data. I have no idea what you want to do. You said you already had things in Y/N format.

You'll have to provide more information and explanation for further help.

bsriv · Posted 11-05-2017 07:03 PM

Thanks for the suggestions/tips- was out of town, sorry for the delay. Here is fake data simplified from my original set with only the pertinent variables (modified slightly for HIPAA purposes):

data WORK.OPIOID11;
infile datalines dsd truncover;
input MedRec:BEST12. Enct:BEST12. EnteredDT:DATETIME. opioidx:$1. psychdx:$1.;
format MedRec BEST12. Enct BEST12. EnteredDT DATETIME.;
datalines4;
123456,654387982,04JUN15:16:10:00,2,1
123456,846397542,08JUL15:17:10:00,1,2
123456,846397541,09AUG15:18:10:00,1,1

654321,987654321,09JAN15:13:51:00,1,2

654321,123456789,08MAR15:12:30:00,1,1
654321,867530921,07MAY15:14:20:00,2,1

987654,987654321,02FEB15:14:40:00,1,2

987654,123456789, 08MAR15:15:00:00,2,1

987654,867530921, 04APR15:16:00:00,1,1

MedRec is the patient's medical record number (again, made up). This is assigned to an individual patient and sticks with the patient. Enct is the encounter number, assigned to the patient at an individual visit. opioidx is if the visit is for an opioid overdose (OD). psychdx is for other psychiatric complaint. For both of these, 1=yes, 2=no. For the sake of simplicity each patient will have 3 encounterss.

So I want to identify the patients who have overdosed but PRIOR to any overdose attempt they have had at least one encounterWITHOUT an opioid overdose but WITH a psychiatric diagnosis.

Using the examples above, I would want to select for patient "123456" because there is an encounter where psychdx=1 (yes for psych dx) AND opioidx=2 (no opioid OD) PRIOR to 2 encounters WITH opioid overdoses (and whether or not a psychdx is present on these encounters doesn't matter)

I would NOT want to identify patient "654321" because the opioid ODs preceded the psych diagnosis visits. I WOULD want to identify 987654 because an overdose was preceded by at least one psych dx encounter.

I have slightly modified the code above:

proc sql;
create table opioidpsych as
select distinct a.MedRec
from work.opioid11 a inner join work.opioid11 b on a.MedRec = b.MedRec
where a.opioidx = "1" and b.psychdx="1" and b.EnteredDT < a.EnteredDT;
quit;

And as far as I can tell it creates a table for the MedRecs as specified above (meaning the individual patient has at least one opioid OD with a previous visit for psychdx.

Once I am done I get a table as follows (c/w above)

data WORK.OPIOIDPSYCH;
infile datalines dsd truncover;
input BJH_MedRec:BEST12.;
format BJH_MedRec BEST12.;
datalines4;
123456

987654

So my next question is how to note these IDs (individual patient med rec#s) on my larger data set (work.opioid11), but with the caveat being that each individual line is designated by encounter. For example, I would want to flag patient "123456" twice, the 2nd and third encounters because they were both opioid overdoses with a prior psychiatric diagnosis.

I hope I am being clear- please let me know if something isn't making sense!

Thanks so much!

Reeza · Posted 11-05-2017 08:23 PM

What do you want as output from this data set?

bsriv · Posted 11-05-2017 08:48 PM

Hi Reeza:

ultimately I would like to create another variable that notes if an encounter that was an opioid overdose ((opioidx=1) was preceded by a prior encounter for THAT SAME PATIENT (meaning, med recs are the same) where there was not an opioid overdose (opioidx=2) but there was a psychiatric diagnosis given (psychdx=1)

Reeza · Posted 11-05-2017 09:17 PM

Make it easy for me -post the exact data set as above, but with the output.

bsriv · Posted 11-05-2017 09:58 PM

Gotcha: made it even simpler:

data WORK.opioidfake;
input MedRec:BEST12. Enct:BEST12. EnteredDT DATE. opioidx:$1. psychdx:$1.;
format MedRec BEST12. Enct BEST12. EnteredDT DATE.;
datalines;
123456 654387982 04JUN15 2 1
123456 846397542 08JUL15 1 2
123456 846397541 09AUG15 1 1
654321 987654321 09JAN15 1 2
654321 123456789 08MAR15 1 1
654321 867530921 07MAY15 2 1
987654 987654421 02FEB15 1 2
987654 123456689 08MAR15 2 1
987654 867530121 04APR15 1 1

;
run;

Output (wouldn't allow complete tables but descending labels go across from L to R)

MedRec

Enct

EnteredDT

opioidx

psychdx

1

123456

654387982

04JUN15

2

1

2

123456

846397542

08JUL15

1

2

3

123456

846397541

09AUG15

1

4

654321

987654321

09JAN15

1

2

5

654321

123456789

08MAR15

1

6

654321

867530921

07MAY15

2

1

7

987654

987654421

02FEB15

1

2

8

987654

123456689

08MAR15

2

1

9

987654

867530121

04APR15

1

Then:

proc sql;
create table opioidpsych as
select distinct a.MedRec
from work.opioidfake a inner join work.opioidfake b on a.MedRec = b.MedRec
where a.opioidx = "1" and b.psychdx="1" and b.EnteredDT < a.EnteredDT;
quit;

MedRec

1

123456

2

987654

Thanks again! Please let me know if you need more info!

Reeza · Posted 11-06-2017 11:25 AM

I think this gets you pretty close, if not exactly what you want:

data have;
input MedRec:BEST12. Enct:BEST12. EnteredDT DATE. opioidx:$1. psychdx:$1.;
format MedRec BEST12. Enct BEST12. EnteredDT DATE.;
datalines;
123456 654387982 04JUN15 2 1
123456 846397542 08JUL15 1 2
123456 846397541 09AUG15 1 1
654321 987654321 09JAN15 1 2
654321 123456789 08MAR15 1 1
654321 867530921 07MAY15 2 1
987654 987654421 02FEB15 1 2
987654 123456689 08MAR15 2 1
987654 867530121 04APR15 1 1
;
run;

proc sort data=have; by medrec entereddt;
run;

data want;
set have;
by medrec;

retain psychflag;

if first.medrec then psychflag=0;
if opioidx='1' and psychflag=1 then output;
if psychdx='1' then psychflag=1;

run;

bsriv · Posted 11-06-2017 12:08 PM

Thanks- right now when I do this I am only getting the psychflag=1 lines (about 400 out of my total sample)- how do I get this variable (psychflag) to show in my larger dataset (about 40,000)?

Thanks!

Reeza · Posted 11-06-2017 12:15 PM

It matches the output you specified, the two rows. This is why I asked explicitly what you wanted as output.

Try removing the OUTPUT statement and just creating the flags if that's what you're looking to do.

bsriv · Posted 11-06-2017 12:36 PM

So in the PROC SQL code above, it looks as if that produced the list of med rec numbers that I specifically wanted (these are the patients with an opioid OD with prior psych dx WITHOUT OD. I want my ultimate output to be another variable in my main dataset (n=40,000) that matches these med recs with the appropriate encounters (Opioid ODs that for which the individual MedRec had previously been in the ED for a psych dx).

So basically, how do I create another variable in my big table (with 40,000) with the Med recs IDd from the proc SQL, assigned to the appropriate Encounters?

Please let me know if I'm not making sense!

Reeza · Posted 11-06-2017 01:49 PM

Try removing the OUTPUT statement and just creating the flags if that's what you're looking to do.

This is still the answer. If you can't get it working, post your modified code and I can help you modify it correctly.

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!