I touched on this earlier but think I got off on the wrong path. I have 125K plus records that vary in the place of service. An example is:
id plc of service dos disch
111 hospital 11dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 rehab 17dec2010 30dec2010
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
444 hospital 17aug2011 30aug2011
I need to say that if anything other than hospital shows up, then look at the disch and dos. If the dos is the same as the hospital disch date then all those records go into one file where. So my outcome I want to be:
Hospital table:
id plc of service dos disch
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 17aug2011 30aug2011
Other table:
id plc of service dos disch
111 hospital 11dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 rehab 17dec2010 30dec2010
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
I started my code and know I need a subquery but not sure I am doing the subquery right.
PROC SQL;
CREATE TABLE ReAdm.Claims_inpt2 AS
(SELECT DISTINCT
(SELECT * B.ADMIT_DT FROM ReAdm.Claims_inpt
WHERE A.CLM_PL_OF_SRVC_DESC <> 'INPATIENT HOSPITAL' AND
AS OTHERADM)
FROM ReAdm.Claims_inpt
WHERE A.CLM_PL_OF_SRVC_DESC = 'INPATIENT HOSPITAL');
RUN;
The above should give me something like:
id plc of service dos disch otheradm
111 hospital 11dec2010 15dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010 07jun2010
444 nursing 07jun2010 15jun2010
444 hospital 17aug2011 30aug2011
Then from there I can take and do another query to isolate only the hospital and only the otheradm.
The code I provided earlier works, with a fix for the episode part. Just make sure your new_id is formatted correctly.
The point is to create a unique id per 'episode' based on the id and then a counter, 1 for first episode. But because you'll have multiple episodes up to the 100 per patients you need to add zeros so things will sort properly.
data have;
informat dos disch date9.;
format dos disch date9.;
input id $ plc $ dos disch ;
cards;
111 hospital 15jun2010 18jun2010
111 hospital 11dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 rehab 15jan2011 31jan2011
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
444 hospital 17aug2011 30aug2011
555 rehab 15aug2011 29aug2011
666 hospital 01jan2010 02jan2010
777 skilled 06aug2011 15aug2011
888 hospice 10aug2011 30aug2011
999 hospital 01jan2010 15jan2010
999 hospital 20feb2010 25feb2010
999 rehab 27feb2010 28feb2010
999 skilled 03mar2010 07mar2010
001 hospital 20aug2010 25aug2010
001 hospital 15sep2011 15sep2011
002 rehab 05sep2011 07sep2011
002 hospital 08sep2011 15sep2011
002 hospice 15sep2011 17sep2011
003 hospital 01jan2010 15jan2010
004 rehab 15jan2010 17jan2010
005 hospital 01jan2011 15jan2011
005 rehab 27jan2011 30jan2011
;
run;
proc sort data=have;
by id dos disch;
run;
data episodes;
set have;
by id;
retain episode;
*keep track of previous discharge date;
format prev_disch date9.;
prev_disch=lag(disch);
*time between dos and discharge;
if not first.id then time_between=dos-prev_disch;
*If first id then set episode to 1, else if dos ne disch then increase by 1;
if first.id then episode=1;
else if time_between >30 then episode+1;
*Create id to group episodese for the query later on;
*Make sure this fits your actual data, ie you may have up to 100 episodes;
new_id=trim(id)||put(episode, z2.);
run;
*Only Hospitals;
proc sql;
create table hospital as
select *
from episodes
group by new_id
having min(plc)='hospital' and max(plc)='hospital';
quit;
*All the others;
proc sql;
create table others as
select *
from episodes
group by new_id
having min(plc)ne 'hospital' or max(plc)ne 'hospital';
quit;
I still think the method I proposed is easier to understand.
data have;
informat dos disch date9.;
format dos disch date9.;
input id plc $ dos disch ;
cards;
111 hospital 11dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 rehab 17dec2010 30dec2010
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
444 hospital 17aug2011 30aug2011
;
run;
*Isolate into episodes;
data episodes;
set have;
by id;
retain episode;
*keep track of previous discharge date;
format prev_disch date9.;
prev_disch=lag(disch);
*If first id then set episode to 1, else if dos ne disch then increase by 1;
if first.id then episode=1;
else if dos ne prev_disch then episode+1;
*Create id to group episodese for the query later on;
new_id=put(id, z3.)||put(episode, z2.);
run;
*Only Hospitals;
proc sql;
create table hospital as
select *
from episodes
group by new_id
having min(plc)='hospital' and max(plc)='hospital';
quit;
*All the others;
proc sql;
create table others as
select *
from episodes
group by new_id
having min(plc)ne 'hospital' or max(plc)ne 'hospital';
quit;
Ok. Now if there is a situation of the below which I just found in fishing thru the 125k plus records
111 hospital 11dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 rehab 20dec2010 30dec2010 changed this date
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 19jun2010 25jun2010 changed this date
444 hospital 17aug2011 30aug2011
would I put:
*Isolate into episodes;
data episodes;
set have;
by id;
retain episode;
*keep track of previous discharge date;
format prev_disch date9.;
prev_disch=lag(disch);
*If first id then set episode to 1, else if dos ne disch then increase by 1;
if first.id then episode=1;
else if dos ne prev_disch then episode+30; a +30 on this???
*Create id to group episodese for the query later on;
new_id=put(id, z3.)||put(episode, z2.);
run;
No. Your question said:
If the dos is the same as the hospital disch date then all those records go into one file
Which was different from your previous requirement of 30 days.
Are you saying it is now 30 days again?
Assuming so, change the condition that is checked, not the calculation.
The condition is dos ne prev_disch so you'll need to see if the difference between these dates is less than 30 days.
(dos - prev_disch) <= 30 (Make sure its less than or equal to depending on what you need.
data episodes;
set have;
by id;
retain episode;
*keep track of previous discharge date;
format prev_disch date9.;
prev_disch=lag(disch);
*Calculate the time from previous stay, if this is not a new id;
if not first.id then time_from_previous_stay=dos-prev_disch;
*If first id then set episode to 1, else if dos ne disch then increase by 1;
if first.id then episode=1;
else if time_from_previous_stay <30 then episode+1; *Should be less than or equal to???
*Create id to group episodese for the query later on;
new_id=put(id, z3.)||put(episode, z2.);
run;
Not sure. The table is auto constructed to have what it has. Not my table it is on a server. It has 125k records and unless I go thru line by line I really do not know all the scenarios. I am up to row 1015 and am writing down all the scenarios as I see them where it is not a 100% hospital. I have to log all of them because I explained to my manager that what I have seen so far is not a simple case of what I described above. I hate projects like these. I am thinking instead of manually going thru all these 125k records that maybe I can create a query that says give me distinct ID and their admit disch date without place of service so I can see all the possible scenarios. I alerted my boss to the scenarios so far and she seems to think if we isolate all 100% hospital and then the others are within a 30day that should get us all that we want. Again, did I say I hate projects like these? Nothing in any of our claims systems is ever cut and dry. There are always a set of scenarios but then you find oops, now I have 12 monkey wrenches thrown into it all.
I will try the above scenarios when my other SAS query is done which is actually extracting some huge data. Will let you know how this works out. Thanks.
I think I did something wrong. I have this:
data have;
informat dos disch date8.;
format dos disch date8.;
input id plc $ dos disch ;
cards;
111 hospital 11dec2010 15dec2010
111 hospital 15jun2010 18jun2010
111 hospice 15dec2010 17dec2010
111 rehab 15jan2011 31jan2011
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
444 hospital 17aug2011 30aug2011
;
run;
*Isolate into episodes;
data episodes;
set have;
by id;
retain episode;
*keep track of previous discharge date;
format prev_disch date8.;
prev_disch=lag(disch);
*Calculate the time from previous stay, if this is not a new id;
if not first.id then time_from_previous_stay=dos-prev_disch;
*If first id then set episode to 1, else if dos ne disch then increase by 1;
if first.id then episode=1;
else if time_from_previous_stay between >=0 and <=30 then episode+1;
*Create id to group episode for the query later on;
new_id=put(id, z3.)||put(episode, z2.);
run;
*Only Hospitals;
proc sql;
create table hospital as
select *
from episodes
group by new_id
having min(plc)='hospital' and max(plc)='hospital';
quit;
*All the others;
proc sql;
create table others as
select *
from episodes
group by new_id
having min(plc)ne 'hospital' or max(plc)ne 'hospital';
quit;
I get this back on each table:
4 back in the hospital and 7 in the other but there should be 5 based on the 15jun2010 I put in. I am finding more and more oddities in this table of 125k. It is not strictly first service is hospital and then other things follow. I have some where the first service is nursing or rehab and they never even go to the hospital. It is just getting more and more complex the more I dig into this table.
I found this scenario in the table and searching for more.
111 hospital 11dec2010 15dec2010
111 hospital 15jun2010 18jun2010
111 hospice 15dec2010 17dec2010
111 rehab 15jan2011 31jan2011
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
444 hospital 17aug2011 30aug2011
555 rehab 15aug2011 29aug2011
;
So what's the problem?
According to your criteria there isn't an issue. You need to explicitly state your criteria to avoid these issues when someone else needs to repeat the analysis
ie.
Select all data from period Jan 2009 to February 2012.
Find patients first hospital visit.
Separate out patients who were in hospital
Separate out patients who went to other facilities.
*Because you've selected using a date criteria there will be patients whose first visit is outside of the time period;
*Also if its health care perhaps benefits weren't applicable until after hospital stay or was covered under anotehr policy;
Also there's a bug in your code:
time_from_previous_stay between >=0 and <=30
Check if that line is working the way you expect it to.
Sorry. The problem is this is not my table. It is a table on our claims system. I have to go thru each row of over 125K records to find the varying scenarios I am running across and the problem now is that now that I am having to look at each row in this huge table I might as well click edit in SAS and manually put in skillled nursing or rehab. If I am going thru each record I might as well do that anyway. I was trying to avoid that by coding this but it seems I cannot because I have to view each ID to get all the scenarios logged to get help on the SAS forum. The PM's have no criteria other than they want hospital separate from others. The problem is when summing it all up the numbers do not match and I told my boss and she is like well you will have to figure out what hospital goes with rehab, skilled nursing, hospice, etc. You will have to review the dates of the records for each member ID and I am like there has to be an easier way than this. This is just plain ridiculous..
Sorry this has so many twists and turns. Nature of the data we store. I just wished I was well versed in SAS so I did not have to ask for help and frustrate everyone. If I had someone internally to go to I would this way I could share the REAL information but cannot with outside world as it would be a violation of PHI and HIPAA and I could lose my job and get fined by the govt. So I am doing the best I can with providing the items I am finding. I am fairly to SAS and do mostly proc sql stuff. and then case stuff.
data have;
informat dos disch date8.;
format dos disch date8.;
input id plc $ dos disch ;
cards;
111 hospital 15jun2010 18jun2010
111 hospital 11dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 rehab 15jan2011 31jan2011
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
444 hospital 17aug2011 30aug2011
555 rehab 15aug2011 29aug2011
666 hospital 01jan2010 02jan2010
777 skilled 06aug2011 15aug2011
888 hospice 10aug2011 30aug2011
999 hospital 01jan2010 15jan2010
999 hospital 20feb2010 25feb2010
999 rehab 27feb2010 28feb2010
999 skilled 03mar2010 07mar2010
;
run;
*Isolate into episodes;
data episodes;
set have;
by id;
retain episode;
*keep track of previous discharge date;
format prev_disch date8.;
prev_disch=lag(disch);
*Calculate the time from previous stay, if this is not a new id;
if first.id then episode =1;
if not first.id then time_from_previous_stay=dos-prev_disch;
if time_from_previous_stay =>1 or time_from_previous_stay <=30 then episode+1;
*Create id to group episode for the query later on;
new_id=put(id, z3.)||put(episode, z2.);
run;
*Only Hospitals;
proc sql;
create table hospital as
select *
from episodes
group by new_id
having min(plc)='hospital' and max(plc)='hospital';
quit;
*All the others;
proc sql;
create table others as
select *
from episodes
group by new_id
having min(plc)ne 'hospital' or max(plc)ne 'hospital';
quit;
Maybe the time issue of between does not work so I changed it to the above. The query runs but still not right output and I am lost. This project is more than I can handle.
I have an inpatient table and in that table are various dates of admission and discharges. A little over 125k rows in all. Member can have more than 1 admission to everything from hospital, nursing, rehab, hospice. Any hospital only admission where the member clearly only has 1 admission in the entire table goes into the HOSPITAL TABLE. If the member had more than 1 admission but they were never sent to nursing, rehab or hospice, that goes in the HOSPITAL TABLE. The tricky part for the other table. If a member has an admission to a hospital and then is discharged and then admitted to a nursing, rehab or hospice the same date of the discharge, that goes into the OTHER TABLE. If the member has a hospital admission and then is discharged and then admitted to a nursing, rehab or hospice within 30 days that goes into the OTHER TABLE. (I am making the call on 30days to make this easier). If the member was only admitted to rehab, nursing or hospice and never had a hospital stay, that goes into the OTHER TABLE.
I hope this makes sense. The problem is I am doing this readmission project and it was all fine and done and then they threw in place of service. They said separate hospital from all others. I did that and the odds ratio was so off. The readmits were right but the admissions were too high because those that were admitted to hospital and readmitted to hospice, nursing or rehab their original admissions are counted in the first query I did because they have place of service as hospital. I think I have confused everyone haven't I?
Ok Maybe this will help. I have a table of 125k+ rows all with the ID, dos, disch date and place of service. There are various options within this table.
I went thru all 125k+ rows to find all the options.
data have;
informat dos disch date8.;
format dos disch date8.;
input id plc $ dos disch ;
cards;
111 hospital 15jun2010 18jun2010
111 hospital 11dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 rehab 15jan2011 31jan2011
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
444 hospital 17aug2011 30aug2011
555 rehab 15aug2011 29aug2011
666 hospital 01jan2010 02jan2010
777 skilled 06aug2011 15aug2011
888 hospice 10aug2011 30aug2011
999 hospital 01jan2010 15jan2010
999 hospital 20feb2010 25feb2010
999 rehab 27feb2010 28feb2010
999 skilled 03mar2010 07mar2010
001 hospital 20aug2010 25aug2010
001 hospital 15sep2011 15sep2011
002 rehab 05sep2011 07sep2011
002 hospital 08sep2011 15sep2011
002 hospice 15sep2011 17sep2011
003 hospital 01jan2010 15jan2010
004 rehab 15jan2010 17jan2010
005 hospital 01jan2011 15jan2011
005 rehab 27jan2011 30jan2011
;
run;
I need to have hospital only items that are never associated with rehab, hospice, skilled put into one table and the rest in the other table. So the outcome of this above should be like this: using up to +30 between the disch and the adm. So in the case of ID 111 if they are discharged from the hospital on 15dec2010 and then admitted to rehab on 15jan2011, the entire record fo 111 during the date periods would be in the other table.
HOSPITAL Table:
111 hospital 15jun2010 18jun2010
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 17aug2011 30aug2011
666 hospital 01jan2010 02jan2010
999 hospital 01jan2010 15jan2010
001 hospital 20aug2010 25aug2010
001 hospital 15sep2011 15sep2011
003 hospital 01jan2010 15jan2010
OTHER Table:
data have;
informat dos disch date8.;
format dos disch date8.;
input id plc $ dos disch ;
cards;
111 hospital 11dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 rehab 15jan2011 31jan2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
555 rehab 15aug2011 29aug2011
777 skilled 06aug2011 15aug2011
888 hospice 10aug2011 30aug2011
999 hospital 20feb2010 25feb2010
999 rehab 27feb2010 28feb2010
999 skilled 03mar2010 07mar2010
002 rehab 05sep2011 07sep2011
002 hospital 08sep2011 15sep2011
002 hospice 15sep2011 17sep2011
004 rehab 15jan2010 17jan2010
005 hospital 01jan2011 15jan2011
005 rehab 27jan2011 30jan2011
The code I provided earlier works, with a fix for the episode part. Just make sure your new_id is formatted correctly.
The point is to create a unique id per 'episode' based on the id and then a counter, 1 for first episode. But because you'll have multiple episodes up to the 100 per patients you need to add zeros so things will sort properly.
data have;
informat dos disch date9.;
format dos disch date9.;
input id $ plc $ dos disch ;
cards;
111 hospital 15jun2010 18jun2010
111 hospital 11dec2010 15dec2010
111 hospice 15dec2010 17dec2010
111 rehab 15jan2011 31jan2011
111 hospital 12nov2011 15nov2011
222 hospital 01jan2010 10jan2010
333 hospital 01feb2011 07feb2011
444 hospital 01jun2010 07jun2010
444 rehab 07jun2010 15jun2010
444 nursing 15jun2010 17jun2010
444 hospital 17aug2011 30aug2011
555 rehab 15aug2011 29aug2011
666 hospital 01jan2010 02jan2010
777 skilled 06aug2011 15aug2011
888 hospice 10aug2011 30aug2011
999 hospital 01jan2010 15jan2010
999 hospital 20feb2010 25feb2010
999 rehab 27feb2010 28feb2010
999 skilled 03mar2010 07mar2010
001 hospital 20aug2010 25aug2010
001 hospital 15sep2011 15sep2011
002 rehab 05sep2011 07sep2011
002 hospital 08sep2011 15sep2011
002 hospice 15sep2011 17sep2011
003 hospital 01jan2010 15jan2010
004 rehab 15jan2010 17jan2010
005 hospital 01jan2011 15jan2011
005 rehab 27jan2011 30jan2011
;
run;
proc sort data=have;
by id dos disch;
run;
data episodes;
set have;
by id;
retain episode;
*keep track of previous discharge date;
format prev_disch date9.;
prev_disch=lag(disch);
*time between dos and discharge;
if not first.id then time_between=dos-prev_disch;
*If first id then set episode to 1, else if dos ne disch then increase by 1;
if first.id then episode=1;
else if time_between >30 then episode+1;
*Create id to group episodese for the query later on;
*Make sure this fits your actual data, ie you may have up to 100 episodes;
new_id=trim(id)||put(episode, z2.);
run;
*Only Hospitals;
proc sql;
create table hospital as
select *
from episodes
group by new_id
having min(plc)='hospital' and max(plc)='hospital';
quit;
*All the others;
proc sql;
create table others as
select *
from episodes
group by new_id
having min(plc)ne 'hospital' or max(plc)ne 'hospital';
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.