Hello,
I've a dataset sorted by patient ID (PID), diagnosis code (dx), doctor specialty code (specialty), and service date (date) in descending order, and doctor ID (docID).
I'd like to identify patients seeking more than second opinion of doctor for the same diagnosis and from doctors with the same specialty -- that is, seeing a third doctor of the same specialty -- prior to the the most doctor visit. A patient might go back and forth to different doctors.
The data is collected between January 1, 2011 and June 30, 2012. There is no date restriction except that the most recent visit within the same dx and specialty (or first record) should take place April 1, 2012.
Here are the data and SAS code to read in the data (a fiscal year variable fyear is calculated (going by April 1 as the start of a fiscal year):
data dat; input
PiD 1 Date $ 3-10 DocID 12-14 Fee_Code $ 16-19 Specialty 21-22 Dx 24-26 Amount 28-30 Notes $ 32-76;
datalines;
1 20120501 101 A585 28 289 60 1st doc w/n same PID, specialty code, dx code
1 20120322 101 A585 28 289 60 1st doc
1 20120313 102 A585 28 289 60 2nd doc
1 20120306 101 A585 28 289 60 1st doc
1 20110830 111 A585 28 300 60 3rd doc w/n same PID, specialty code, dx code
1 20110524 101 A585 28 289 60
1 20110420 101 A585 28 289 60
1 20110411 101 A585 28 289 60
1 20110101 101 A585 28 289 60
1 20120515 104 A401 40 110 100 1st doc w/n same PID, specialty code, dx code
1 20120303 105 A401 40 110 100 2nd doc
1 20120101 104 A401 40 110 100 1st doc
2 20120101 101 A303 15 305 30 1st doc but 1st record date < April 1, 2012
2 20111130 105 A303 15 305 50 2nd doc
2 20111009 108 A303 15 305 30 3rd doc
2 20110707 103 A303 15 305 30
3 20120415 201 A220 30 211 116 1st doc w/n same PID, specialty code, dx code
3 20120401 205 A220 30 211 116 2nd doc
3 20120303 201 A220 30 211 116 1st doc
3 20111231 205 A220 30 211 116 2nd doc
;
data dat (rename=(Date2=Date)); format PID Date2 Fyear; set dat;
format Date2 yymmdd10.;
Date2=input(date,yymmdd10.);
fyear = year(intnx("YEAR.4",Date2,0));
drop date;
run;
(The dataset "dat" is attached as well).
Based on the above mentioned criteria, only PID=1 and specialty=28 and DX=289 should be outputted because there is a third visit to DocID=111 on Aug 30, 2011.
PID=1, speciatlty=40, DX=110: there are 3 visits but only to 2 different doctors
PID=2: there are 4 visits to 3 different doctors but the most recent visit took place in January 2010, which is fiscal year 2011 (< April 1, 2012)
PID=3: there are 4 visits but to 2 different doctors
So thte data output should look like:
PiD | Specialty | Dx |
1 | 28 | 289 |
I assume I should carry docID to the next record to compare and see if they are identical, but I need to compare not just the preceding one but all the preceding ones. I could transpose and compare docID1, docID2, docID3....., but I could have many records due to many visit entries (not manageable, just not preferable if I had other solutions).
Thanks in advance for any help you could provide.
On second thought,
Once I get rid of cases with first record's date < April 1, 2012, I think I could have just use Proc SQL or Proc SORT nodupkey to first keep unique records of PID, specialty and DX, docID and then count # of docID. If it's >=3, then there is a visit to a third doctor.
So I'd like to rephrase my questions so that I have to use RETAIN to read down the data and compare preceding records (if possible) - that is what if I'd like to identify only cases with a visit to a third doctor that is within a year prior to the most recent visit date. The output should still be the same.
Thanks.
Hi
With the conditions you mentioned, there will be no output. The output you are expecting is having different dx i.e. diagnosis. But u mentioned it should be same diagnosis from third doctor.
Do you really want to have same dx or just same patient, third doctor and same specialty only....?
Please correct me if i missed something...
Thanks for replying. The third doctor has to be from the same specialty for the same DX of the patient. So yes it has to be under the same patient, same patient's dx and same specialty of doctor.
Ok so first off you want to take only those cases who meet your first criteria which is the first visit was after April 1, 2012. For reference the dataset you put out doesn't have anyone who meets that criteria, but you can run this code to take out all of the other people (I specify in the below code that the date is April 1, 2012.
proc sql;
create table um2 as
select distinct PID, date, docid, specialty, dx, fyear, (max(date)) format=mmddyy10. as last_dt /*You specify what the most recent date was so you can compare to it in the next step*/
from dat
group by PID
having (min(date)>'01Apr2012'd); /*Here you specify that the first visit for the PID has to have occurred after this date (April 1, 2012) and remove all PID who do not meet this criteria*/
create table um3 as
select distinct PID, docid, dx, specialty /*Since you now know what the most recent visit date is, you can specify that the visit that you are counting had to have occurred within a specific time frame, in this case 1 year or 365 days. I wrote it that the most recent date is first so you have to specify datdif to be negative, but you could reverse it. Also I included 0, so the last visit is counted*/
from um2
where datdif(last_dt, date, 'act/act') between -365 and 0;
create table um4 as /*since you have eliminated all of the visits which didn't occur within 1 year you now can reduce your visits based on dx and specialty, and count the number of DocID's that are in the table for each PID*/
select distinct PID, dx, specialty, (count(docid)) as count
from um3
group by pid, dx, specialty;
create table cohort as
select distinct pid, dx, specialty
from um4
where count >=3; /*This line could have been specified in um4 as having count >=3 or it can be written as another line. This specification will give you only those PID who had at least 3 visits to different docids in the year preceding the most recent visit for the same dx and in the same specialty*/
quit;
This code should create the cohort that you have so far specified.
Why would you sort by descending service date? That makes it impossible for a DATA step to do the work.
Do you have any control over the sorted order to the data?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.