Hi Team,
I have a dataset as shown below..
ID Diag ReadmitID
A 4 A'
B 3 B'
C 5 C'
A' 4
B' 4
C' 6
where A B C are the id of the patient and A' B' and C' are the ID of the same patients during readmission.
Our goal is to identify if he has the same Diag code as that of admission (unprimed ID's) during readmission (primed ID's)
Regards
Sort of .. kind of. You would have to change some of the code. e.g.:
data have;
input ID $ Diag ReadmitID $;
cards;
101 4 111
102 3 222
103 5 333
111 4 .
222 4 .
333 6 .
;
DATA fmtDataset (drop=ReadmitID);
set have (rename=(id=start diag=label));
if missing(ReadmitID);
retain fmtname 'idcode' type 'C';
RUN;
PROC FORMAT CNTLIN=fmtDataset;
RUN;
DATA want;
set have;
if not missing(ReadmitID);
if put(ReadmitID,$idcode.) eq diag then same='yes';
else same='no';
RUN;
P.S. Isn't it about time that you marked some of the posts in this thread as being either helpful or correct or, minimally, at least indicate that the question has been answered.
Karun: Someone else will have to show you how this can be stated as a single proc sql call, but I think that the following does what you want:
proc sql;
create table first as
select *
from have
where index(id,"'")=0
;
create table second as
select id as ReadmitID, diag as diag2
from have
where index(id,"'") ne 0
;
create table want as
select a.id,a.diag,a.ReadmitID,b.diag2,
case
when diag=diag2 then "Yes"
else "No"
end as same
from first as a
left join
second as b
on a.ReadmitID=b.ReadmitID
;
quit;
Thanks a ton.
I am not looking for a single SQL call. Simple datastep also could be of great help
Regards
OK, a simple data step. I'm not sure how efficient it will be, but it will work. I added a few more variables that you might (or might not) want.
data want;
set have (keep=ID Diag) nobs=_nobs_;
do _i_=1 to _nobs_;
set have (keep=ReAdmitID Diag rename=(Diag=ReAdmitDiag)) point=_i_;
if (ID=ReAdmitID) and (_n_ ne _i_) then do;
original_observation=_n_;
matching_observation=_i_;
if Diag=ReadmitDiag then same_diagnosis='Y';
else same_diagnosis='N';
output;
end;
end;
run;
OK, maybe it's not really that simple but it's not terrible either.
Hi,
I got the logic but not clear enough.
Could you explain me how can we assign _i_ to a matching observation??
Thanks
Hi Team,
I am trying to execte my problem with all the different solutions.
I understood Art,s logic.
Could you explain to me why should we write (_n_ ne _i_) in the below if statement.....
data want;
set have (keep=ID Diag) nobs=_nobs_;
do _i_=1 to _nobs_;
set have (keep=ReAdmitID Diag rename=(Diag=ReAdmitDiag)) point=_i_;
if (ID=ReAdmitID) and (_n_ ne _i_) then do;
original_observation=_n_;
matching_observation=_i_;
if Diag=ReadmitDiag then same_diagnosis='Y';
else same_diagnosis='N';
output;
end;
end;
run;
Also if someone could explain me HAi's Hash logic..that would be great tooo..
For example..
h.definekey('_ID');
what is h.
is it a library
Thanks
For this particular problem, it's overkill to check:
(_n_ ne _i_)
Basically, it prevents an observation from matching itself. For some applications that's important, but for this one it won't make a difference. The readmission ID would never match the original ID on the same observation.
Exactly,
Thats the reason I was double checking.
Also I was wondering this could be done using PRoc Format CNTLIN??
Thanks
If the ID variable never has duplicates, it would be relatively straightforward to use PROC FORMAT. But that wasn't clear from the limited amount of data. Must ID take on unique values?
YES....The ID variable is Unique.
But using PROC FORMAT i only know until extractling ReadmitIds from the ID variable using CNTLIN.
But I am not sure of the second part...checking to see if the readmits have the same DIAG as the beginning of admission
Regards
OK. I won't be able to work on this for a few hours, but I'd be happy to see anyone else code it. The basic idea is that the CNTLIN data set translates ID into the observation number where that ID occurs. Then march through the observations, and check in a DATA step whether the readmission ID appears in the format. If so, use SET with POINT= to retrieve the matching observation. Along the way, KEEP/RENAME variables as appropriate.
If nobody attempts this, I can get to it this afternoon.
Good luck.
Great Help,
Thanks a ton
data have;
input ID $ Diag ReadmitID $;
cards;
A 4 A'
B 3 B'
C 5 C'
A' 4 .
B' 4 .
C' 6 .
;
DATA fmtDataset;
set have (drop=ReadmitID rename=(id=start diag=label));
if index(start,"'");
retain fmtname '$idcode' type 'C';
RUN;
PROC FORMAT CNTLIN=fmtDataset;
RUN;
DATA want;
set have;
if not index(id,"'");
if put(ReadmitID,$idcode.) eq diag then same='yes';
else same='no';
RUN;
Art,
That's a much better idea ... just use the format to translate from ID into Diag.
Thans a ton for your valuable time and effort.
I am curious. In the future if the alphabets are replaced by numbers i can use the same code without the below if conditions in the code
IT STILL WORKS RIGHT???
if index(start,"'");
if not index(id,"'");
data have;
input ID $ Diag ReadmitID $;
cards;
101 4 111
102 3 222
103 5 333
111 4 .
222 4 .
333 6 .
;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.