BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
robertrao
Quartz | Level 8

Hi Team,

I have a dataset as shown below..

ID   Diag   ReadmitID

A     4            A'

B     3            B'

C     5           C'

A'     4

B'     4

C'     6

where A B C are the id of the patient and A' B' and C' are the ID of the same patients during readmission.

Our goal is to identify if he has the same Diag code as that of admission (unprimed ID's) during readmission (primed ID's)

Regards

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Sort of .. kind of.  You would have to change some of the code. e.g.:

data have;

  input ID $  Diag   ReadmitID $;

  cards;

101     4            111

102     3            222

103    5            333

111    4 .

222    4 .

333     6 .

;

DATA fmtDataset (drop=ReadmitID);

  set have (rename=(id=start diag=label));

  if missing(ReadmitID);

  retain fmtname 'idcode' type 'C';

RUN;

PROC FORMAT CNTLIN=fmtDataset;

RUN;

DATA want;

  set have;

  if not missing(ReadmitID);

  if put(ReadmitID,$idcode.) eq diag then same='yes';

  else same='no';

RUN;

P.S.  Isn't it about time that you marked some of the posts in this thread as being either helpful or correct or, minimally, at least indicate that the question has been answered.

View solution in original post

21 REPLIES 21
art297
Opal | Level 21

Karun:  Someone else will have to show you how this can be stated as a single proc sql call, but I think that the following does what you want:

proc sql;

  create table first as

    select *

      from have

        where index(id,"'")=0

  ;

  create table second as

    select id as ReadmitID, diag as diag2

      from have

        where index(id,"'") ne 0

  ;

  create table want as

    select a.id,a.diag,a.ReadmitID,b.diag2,

           case

             when diag=diag2 then "Yes"

          else "No"

          end as same

      from first as a

        left join

          second as b

            on a.ReadmitID=b.ReadmitID

  ;

quit;

robertrao
Quartz | Level 8

Thanks a ton.

I am not looking for a single SQL call. Simple datastep also could be of great help

Regards

Astounding
PROC Star

OK, a simple data step.  I'm not sure how efficient it will be, but it will work.  I added a few more variables that you might (or might not) want.

data want;

   set have (keep=ID Diag) nobs=_nobs_;

   do _i_=1 to _nobs_;

       set have (keep=ReAdmitID Diag rename=(Diag=ReAdmitDiag)) point=_i_;

       if (ID=ReAdmitID) and (_n_ ne _i_) then do;

         original_observation=_n_;

         matching_observation=_i_;

         if Diag=ReadmitDiag then same_diagnosis='Y';

         else same_diagnosis='N';

         output;

      end;

   end;

run;

OK, maybe it's not really that simple but it's not terrible either.

robertrao
Quartz | Level 8

Hi,

I got the logic but not clear enough.

Could you explain me how can we assign _i_ to a matching observation??

Thanks

robertrao
Quartz | Level 8

Hi Team,

I am trying to execte my problem with all the different solutions.

I understood Art,s logic.

Could you explain to me why  should we write (_n_ ne _i_) in the below if statement.....

data want;

   set have (keep=ID Diag) nobs=_nobs_;

   do _i_=1 to _nobs_;

       set have (keep=ReAdmitID Diag rename=(Diag=ReAdmitDiag)) point=_i_;

       if (ID=ReAdmitID) and (_n_ ne _i_) then do;

         original_observation=_n_;

         matching_observation=_i_;

         if Diag=ReadmitDiag then same_diagnosis='Y';

         else same_diagnosis='N';

         output;

      end;

   end;

run;

Also if someone could explain me HAi's Hash logic..that would be great tooo..

For example..

h.definekey('_ID');

what is h.

is it a library

Thanks

Astounding
PROC Star

For this particular problem, it's overkill to check:

(_n_ ne _i_)

Basically, it prevents an observation from matching itself.  For some applications that's important, but for this one it won't make a difference.  The readmission ID would never match the original ID on the same observation.

robertrao
Quartz | Level 8

Exactly,

Thats the reason I was double checking.

Also I was wondering this could be done using  PRoc Format CNTLIN??

Thanks

Astounding
PROC Star

If the ID variable never has duplicates, it would be relatively straightforward to use PROC FORMAT.  But that wasn't clear from the limited amount of data.  Must ID take on unique values?

robertrao
Quartz | Level 8

YES....The ID variable is Unique.

But using PROC FORMAT i only know until extractling ReadmitIds from the ID variable using CNTLIN.

But I am not sure of the second part...checking to see if the readmits have the same DIAG as the beginning of admission

Regards

Astounding
PROC Star

OK.  I won't be able to work on this for a few hours, but I'd be happy to see anyone else code it.  The basic idea is that the CNTLIN data set translates ID into the observation number where that ID occurs.  Then march through the observations, and check in a DATA step whether the readmission ID appears in the format.  If so, use SET with POINT= to retrieve the matching observation.  Along the way, KEEP/RENAME variables as appropriate.

If nobody attempts this, I can get to it this afternoon.

Good luck.

robertrao
Quartz | Level 8

Great Help,

Thanks a ton

art297
Opal | Level 21

data have;

  input ID $  Diag   ReadmitID $;

  cards;

A     4            A'

B     3            B'

C     5           C'

A'     4 .

B'     4 .

C'     6 .

;

DATA fmtDataset;

  set have (drop=ReadmitID rename=(id=start diag=label));

  if index(start,"'");

  retain fmtname '$idcode' type 'C';

RUN;

PROC FORMAT CNTLIN=fmtDataset;

RUN;

DATA want;

  set have;

  if not index(id,"'");

  if put(ReadmitID,$idcode.) eq diag then same='yes';

  else same='no';

RUN;

Astounding
PROC Star

Art,

That's a much better idea ... just use the format to translate from ID into Diag.

robertrao
Quartz | Level 8

Thans a ton for your valuable time and effort.

I am curious. In the future if the alphabets are replaced by numbers  i can use the same code without the below if conditions in the code

IT STILL WORKS RIGHT???

if index(start,"'");

if not index(id,"'");

data have;

  input ID $  Diag   ReadmitID $;

  cards;

101     4            111

102     3            222

103    5            333

111    4 .

222    4 .

333     6 .

;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 21 replies
  • 1304 views
  • 6 likes
  • 6 in conversation