Re: How to display a certain value from two linked records?

joshCRF · Posted 08-11-2019 07:38 AM

SUBJECT ID	NEWID	ADVERSE EVENT	AEID	LINKID
100		Dyspnoea	1
100	1	Dyspnoea	2	2
100		Dyspnoea	7	7

Struggled to think of a subject topic to correctly describe what I want to occur; so above is a small sample of a data set I am currently working with.

I would like to know it it's possible that when NEWID is equal to AEID (so the first and second observation) that the missing LINKID is populated with the LINKID where the NEWID is populated. So would be expecting that the missing LINKID would be populated with '2'.

While the above just show's one ADVERSE EVENT, I need this to be correctly be applied to different ADVERSE EVENTS where the numbers might be the same.

No code to provide as I am currently racking my brains as to how to get this to work!

Thanks in advance.

PaigeMiller · Posted 08-11-2019 07:47 AM

This short example is hard to generalize into something that can be coded, and that correctly answers the problem.

Is it always the case that NEWID is populated in the record immediately after AEID has the same value, or can the NEWID of 1 be in any record in the data set?

Can there be more than one record for which NEWID matches an AEID?

--
Paige Miller

joshCRF · Posted 08-11-2019 08:03 AM

SUBJECT ID	NEWID	ADVERSE EVENT	AEID	LINKID
101	7	Rash maculo-papular	5	5
101	12	Rash maculo-papular	7
101	15	Rash maculo-papular	12
101		Rash maculo-papular	15
101	19	Rash maculo-papular	18	18
101		Rash maculo-papular	19

So I've added a further example - within a certain ADVERSE EVENT, when the ADVERSE EVENT is linked the NEWID will always match up to AEID, generally it can fall on any record.

As you can see above the LINKID which I want to display on the linked ADVERSE EVENT is the first value of AEID within the linked ADVERSE EVENT, so I was just using the retain function to populate the missing LINKID's

However, as per the example in the first post, the retain function would be retaining a value from a previous (different ADVERSE EVENT) which is giving me incorrect results.

PaigeMiller · Posted 08-11-2019 08:56 AM

So I've added a further example - within a certain ADVERSE EVENT, when the ADVERSE EVENT is linked the NEWID will always match up to AEID, generally it can fall on any record.

Now you have given us two examples where this is not true, can we have an example where NEWID and AEID are not in consecutive records? Can you please provide the data following these instructions so that we may write a program that works on your actual sample data?

--
Paige Miller

joshCRF · Posted 08-11-2019 10:45 AM

I will have to attach an excel file, due to working off a company laptop the code that has been provided to convert to the dataset to SAS code will not work.

The attached excel file, presents all the records in the dataset. Which you should be able to import into SAS.

So... when NEWID is present (for the very first ADVERSE EVENT of a set of linked adverse events) the LINKID will equal the AEID.

The above has already been created in the dataset.

Then within the same linked adverse events, when NEWID is equal to the AEID, the LINKID needs to equal the LINKID of the first ADVERSE EVENT.

PaigeMiller · Posted 08-11-2019 04:25 PM

Some of us will not (or cannot) download Microsoft Office files, as they are a security threat.

So, to repeat my earlier request:

Can you please provide the data following these instructions so that we may write a program that works on your actual sample data?

--
Paige Miller

PGStats · Posted 08-12-2019 12:30 AM

It can be done in SAS/SQL with a correlated subquery :

data have;
input ID	NEWID	EVENT :$32.	AEID;
datalines;
101	7	Rash_maculo-papular	5	
101	12	Rash_maculo-papular	7	
101	15	Rash_maculo-papular	12	
101	 .	Rash_maculo-papular	15	
101	19	Rash_maculo-papular	18	
101	 .	Rash_maculo-papular	19	
;

proc sql;
select 
    *,
    case when NEWID in (select AEID from have where ID=a.ID and EVENT=a.EVENT) then AEID
    else . end as LINKID
from have as a;
quit;

            ID     NEWID  EVENT                                 AEID    LINKID
      ------------------------------------------------------------------------
           101         7  Rash_maculo-papular                      5         5
           101        12  Rash_maculo-papular                      7         7
           101        15  Rash_maculo-papular                     12        12
           101         .  Rash_maculo-papular                     15         .
           101        19  Rash_maculo-papular                     18        18
           101         .  Rash_maculo-papular                     19         .

PG

joshCRF · Posted 08-12-2019 04:30 AM

Thank you for the above response 🙂

However, I would expect it to look like follows (I have also added how it would look like the for EVENT mentioned at the beginning of this thread):

            ID     NEWID  EVENT                                 AEID    LINKID
      ------------------------------------------------------------------------
           101         7  Rash_maculo-papular                      5         5
           101        12  Rash_maculo-papular                      7         5
           101        15  Rash_maculo-papular                     12         5
           101         .  Rash_maculo-papular                     15         5
           101        19  Rash_maculo-papular                     18        18
           101         .  Rash_maculo-papular                     19        18
           
           102         .  Dyspnoea                                 1         2
           102         1  Dyspnoea                                 2         2

So for ID 101 when NEWID = 7 and EVENT = Rash_maculo-papular this is classed as the first occurence of the EVENT within the linked EVENTS - so LINKID is populated with 5 already.

And then for ID 101 when NEWID = 19 and EVENT = Rash_maculo-papular this is classed as the first occurence of the EVENT within the linked EVENTS - so LINKID is populated with 18 already.

So then when NEWID = AEID I'd like LINKID to be populated with the LINKID from the first occurence (as shown above)

This was done fine in the code I was using, when using a retain function. However, the second example I have given causes problems as I cannot use a retain function on that.

For ID 102 when NEWID = 1 and EVENT = DYSPNOEA this is classed as the first occurence of the EVENT, so LINKID is populated with 2 already for that observation. So I'm wanting the record above it, where the NEWID = AEID to be populated with the LINKID from the first occurence (which would be 2).

So I'm hopefully looking for a solution with the outcome of the table above.

I really appreciate the help so far!, it's just a really awkward one to try to explain, so hoping the above can clear it up abit 🙂

Ksharp · Posted 08-12-2019 09:37 AM

It is one to one linked or one to many linked?

The following could give you a start. Note : the code only suited for ONE id and ONE Event.

data have(keep=newid aeid rename=(newid=_end aeid=_start));
input ID	NEWID	EVENT :$32.	AEID;
datalines;
101	7	Rash_maculo-papular	5	
101	12	Rash_maculo-papular	7	
101	15	Rash_maculo-papular	12	
101	 .	Rash_maculo-papular	15	
101	19	Rash_maculo-papular	18	
101	 .	Rash_maculo-papular	19	
;

proc sql;
create table ancestor as
select * from have
 where _start not in (select _end from have);
quit;

 

data want(keep=path);
if _n_ eq 1 then do;
length path _path  $ 700 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

end;


set ancestor;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
*putlog 'WARNING:Found  ' _end;
   
pa.add();
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 _path=path;   
 _start=scan(path,-1,'|');
 rc=ha.find();   if rc ne 0 then output;
 do while(rc=0);
  if not findw(path,strip(_end),'|') then do;
   if length(path)+length(_end)+1 gt lengthc(path) then do;
    putlog 'ERROR: The length of path and _path are set too short';
    stop;
   end;
   
   *putlog 'WARNING:Found  ' _end;
   count+1;n=count;
   path=catx('|',path,_end);
   pa.add(); *output;
   path=_path;
 end;  
 else output; /* It is a circle.*/
  rc=ha.find_next();
end;
end;
pa.clear();

run;

joshCRF · Posted 08-14-2019 04:13 AM

Many thanks for all your responses, been a bit busy so only had chance to respond.

So, I managed to solve my issue with the following set statements:

data want;
  set have;
  set have(firstobs=2 keep=newid aeid linkid rename=(newid=x1 aeid=y1 linkid=z1))
  set have(obs=1 drop=_all_);

  if x1=aeid then linkid=z1;
run;

So for the rash maculo papular events I kept my retain, and then used the above to correct the population of the LINKID when NEWID doesn't run chronologically.

Happy to consider this closed now 🙂

Thanks alot.

Ksharp · Posted 08-14-2019 07:12 AM

You didn't answer my question.

It is one to one matched or one to many matched?

After running my code ,get WANT dataset ,there is variable PATH like:

5|7|12 ......

The first part 5 is what aeid you want. 7 12 is the linked id with 5 .

Merge it back to your dataset and get you final need.

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away