BookmarkSubscribeRSS Feed
joshCRF
Fluorite | Level 6
SUBJECT IDNEWIDADVERSE EVENTAEIDLINKID
100 Dyspnoea1 
1001Dyspnoea22
100 Dyspnoea77

 

Struggled to think of a subject topic to correctly describe what I want to occur; so above is a small sample of a data set I am currently working with.

 

I would like to know it it's possible that when NEWID is equal to AEID (so the first and second observation) that the missing LINKID is populated with the LINKID where the NEWID is populated. So would be expecting that the missing LINKID would be populated with '2'.

 

While the above just show's one ADVERSE EVENT, I need this to be correctly be applied to different ADVERSE EVENTS where the numbers might be the same.

 

No code to provide as I am currently racking my brains as to how to get this to work!

 

Thanks in advance.

10 REPLIES 10
PaigeMiller
Diamond | Level 26

This short example is hard to generalize into something that can be coded, and that correctly answers the problem.

 

Is it always the case that NEWID is populated in the record immediately after AEID has the same value, or can the NEWID of 1 be in any record in the data set?

 

Can there be more than one record for which NEWID matches an AEID?

--
Paige Miller
joshCRF
Fluorite | Level 6
SUBJECT IDNEWIDADVERSE EVENTAEIDLINKID
1017Rash maculo-papular55
10112Rash maculo-papular7 
10115Rash maculo-papular12 
101 Rash maculo-papular15 
10119Rash maculo-papular1818
101 Rash maculo-papular19 

 

So I've added a further example - within a certain ADVERSE EVENT, when the ADVERSE EVENT is linked the NEWID will always match up to AEID, generally it can fall on any record. 

 

As you can see above the LINKID which I want to display on the linked ADVERSE EVENT is the first value of AEID within the linked ADVERSE EVENT, so I was just using the retain function to populate the missing LINKID's

 

However, as per the example in the first post, the retain function would be retaining a value from a previous (different ADVERSE EVENT) which is giving me incorrect results.

PaigeMiller
Diamond | Level 26

So I've added a further example - within a certain ADVERSE EVENT, when the ADVERSE EVENT is linked the NEWID will always match up to AEID, generally it can fall on any record.

 

Now you have given us two examples where this is not true, can we have an example where NEWID and AEID are not in consecutive records? Can you please provide the data following these instructions so that we may write a program that works on your actual sample data?

--
Paige Miller
joshCRF
Fluorite | Level 6

I will have to attach an excel file, due to working off a company laptop the code that has been provided to convert to the dataset to SAS code will not work.

 

The attached excel file, presents all the records in the dataset. Which you should be able to import into SAS.

 

So... when NEWID is present (for the very first ADVERSE EVENT of a set of linked adverse events) the LINKID will equal the AEID.

The above has already been created in the dataset.

 

Then within the same linked adverse events, when NEWID is equal to the AEID, the LINKID needs to equal the LINKID of the first ADVERSE EVENT.

PaigeMiller
Diamond | Level 26

Some of us will not (or cannot) download Microsoft Office files, as they are a security threat.

 

So, to repeat my earlier request:

 

Can you please provide the data following these instructions so that we may write a program that works on your actual sample data?

--
Paige Miller
PGStats
Opal | Level 21

It can be done in SAS/SQL with a correlated subquery :

 

data have;
input ID	NEWID	EVENT :$32.	AEID;
datalines;
101	7	Rash_maculo-papular	5	
101	12	Rash_maculo-papular	7	
101	15	Rash_maculo-papular	12	
101	 .	Rash_maculo-papular	15	
101	19	Rash_maculo-papular	18	
101	 .	Rash_maculo-papular	19	
;

proc sql;
select 
    *,
    case when NEWID in (select AEID from have where ID=a.ID and EVENT=a.EVENT) then AEID
    else . end as LINKID
from have as a;
quit;
            ID     NEWID  EVENT                                 AEID    LINKID
      ------------------------------------------------------------------------
           101         7  Rash_maculo-papular                      5         5
           101        12  Rash_maculo-papular                      7         7
           101        15  Rash_maculo-papular                     12        12
           101         .  Rash_maculo-papular                     15         .
           101        19  Rash_maculo-papular                     18        18
           101         .  Rash_maculo-papular                     19         .
PG
joshCRF
Fluorite | Level 6

Thank you for the above response 🙂

 

However, I would expect it to look like follows (I have also added how it would look like the for EVENT mentioned at the beginning of this thread):

 

            ID     NEWID  EVENT                                 AEID    LINKID
      ------------------------------------------------------------------------
           101         7  Rash_maculo-papular                      5         5
           101        12  Rash_maculo-papular                      7         5
           101        15  Rash_maculo-papular                     12         5
           101         .  Rash_maculo-papular                     15         5
           101        19  Rash_maculo-papular                     18        18
           101         .  Rash_maculo-papular                     19        18

102 . Dyspnoea 1 2
102 1 Dyspnoea 2 2

 

So for ID 101 when NEWID = 7 and EVENT = Rash_maculo-papular this is classed as the first occurence of the EVENT within the linked EVENTS - so LINKID is populated with 5 already.

 

And then for ID 101 when NEWID = 19 and EVENT = Rash_maculo-papular this is classed as the first occurence of the EVENT within the linked EVENTS - so LINKID is populated with 18 already.

 

So then when NEWID = AEID I'd like LINKID to be populated with the LINKID from the first occurence (as shown above)

 

This was done fine in the code I was using, when using a retain function. However, the second example I have given causes problems as I cannot use a retain function on that.

 

For ID 102 when NEWID = 1 and EVENT = DYSPNOEA this is classed as the first occurence of the EVENT, so LINKID is populated with 2 already for that observation. So I'm wanting the record above it, where the NEWID = AEID to be populated with the LINKID from the first occurence (which would be 2).

 

So I'm hopefully looking for a solution with the outcome of the table above.

 

I really appreciate the help so far!, it's just a really awkward one to try to explain, so hoping the above can clear it up abit 🙂

 

 

Ksharp
Super User

It is one to one linked or one to many linked?

The following could give you a start. Note : the code only suited for ONE id and ONE Event.

 

data have(keep=newid aeid rename=(newid=_end aeid=_start));
input ID	NEWID	EVENT :$32.	AEID;
datalines;
101	7	Rash_maculo-papular	5	
101	12	Rash_maculo-papular	7	
101	15	Rash_maculo-papular	12	
101	 .	Rash_maculo-papular	15	
101	19	Rash_maculo-papular	18	
101	 .	Rash_maculo-papular	19	
;

proc sql;
create table ancestor as
select * from have
 where _start not in (select _end from have);
quit;

 

data want(keep=path);
if _n_ eq 1 then do;
length path _path  $ 700 ;
if 0 then set have;
declare hash ha(hashexp:20,dataset:'have(where=(_start is not missing and _end is not missing))',multidata:'y');
ha.definekey('_start');
ha.definedata('_end');
ha.definedone();

declare hash pa(ordered:'y');
declare hiter hi_path('pa');
pa.definekey('n');
pa.definedata('n','path');
pa.definedone();

end;


set ancestor;
count=1;n=1;_n=1;
path=catx('|',_start,_end);
*putlog 'WARNING:Found  ' _end;
   
pa.add();
do while(hi_path.next()=0);
 if n ne 1 then pa.remove(key:_n);_n=n;
 _path=path;   
 _start=scan(path,-1,'|');
 rc=ha.find();   if rc ne 0 then output;
 do while(rc=0);
  if not findw(path,strip(_end),'|') then do;
   if length(path)+length(_end)+1 gt lengthc(path) then do;
    putlog 'ERROR: The length of path and _path are set too short';
    stop;
   end;
   
   *putlog 'WARNING:Found  ' _end;
   count+1;n=count;
   path=catx('|',path,_end);
   pa.add(); *output;
   path=_path;
 end;  
 else output; /* It is a circle.*/
  rc=ha.find_next();
end;
end;
pa.clear();

run;

 

joshCRF
Fluorite | Level 6

Many thanks for all your responses, been a bit busy so only had chance to respond.

 

So, I managed to solve my issue with the following set statements:

 

data want;
set have;
set have(firstobs=2 keep=newid aeid linkid rename=(newid=x1 aeid=y1 linkid=z1))
set have(obs=1 drop=_all_);

if x1=aeid then linkid=z1;
run;

 

So for the rash maculo papular events I kept my retain, and then used the above to correct the population of the LINKID when NEWID doesn't run chronologically.

 

Happy to consider this closed now 🙂

 

Thanks alot.

Ksharp
Super User

You didn't answer my question.

It is one to one matched or one to many matched?

 

After running my code ,get WANT dataset ,there is variable PATH like:

5|7|12 ......

 

The first part 5 is what aeid you want. 7 12 is the linked id with 5 .

Merge it back to your dataset and get you final need.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1140 views
  • 0 likes
  • 4 in conversation