BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
djbateman
Lapis Lazuli | Level 10

I have a dataset that came in this form:

IDReactionTreatmentDose Level
1Reaction 1Drug A + Drug BA mg
1Reaction 1Drug A + Drug BB mg
2Reaction 1Drug A + Drug BA mg
2Reaction 1Drug A + Drug BB mg
2Reaction 2Drug A + Drug BA mg
2Reaction 2Drug A + Drug BB mg
2Reaction 3Drug A + Drug BA mg
2Reaction 3Drug A + Drug BB mg

For these patients that received two drugs, they received one line per drug dose per reaction.  For example, Patient #2 had 3 reactions and received 2 drugs, so she has 6 (3x2) line items.  I am trying to condense the dataset to look like this for output:

IDReactionTreatmentDose Level
1Reaction 1Drug A + Drug BA mg / B mg
2Reaction 1; Reaction 2; Reaction 3Drug A + Drug BA mg / B mg


I am sure I could use the RETAIN statement, but I am very unfamiliar with it, and I cannot get it work out.  I wanted to try PROC TRANSPOSE, but that turned out to be a bigger mess than it was worth.  Any help would be greatly appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
DLing
Obsidian | Level 7

It sounds like you need to collect all the reactions within a dose level within treatment for each ID.

If so, this will solve your problem:

data temp;

    input ID $ @3 Reaction $char10. +1 Treatment $char15. +1 Dose $char4.;

cards;

1 Reaction 1 Drug A + Drug B A mg

1 Reaction 1 Drug A + Drug B B mg

2 Reaction 1 Drug A + Drug B A mg

2 Reaction 1 Drug A + Drug B B mg

2 Reaction 2 Drug A + Drug B A mg

2 Reaction 2 Drug A + Drug B B mg

2 Reaction 3 Drug A + Drug B A mg

2 Reaction 3 Drug A + Drug B B mg

;

proc sort;

    by id reaction treatment dose;

run;

data temp1;

    set temp;

    by id reaction treatment;

    retain doses;   length Doses $ 20;

    if first.treatment then doses='';

    doses = catx(';', doses, dose);

    if last.treatment then output;

    drop dose;

run;

proc sort;

    by id treatment doses reaction;

run;

data temp2;

    set temp1;

    by id treatment doses;

    retain reactions;   length reactions $ 80;

    if first.doses then reactions='';

    reactions = catx(';', reactions, reaction);

    if last.doses then output;

    drop reaction;

run;

The first sort + data step collects the treatment-dose that give rise to a particular reaction.

The second sort+data step collects for each treatment-doses regime all of the reactions.

Hope this helps.

View solution in original post

4 REPLIES 4
DLing
Obsidian | Level 7

It sounds like you need to collect all the reactions within a dose level within treatment for each ID.

If so, this will solve your problem:

data temp;

    input ID $ @3 Reaction $char10. +1 Treatment $char15. +1 Dose $char4.;

cards;

1 Reaction 1 Drug A + Drug B A mg

1 Reaction 1 Drug A + Drug B B mg

2 Reaction 1 Drug A + Drug B A mg

2 Reaction 1 Drug A + Drug B B mg

2 Reaction 2 Drug A + Drug B A mg

2 Reaction 2 Drug A + Drug B B mg

2 Reaction 3 Drug A + Drug B A mg

2 Reaction 3 Drug A + Drug B B mg

;

proc sort;

    by id reaction treatment dose;

run;

data temp1;

    set temp;

    by id reaction treatment;

    retain doses;   length Doses $ 20;

    if first.treatment then doses='';

    doses = catx(';', doses, dose);

    if last.treatment then output;

    drop dose;

run;

proc sort;

    by id treatment doses reaction;

run;

data temp2;

    set temp1;

    by id treatment doses;

    retain reactions;   length reactions $ 80;

    if first.doses then reactions='';

    reactions = catx(';', reactions, reaction);

    if last.doses then output;

    drop reaction;

run;

The first sort + data step collects the treatment-dose that give rise to a particular reaction.

The second sort+data step collects for each treatment-doses regime all of the reactions.

Hope this helps.

art297
Opal | Level 21

One thing DLings proposed code doesn't cover is a duplicate entry, such as:

data have;

  informat reaction $10.;

  informat treatment $20.;

  informat dose_level $5.;

  input ID Reaction & Treatment & Dose_Level &;

  cards;

1 Reaction 1  Drug A + Drug B  A mg

1 Reaction 1  Drug A + Drug B  B mg

2 Reaction 1  Drug A + Drug B  A mg

2 Reaction 1  Drug A + Drug B  A mg

2 Reaction 1  Drug A + Drug B  B mg

2 Reaction 2  Drug A + Drug B  A mg

2 Reaction 2  Drug A + Drug B  B mg

2 Reaction 3  Drug A + Drug B  A mg

2 Reaction 3  Drug A + Drug B  B mg

;

proc sql noprint;

  create table need1 as

    select distinct *

      from have

        group by id, reaction, treatment, dose_level

          having min(id) gt 0

;

  create table need2 as

    select distinct id,treatment,reaction

      from need1

        group by id, treatment

          having min(id) gt 0

;

  create table need3 as

    select distinct id,treatment,dose_level

      from need1

        group by id, treatment

          having min(id) gt 0

;

quit;

proc transpose data=need2 out=need4 (drop=_:) prefix=reaction;

  var reaction;

  by id treatment;

run;

proc transpose data=need3 out=need5 (drop=_:) prefix=dose;

  var dose_level;

  by id treatment;

run;

data want (drop=reaction1-reaction3 dose1-dose2);

  set need4;

  set need5;

  reaction=catx(',',of reaction1-reaction3);

  dose_level=catx('/',of dose1-dose2);

run;

I don't mean to imply that my approach is any better, but it does correct for that problem.

Art

DLing
Obsidian | Level 7

Thanks Art.  I was thinking about that after my quick reply.  More context around how the data is actually structured is needed, but the main ideas are all here.

Ksharp
Super User
data temp;
    input ID $ @3 Reaction $char10. +1 Treatment $char15. +1 Dose $char4.;
cards;
1 Reaction 1 Drug A + Drug B A mg
1 Reaction 1 Drug A + Drug B B mg
2 Reaction 1 Drug A + Drug B A mg
2 Reaction 1 Drug A + Drug B B mg
2 Reaction 2 Drug A + Drug B A mg
2 Reaction 2 Drug A + Drug B B mg
2 Reaction 3 Drug A + Drug B A mg
2 Reaction 3 Drug A + Drug B B mg
;

proc sort;
    by id reaction treatment dose;
run;
data want;
 set temp;
 by id;
 retain _reaction _treatment _dose;
 length _reaction _treatment _dose $ 100 ;
 if first.id then call missing (of _:);
 if not find(_reaction,reaction) then _reaction=catx(';',_reaction,reaction);
 if not find(_treatment,treatment) then _treatment=catx(' ',_treatment,treatment);
 if not find(_dose,dose) then _dose=catx('/',_dose,dose);
 if last.id then output;
 drop  reaction treatment dose ;
run;

Ksharp

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 762 views
  • 3 likes
  • 4 in conversation