The SAS Output Delivery System and reporting techniques

Retaining Lines and Combining into a New Variable

Accepted Solution Solved
Reply
Regular Contributor
Posts: 237
Accepted Solution

Retaining Lines and Combining into a New Variable

I have a dataset that came in this form:

IDReactionTreatmentDose Level
1Reaction 1Drug A + Drug BA mg
1Reaction 1Drug A + Drug BB mg
2Reaction 1Drug A + Drug BA mg
2Reaction 1Drug A + Drug BB mg
2Reaction 2Drug A + Drug BA mg
2Reaction 2Drug A + Drug BB mg
2Reaction 3Drug A + Drug BA mg
2Reaction 3Drug A + Drug BB mg

For these patients that received two drugs, they received one line per drug dose per reaction.  For example, Patient #2 had 3 reactions and received 2 drugs, so she has 6 (3x2) line items.  I am trying to condense the dataset to look like this for output:

IDReactionTreatmentDose Level
1Reaction 1Drug A + Drug BA mg / B mg
2Reaction 1; Reaction 2; Reaction 3Drug A + Drug BA mg / B mg


I am sure I could use the RETAIN statement, but I am very unfamiliar with it, and I cannot get it work out.  I wanted to try PROC TRANSPOSE, but that turned out to be a bigger mess than it was worth.  Any help would be greatly appreciated!


Accepted Solutions
Solution
‎08-09-2011 02:43 PM
Frequent Contributor
Posts: 104

Retaining Lines and Combining into a New Variable

Posted in reply to djbateman

It sounds like you need to collect all the reactions within a dose level within treatment for each ID.

If so, this will solve your problem:

data temp;

    input ID $ @3 Reaction $char10. +1 Treatment $char15. +1 Dose $char4.;

cards;

1 Reaction 1 Drug A + Drug B A mg

1 Reaction 1 Drug A + Drug B B mg

2 Reaction 1 Drug A + Drug B A mg

2 Reaction 1 Drug A + Drug B B mg

2 Reaction 2 Drug A + Drug B A mg

2 Reaction 2 Drug A + Drug B B mg

2 Reaction 3 Drug A + Drug B A mg

2 Reaction 3 Drug A + Drug B B mg

;

proc sort;

    by id reaction treatment dose;

run;

data temp1;

    set temp;

    by id reaction treatment;

    retain doses;   length Doses $ 20;

    if first.treatment then doses='';

    doses = catx(';', doses, dose);

    if last.treatment then output;

    drop dose;

run;

proc sort;

    by id treatment doses reaction;

run;

data temp2;

    set temp1;

    by id treatment doses;

    retain reactions;   length reactions $ 80;

    if first.doses then reactions='';

    reactions = catx(';', reactions, reaction);

    if last.doses then output;

    drop reaction;

run;

The first sort + data step collects the treatment-dose that give rise to a particular reaction.

The second sort+data step collects for each treatment-doses regime all of the reactions.

Hope this helps.

View solution in original post


All Replies
Solution
‎08-09-2011 02:43 PM
Frequent Contributor
Posts: 104

Retaining Lines and Combining into a New Variable

Posted in reply to djbateman

It sounds like you need to collect all the reactions within a dose level within treatment for each ID.

If so, this will solve your problem:

data temp;

    input ID $ @3 Reaction $char10. +1 Treatment $char15. +1 Dose $char4.;

cards;

1 Reaction 1 Drug A + Drug B A mg

1 Reaction 1 Drug A + Drug B B mg

2 Reaction 1 Drug A + Drug B A mg

2 Reaction 1 Drug A + Drug B B mg

2 Reaction 2 Drug A + Drug B A mg

2 Reaction 2 Drug A + Drug B B mg

2 Reaction 3 Drug A + Drug B A mg

2 Reaction 3 Drug A + Drug B B mg

;

proc sort;

    by id reaction treatment dose;

run;

data temp1;

    set temp;

    by id reaction treatment;

    retain doses;   length Doses $ 20;

    if first.treatment then doses='';

    doses = catx(';', doses, dose);

    if last.treatment then output;

    drop dose;

run;

proc sort;

    by id treatment doses reaction;

run;

data temp2;

    set temp1;

    by id treatment doses;

    retain reactions;   length reactions $ 80;

    if first.doses then reactions='';

    reactions = catx(';', reactions, reaction);

    if last.doses then output;

    drop reaction;

run;

The first sort + data step collects the treatment-dose that give rise to a particular reaction.

The second sort+data step collects for each treatment-doses regime all of the reactions.

Hope this helps.

PROC Star
Posts: 7,474

Retaining Lines and Combining into a New Variable

One thing DLings proposed code doesn't cover is a duplicate entry, such as:

data have;

  informat reaction $10.;

  informat treatment $20.;

  informat dose_level $5.;

  input ID Reaction & Treatment & Dose_Level &;

  cards;

1 Reaction 1  Drug A + Drug B  A mg

1 Reaction 1  Drug A + Drug B  B mg

2 Reaction 1  Drug A + Drug B  A mg

2 Reaction 1  Drug A + Drug B  A mg

2 Reaction 1  Drug A + Drug B  B mg

2 Reaction 2  Drug A + Drug B  A mg

2 Reaction 2  Drug A + Drug B  B mg

2 Reaction 3  Drug A + Drug B  A mg

2 Reaction 3  Drug A + Drug B  B mg

;

proc sql noprint;

  create table need1 as

    select distinct *

      from have

        group by id, reaction, treatment, dose_level

          having min(id) gt 0

;

  create table need2 as

    select distinct id,treatment,reaction

      from need1

        group by id, treatment

          having min(id) gt 0

;

  create table need3 as

    select distinct id,treatment,dose_level

      from need1

        group by id, treatment

          having min(id) gt 0

;

quit;

proc transpose data=need2 out=need4 (drop=_Smiley Happy prefix=reaction;

  var reaction;

  by id treatment;

run;

proc transpose data=need3 out=need5 (drop=_Smiley Happy prefix=dose;

  var dose_level;

  by id treatment;

run;

data want (drop=reaction1-reaction3 dose1-dose2);

  set need4;

  set need5;

  reaction=catx(',',of reaction1-reaction3);

  dose_level=catx('/',of dose1-dose2);

run;

I don't mean to imply that my approach is any better, but it does correct for that problem.

Art

Frequent Contributor
Posts: 104

Retaining Lines and Combining into a New Variable

Thanks Art.  I was thinking about that after my quick reply.  More context around how the data is actually structured is needed, but the main ideas are all here.

Super User
Posts: 10,028

Retaining Lines and Combining into a New Variable

Posted in reply to djbateman
data temp;
    input ID $ @3 Reaction $char10. +1 Treatment $char15. +1 Dose $char4.;
cards;
1 Reaction 1 Drug A + Drug B A mg
1 Reaction 1 Drug A + Drug B B mg
2 Reaction 1 Drug A + Drug B A mg
2 Reaction 1 Drug A + Drug B B mg
2 Reaction 2 Drug A + Drug B A mg
2 Reaction 2 Drug A + Drug B B mg
2 Reaction 3 Drug A + Drug B A mg
2 Reaction 3 Drug A + Drug B B mg
;

proc sort;
    by id reaction treatment dose;
run;
data want;
 set temp;
 by id;
 retain _reaction _treatment _dose;
 length _reaction _treatment _dose $ 100 ;
 if first.id then call missing (of _:);
 if not find(_reaction,reaction) then _reaction=catx(';',_reaction,reaction);
 if not find(_treatment,treatment) then _treatment=catx(' ',_treatment,treatment);
 if not find(_dose,dose) then _dose=catx('/',_dose,dose);
 if last.id then output;
 drop  reaction treatment dose ;
run;

Ksharp

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 175 views
  • 3 likes
  • 4 in conversation