Retaining Lines and Combining into a New Variable

Solved
Regular Contributor
Posts: 247

Retaining Lines and Combining into a New Variable

I have a dataset that came in this form:

IDReactionTreatmentDose Level
1Reaction 1Drug A + Drug BA mg
1Reaction 1Drug A + Drug BB mg
2Reaction 1Drug A + Drug BA mg
2Reaction 1Drug A + Drug BB mg
2Reaction 2Drug A + Drug BA mg
2Reaction 2Drug A + Drug BB mg
2Reaction 3Drug A + Drug BA mg
2Reaction 3Drug A + Drug BB mg

For these patients that received two drugs, they received one line per drug dose per reaction.  For example, Patient #2 had 3 reactions and received 2 drugs, so she has 6 (3x2) line items.  I am trying to condense the dataset to look like this for output:

IDReactionTreatmentDose Level
1Reaction 1Drug A + Drug BA mg / B mg
2Reaction 1; Reaction 2; Reaction 3Drug A + Drug BA mg / B mg

I am sure I could use the RETAIN statement, but I am very unfamiliar with it, and I cannot get it work out.  I wanted to try PROC TRANSPOSE, but that turned out to be a bigger mess than it was worth.  Any help would be greatly appreciated!

Accepted Solutions
Solution
‎08-09-2011 02:43 PM
Frequent Contributor
Posts: 104

Retaining Lines and Combining into a New Variable

Posted in reply to djbateman

It sounds like you need to collect all the reactions within a dose level within treatment for each ID.

If so, this will solve your problem:

data temp;

input ID \$ @3 Reaction \$char10. +1 Treatment \$char15. +1 Dose \$char4.;

cards;

1 Reaction 1 Drug A + Drug B A mg

1 Reaction 1 Drug A + Drug B B mg

2 Reaction 1 Drug A + Drug B A mg

2 Reaction 1 Drug A + Drug B B mg

2 Reaction 2 Drug A + Drug B A mg

2 Reaction 2 Drug A + Drug B B mg

2 Reaction 3 Drug A + Drug B A mg

2 Reaction 3 Drug A + Drug B B mg

;

proc sort;

by id reaction treatment dose;

run;

data temp1;

set temp;

by id reaction treatment;

retain doses;   length Doses \$ 20;

if first.treatment then doses='';

doses = catx(';', doses, dose);

if last.treatment then output;

drop dose;

run;

proc sort;

by id treatment doses reaction;

run;

data temp2;

set temp1;

by id treatment doses;

retain reactions;   length reactions \$ 80;

if first.doses then reactions='';

reactions = catx(';', reactions, reaction);

if last.doses then output;

drop reaction;

run;

The first sort + data step collects the treatment-dose that give rise to a particular reaction.

The second sort+data step collects for each treatment-doses regime all of the reactions.

Hope this helps.

All Replies
Solution
‎08-09-2011 02:43 PM
Frequent Contributor
Posts: 104

Retaining Lines and Combining into a New Variable

Posted in reply to djbateman

It sounds like you need to collect all the reactions within a dose level within treatment for each ID.

If so, this will solve your problem:

data temp;

input ID \$ @3 Reaction \$char10. +1 Treatment \$char15. +1 Dose \$char4.;

cards;

1 Reaction 1 Drug A + Drug B A mg

1 Reaction 1 Drug A + Drug B B mg

2 Reaction 1 Drug A + Drug B A mg

2 Reaction 1 Drug A + Drug B B mg

2 Reaction 2 Drug A + Drug B A mg

2 Reaction 2 Drug A + Drug B B mg

2 Reaction 3 Drug A + Drug B A mg

2 Reaction 3 Drug A + Drug B B mg

;

proc sort;

by id reaction treatment dose;

run;

data temp1;

set temp;

by id reaction treatment;

retain doses;   length Doses \$ 20;

if first.treatment then doses='';

doses = catx(';', doses, dose);

if last.treatment then output;

drop dose;

run;

proc sort;

by id treatment doses reaction;

run;

data temp2;

set temp1;

by id treatment doses;

retain reactions;   length reactions \$ 80;

if first.doses then reactions='';

reactions = catx(';', reactions, reaction);

if last.doses then output;

drop reaction;

run;

The first sort + data step collects the treatment-dose that give rise to a particular reaction.

The second sort+data step collects for each treatment-doses regime all of the reactions.

Hope this helps.

PROC Star
Posts: 8,163

Retaining Lines and Combining into a New Variable

One thing DLings proposed code doesn't cover is a duplicate entry, such as:

data have;

informat reaction \$10.;

informat treatment \$20.;

informat dose_level \$5.;

input ID Reaction & Treatment & Dose_Level &;

cards;

1 Reaction 1  Drug A + Drug B  A mg

1 Reaction 1  Drug A + Drug B  B mg

2 Reaction 1  Drug A + Drug B  A mg

2 Reaction 1  Drug A + Drug B  A mg

2 Reaction 1  Drug A + Drug B  B mg

2 Reaction 2  Drug A + Drug B  A mg

2 Reaction 2  Drug A + Drug B  B mg

2 Reaction 3  Drug A + Drug B  A mg

2 Reaction 3  Drug A + Drug B  B mg

;

proc sql noprint;

create table need1 as

select distinct *

from have

group by id, reaction, treatment, dose_level

having min(id) gt 0

;

create table need2 as

select distinct id,treatment,reaction

from need1

group by id, treatment

having min(id) gt 0

;

create table need3 as

select distinct id,treatment,dose_level

from need1

group by id, treatment

having min(id) gt 0

;

quit;

proc transpose data=need2 out=need4 (drop=_ prefix=reaction;

var reaction;

by id treatment;

run;

proc transpose data=need3 out=need5 (drop=_ prefix=dose;

var dose_level;

by id treatment;

run;

data want (drop=reaction1-reaction3 dose1-dose2);

set need4;

set need5;

reaction=catx(',',of reaction1-reaction3);

dose_level=catx('/',of dose1-dose2);

run;

I don't mean to imply that my approach is any better, but it does correct for that problem.

Art

Frequent Contributor
Posts: 104

Retaining Lines and Combining into a New Variable

Thanks Art.  I was thinking about that after my quick reply.  More context around how the data is actually structured is needed, but the main ideas are all here.

Super User
Posts: 10,766

Retaining Lines and Combining into a New Variable

Posted in reply to djbateman
```data temp;
input ID \$ @3 Reaction \$char10. +1 Treatment \$char15. +1 Dose \$char4.;
cards;
1 Reaction 1 Drug A + Drug B A mg
1 Reaction 1 Drug A + Drug B B mg
2 Reaction 1 Drug A + Drug B A mg
2 Reaction 1 Drug A + Drug B B mg
2 Reaction 2 Drug A + Drug B A mg
2 Reaction 2 Drug A + Drug B B mg
2 Reaction 3 Drug A + Drug B A mg
2 Reaction 3 Drug A + Drug B B mg
;

proc sort;
by id reaction treatment dose;
run;
data want;
set temp;
by id;
retain _reaction _treatment _dose;
length _reaction _treatment _dose \$ 100 ;
if first.id then call missing (of _:);
if not find(_reaction,reaction) then _reaction=catx(';',_reaction,reaction);
if not find(_treatment,treatment) then _treatment=catx(' ',_treatment,treatment);
if not find(_dose,dose) then _dose=catx('/',_dose,dose);
if last.id then output;
drop  reaction treatment dose ;
run;
```

Ksharp

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
• 4 replies
• 191 views
• 3 likes
• 4 in conversation