BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sofia_de_garay
Fluorite | Level 6

Hi all,

I am seeking some help in converting a free text variable that contains multiple observations into a numeric variable while expanding to a long data set with potentially multiple rows per person.

 

Basically, in my "have" dataset, the AE_free_text includes codes for adverse events that were entered with codes in free text. When an individual had more than one adverse event at a study visit, the data entry person inputted this all on the same row. I want to have a long dataset so that each adverse event is on one line per participant.

 

Thank you! I look forward to learning from your responses.  

 

 

Have: 

IDVisitAE_free_text
111,4
211
312,4
411
511
613,9
713
812

 

Want: 

IDVisitAE_num
111
114
211
312
314
411
511
613
619
713
812

 

 

Sample data:

data have;
  input id Visit AE_free_text;
cards;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

A simple DO loop will do:

data have;
infile datalines dsd dlm="09"x truncover;
input ID $ Visit	 $ AE_free_text $;
datalines;
1	1	1,4
2	1	1
3	1	2,4
4	1	1
5	1	1
6	1	3,9
7	1	3
8	1	2
;

data want;
set have;
do i = 1 to countw(AE_free_text,",");
  AE_num = input(scan(AE_free_text,i,","),10.);
  output;
end;
drop i AE_free_text;
run;

Also note how "have" data is posted readily usable in a DATA step with DATALINES. Please do so in the future,

View solution in original post

4 REPLIES 4
Kathryn_SAS
SAS Employee

Here is one way to get the desired results:

data have;
input id Visit AE_free_text $;
cards;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;
run;

data _null_;
retain max 0 temp;
set have;
if _n_=1 then temp=countw(AE_free_text,',');
else do;
tempb=countw(AE_free_text,',');
max=max(temp,tempb,max);
end;
call symputx('total',max);
run;

data want(drop=ae_free_text i);
set have;
array text(*) text1-text&total;
do i=1 to dim(text);
 text(i)=input(scan(ae_free_text,i,','),8.);
end;
run;

proc print;
run;

proc transpose data=want out=want1(drop=_name_ where=(ae_num1 ne .)) prefix=ae_num;
/* use the new variable name in the WHERE= clause */
by id visit;
var text:;
run;

proc print data=want1;
run;
sofia_de_garay
Fluorite | Level 6
Thank you! I learned a lot by going through your solution.
Kurt_Bremser
Super User

A simple DO loop will do:

data have;
infile datalines dsd dlm="09"x truncover;
input ID $ Visit	 $ AE_free_text $;
datalines;
1	1	1,4
2	1	1
3	1	2,4
4	1	1
5	1	1
6	1	3,9
7	1	3
8	1	2
;

data want;
set have;
do i = 1 to countw(AE_free_text,",");
  AE_num = input(scan(AE_free_text,i,","),10.);
  output;
end;
drop i AE_free_text;
run;

Also note how "have" data is posted readily usable in a DATA step with DATALINES. Please do so in the future,

sofia_de_garay
Fluorite | Level 6
Very helpful! Thank you for helping me learn.