Hi all,
I am seeking some help in converting a free text variable that contains multiple observations into a numeric variable while expanding to a long data set with potentially multiple rows per person.
Basically, in my "have" dataset, the AE_free_text includes codes for adverse events that were entered with codes in free text. When an individual had more than one adverse event at a study visit, the data entry person inputted this all on the same row. I want to have a long dataset so that each adverse event is on one line per participant.
Thank you! I look forward to learning from your responses.
Have:
ID | Visit | AE_free_text |
1 | 1 | 1,4 |
2 | 1 | 1 |
3 | 1 | 2,4 |
4 | 1 | 1 |
5 | 1 | 1 |
6 | 1 | 3,9 |
7 | 1 | 3 |
8 | 1 | 2 |
Want:
ID | Visit | AE_num |
1 | 1 | 1 |
1 | 1 | 4 |
2 | 1 | 1 |
3 | 1 | 2 |
3 | 1 | 4 |
4 | 1 | 1 |
5 | 1 | 1 |
6 | 1 | 3 |
6 | 1 | 9 |
7 | 1 | 3 |
8 | 1 | 2 |
Sample data:
data have;
input id Visit AE_free_text;
cards;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;
A simple DO loop will do:
data have;
infile datalines dsd dlm="09"x truncover;
input ID $ Visit $ AE_free_text $;
datalines;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;
data want;
set have;
do i = 1 to countw(AE_free_text,",");
AE_num = input(scan(AE_free_text,i,","),10.);
output;
end;
drop i AE_free_text;
run;
Also note how "have" data is posted readily usable in a DATA step with DATALINES. Please do so in the future,
Here is one way to get the desired results:
data have;
input id Visit AE_free_text $;
cards;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;
run;
data _null_;
retain max 0 temp;
set have;
if _n_=1 then temp=countw(AE_free_text,',');
else do;
tempb=countw(AE_free_text,',');
max=max(temp,tempb,max);
end;
call symputx('total',max);
run;
data want(drop=ae_free_text i);
set have;
array text(*) text1-text&total;
do i=1 to dim(text);
text(i)=input(scan(ae_free_text,i,','),8.);
end;
run;
proc print;
run;
proc transpose data=want out=want1(drop=_name_ where=(ae_num1 ne .)) prefix=ae_num;
/* use the new variable name in the WHERE= clause */
by id visit;
var text:;
run;
proc print data=want1;
run;
A simple DO loop will do:
data have;
infile datalines dsd dlm="09"x truncover;
input ID $ Visit $ AE_free_text $;
datalines;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;
data want;
set have;
do i = 1 to countw(AE_free_text,",");
AE_num = input(scan(AE_free_text,i,","),10.);
output;
end;
drop i AE_free_text;
run;
Also note how "have" data is posted readily usable in a DATA step with DATALINES. Please do so in the future,
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.