Hi all,
I am seeking some help in converting a free text variable that contains multiple observations into a numeric variable while expanding to a long data set with potentially multiple rows per person.
Basically, in my "have" dataset, the AE_free_text includes codes for adverse events that were entered with codes in free text. When an individual had more than one adverse event at a study visit, the data entry person inputted this all on the same row. I want to have a long dataset so that each adverse event is on one line per participant.
Thank you! I look forward to learning from your responses.
Have:
| ID | Visit | AE_free_text |
| 1 | 1 | 1,4 |
| 2 | 1 | 1 |
| 3 | 1 | 2,4 |
| 4 | 1 | 1 |
| 5 | 1 | 1 |
| 6 | 1 | 3,9 |
| 7 | 1 | 3 |
| 8 | 1 | 2 |
Want:
| ID | Visit | AE_num |
| 1 | 1 | 1 |
| 1 | 1 | 4 |
| 2 | 1 | 1 |
| 3 | 1 | 2 |
| 3 | 1 | 4 |
| 4 | 1 | 1 |
| 5 | 1 | 1 |
| 6 | 1 | 3 |
| 6 | 1 | 9 |
| 7 | 1 | 3 |
| 8 | 1 | 2 |
Sample data:
data have;
input id Visit AE_free_text;
cards;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;
A simple DO loop will do:
data have;
infile datalines dsd dlm="09"x truncover;
input ID $ Visit $ AE_free_text $;
datalines;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;
data want;
set have;
do i = 1 to countw(AE_free_text,",");
AE_num = input(scan(AE_free_text,i,","),10.);
output;
end;
drop i AE_free_text;
run;
Also note how "have" data is posted readily usable in a DATA step with DATALINES. Please do so in the future,
Here is one way to get the desired results:
data have;
input id Visit AE_free_text $;
cards;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;
run;
data _null_;
retain max 0 temp;
set have;
if _n_=1 then temp=countw(AE_free_text,',');
else do;
tempb=countw(AE_free_text,',');
max=max(temp,tempb,max);
end;
call symputx('total',max);
run;
data want(drop=ae_free_text i);
set have;
array text(*) text1-text&total;
do i=1 to dim(text);
text(i)=input(scan(ae_free_text,i,','),8.);
end;
run;
proc print;
run;
proc transpose data=want out=want1(drop=_name_ where=(ae_num1 ne .)) prefix=ae_num;
/* use the new variable name in the WHERE= clause */
by id visit;
var text:;
run;
proc print data=want1;
run;
A simple DO loop will do:
data have;
infile datalines dsd dlm="09"x truncover;
input ID $ Visit $ AE_free_text $;
datalines;
1 1 1,4
2 1 1
3 1 2,4
4 1 1
5 1 1
6 1 3,9
7 1 3
8 1 2
;
data want;
set have;
do i = 1 to countw(AE_free_text,",");
AE_num = input(scan(AE_free_text,i,","),10.);
output;
end;
drop i AE_free_text;
run;
Also note how "have" data is posted readily usable in a DATA step with DATALINES. Please do so in the future,
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.