I have a dataset with a list of observations of wart assessments that is rather long and I'd like to set it up for a reviewer to see each subject across a fat dataset. Do I need to do a transpose for each variable to make this work?
Turning this:
subject | visit | date | lesion | long | wide | n/a | location | grade |
01/001 | VISIT01 | 10-Oct-18 | 1 | 5 | 2 | . | E4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 2 | 4 | 2 | 0 | E4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 3 | 5 | 1 | 0 | D4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 4 | 6 | 1 | 0 | D4 | 3 |
01/001 | VISIT01 | 10-Oct-18 | 5 | 4 | 2 | 0 | A4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 6 | . | . | 1 | . | |
01/001 | VISIT02 | 17-Oct-18 | 1 | 5 | 2 | . | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 2 | 4 | 2 | 0 | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 3 | 5 | 1 | 0 | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 4 | 6 | 1 | 0 | 3 | |
01/001 | VISIT02 | 17-Oct-18 | 5 | 4 | 2 | 0 | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 6 | . | . | 1 | . | |
into this:
subject | visit | date | lesion | long | wide | n/a | location | grade | visit | date | lesion | long | wide | n/a | location | grade |
01/001 | VISIT01 | 10-Oct-18 | 1 | 5 | 2 | . | E4 | 2 | VISIT02 | 17-Oct-18 | 1 | 5 | 2 | . | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 2 | 4 | 2 | 0 | E4 | 2 | VISIT02 | 17-Oct-18 | 2 | 4 | 2 | 0 | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 3 | 5 | 1 | 0 | D4 | 2 | VISIT02 | 17-Oct-18 | 3 | 5 | 1 | 0 | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 4 | 6 | 1 | 0 | D4 | 3 | VISIT02 | 17-Oct-18 | 4 | 6 | 1 | 0 | 3 | |
01/001 | VISIT01 | 10-Oct-18 | 5 | 4 | 2 | 0 | A4 | 2 | VISIT02 | 17-Oct-18 | 5 | 4 | 2 | 0 | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 6 | . | . | 1 | . | VISIT02 | 17-Oct-18 | 6 | . | . | 1 | . |
Post your data as a complete data step we can work with rather than a cut-and-paste job.
@PSU_Sudzi wrote:
I have a dataset with a list of observations of wart assessments that is rather long and I'd like to set it up for a reviewer to see each subject across a fat dataset. Do I need to do a transpose for each variable to make this work?
Yes, you need to flip each variable on it's own and then merge the results, similar to this:
Example 5 here
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/
Or you can flip it to a long format and then a wide format:
https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd
Or you can use a custom macro written by users here:
@PSU_Sudzi wrote:
I have a dataset with a list of observations of wart assessments that is rather long and I'd like to set it up for a reviewer to see each subject across a fat dataset. Do I need to do a transpose for each variable to make this work?
Turning this:
subject visit date lesion long wide n/a location grade 01/001 VISIT01 10-Oct-18 1 5 2 . E4 2 01/001 VISIT01 10-Oct-18 2 4 2 0 E4 2 01/001 VISIT01 10-Oct-18 3 5 1 0 D4 2 01/001 VISIT01 10-Oct-18 4 6 1 0 D4 3 01/001 VISIT01 10-Oct-18 5 4 2 0 A4 2 01/001 VISIT01 10-Oct-18 6 . . 1 . 01/001 VISIT02 17-Oct-18 1 5 2 . 2 01/001 VISIT02 17-Oct-18 2 4 2 0 2 01/001 VISIT02 17-Oct-18 3 5 1 0 2 01/001 VISIT02 17-Oct-18 4 6 1 0 3 01/001 VISIT02 17-Oct-18 5 4 2 0 2 01/001 VISIT02 17-Oct-18 6 . . 1 .
into this:
subject visit date lesion long wide n/a location grade visit date lesion long wide n/a location grade 01/001 VISIT01 10-Oct-18 1 5 2 . E4 2 VISIT02 17-Oct-18 1 5 2 . 2 01/001 VISIT01 10-Oct-18 2 4 2 0 E4 2 VISIT02 17-Oct-18 2 4 2 0 2 01/001 VISIT01 10-Oct-18 3 5 1 0 D4 2 VISIT02 17-Oct-18 3 5 1 0 2 01/001 VISIT01 10-Oct-18 4 6 1 0 D4 3 VISIT02 17-Oct-18 4 6 1 0 3 01/001 VISIT01 10-Oct-18 5 4 2 0 A4 2 VISIT02 17-Oct-18 5 4 2 0 2 01/001 VISIT01 10-Oct-18 6 . . 1 . VISIT02 17-Oct-18 6 . . 1 .
Thank you Reeza!
It can be simple to do if there are only two visits per subject/lesion:
proc sort data=have; by subject lesion date; run;
data want;
merge
have(
where=(visit="VISIT01")
rename=(date=date01))
have(
where=(visit="VISIT02")
rename=(date=date02 long=long2 wide=wide2 n_a=n_a2 grade=grade2)
drop=location);
by subject lesion;
drop visit;
run;
Thank you PGStats, I actually have up to 13 visits for each subject so I assume I could use your code below and just keep adding visits onto it?
Solution through proc transpose...
data have;
infile datalines truncover;
input
subject $ visit $ date date9. lesion long wide n_a location $2. grade;
format date date9.;
datalines;
01001 VISIT01 10-Oct-18 1 5 2 . E4 2
01001 VISIT01 10-Oct-18 2 4 2 0 E4 2
01001 VISIT01 10-Oct-18 3 5 1 0 D4 2
01001 VISIT01 10-Oct-18 4 6 1 0 D4 3
01001 VISIT01 10-Oct-18 5 4 2 0 A4 2
01001 VISIT01 10-Oct-18 6 . . 1 .
01001 VISIT02 17-Oct-18 1 5 2 . 2
01001 VISIT02 17-Oct-18 2 4 2 0 2
01001 VISIT02 17-Oct-18 3 5 1 0 2
01001 VISIT02 17-Oct-18 4 6 1 0 3
01001 VISIT02 17-Oct-18 5 4 2 0 2
01001 VISIT02 17-Oct-18 6 . . 1 .
;
run;
proc transpose data=have out=have1 prefix=var;
by subject visit notsorted;
id lesion ;
var date long wide n_a location grade;
run;
proc transpose data=have1 out=have2 delimiter=_ name=lesion;
by subject notsorted;
id _name_ visit;
var var1-var6;
run;
data want (rename=(lesion1=lesion));
set have2;
lesion1=compress(lesion,'','kd');
drop lesion;
run;
Thank you singhsahab!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.