I have a dataset with a list of observations of wart assessments that is rather long and I'd like to set it up for a reviewer to see each subject across a fat dataset. Do I need to do a transpose for each variable to make this work?
Turning this:
subject | visit | date | lesion | long | wide | n/a | location | grade |
01/001 | VISIT01 | 10-Oct-18 | 1 | 5 | 2 | . | E4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 2 | 4 | 2 | 0 | E4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 3 | 5 | 1 | 0 | D4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 4 | 6 | 1 | 0 | D4 | 3 |
01/001 | VISIT01 | 10-Oct-18 | 5 | 4 | 2 | 0 | A4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 6 | . | . | 1 | . | |
01/001 | VISIT02 | 17-Oct-18 | 1 | 5 | 2 | . | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 2 | 4 | 2 | 0 | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 3 | 5 | 1 | 0 | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 4 | 6 | 1 | 0 | 3 | |
01/001 | VISIT02 | 17-Oct-18 | 5 | 4 | 2 | 0 | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 6 | . | . | 1 | . | |
into this:
subject | visit | date | lesion | long | wide | n/a | location | grade | visit | date | lesion | long | wide | n/a | location | grade |
01/001 | VISIT01 | 10-Oct-18 | 1 | 5 | 2 | . | E4 | 2 | VISIT02 | 17-Oct-18 | 1 | 5 | 2 | . | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 2 | 4 | 2 | 0 | E4 | 2 | VISIT02 | 17-Oct-18 | 2 | 4 | 2 | 0 | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 3 | 5 | 1 | 0 | D4 | 2 | VISIT02 | 17-Oct-18 | 3 | 5 | 1 | 0 | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 4 | 6 | 1 | 0 | D4 | 3 | VISIT02 | 17-Oct-18 | 4 | 6 | 1 | 0 | 3 | |
01/001 | VISIT01 | 10-Oct-18 | 5 | 4 | 2 | 0 | A4 | 2 | VISIT02 | 17-Oct-18 | 5 | 4 | 2 | 0 | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 6 | . | . | 1 | . | VISIT02 | 17-Oct-18 | 6 | . | . | 1 | . |
Post your data as a complete data step we can work with rather than a cut-and-paste job.
@PSU_Sudzi wrote:
I have a dataset with a list of observations of wart assessments that is rather long and I'd like to set it up for a reviewer to see each subject across a fat dataset. Do I need to do a transpose for each variable to make this work?
Yes, you need to flip each variable on it's own and then merge the results, similar to this:
Example 5 here
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/
Or you can flip it to a long format and then a wide format:
https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd
Or you can use a custom macro written by users here:
@PSU_Sudzi wrote:
I have a dataset with a list of observations of wart assessments that is rather long and I'd like to set it up for a reviewer to see each subject across a fat dataset. Do I need to do a transpose for each variable to make this work?
Turning this:
subject visit date lesion long wide n/a location grade 01/001 VISIT01 10-Oct-18 1 5 2 . E4 2 01/001 VISIT01 10-Oct-18 2 4 2 0 E4 2 01/001 VISIT01 10-Oct-18 3 5 1 0 D4 2 01/001 VISIT01 10-Oct-18 4 6 1 0 D4 3 01/001 VISIT01 10-Oct-18 5 4 2 0 A4 2 01/001 VISIT01 10-Oct-18 6 . . 1 . 01/001 VISIT02 17-Oct-18 1 5 2 . 2 01/001 VISIT02 17-Oct-18 2 4 2 0 2 01/001 VISIT02 17-Oct-18 3 5 1 0 2 01/001 VISIT02 17-Oct-18 4 6 1 0 3 01/001 VISIT02 17-Oct-18 5 4 2 0 2 01/001 VISIT02 17-Oct-18 6 . . 1 .
into this:
subject visit date lesion long wide n/a location grade visit date lesion long wide n/a location grade 01/001 VISIT01 10-Oct-18 1 5 2 . E4 2 VISIT02 17-Oct-18 1 5 2 . 2 01/001 VISIT01 10-Oct-18 2 4 2 0 E4 2 VISIT02 17-Oct-18 2 4 2 0 2 01/001 VISIT01 10-Oct-18 3 5 1 0 D4 2 VISIT02 17-Oct-18 3 5 1 0 2 01/001 VISIT01 10-Oct-18 4 6 1 0 D4 3 VISIT02 17-Oct-18 4 6 1 0 3 01/001 VISIT01 10-Oct-18 5 4 2 0 A4 2 VISIT02 17-Oct-18 5 4 2 0 2 01/001 VISIT01 10-Oct-18 6 . . 1 . VISIT02 17-Oct-18 6 . . 1 .
Thank you Reeza!
It can be simple to do if there are only two visits per subject/lesion:
proc sort data=have; by subject lesion date; run;
data want;
merge
have(
where=(visit="VISIT01")
rename=(date=date01))
have(
where=(visit="VISIT02")
rename=(date=date02 long=long2 wide=wide2 n_a=n_a2 grade=grade2)
drop=location);
by subject lesion;
drop visit;
run;
Thank you PGStats, I actually have up to 13 visits for each subject so I assume I could use your code below and just keep adding visits onto it?
Solution through proc transpose...
data have;
infile datalines truncover;
input
subject $ visit $ date date9. lesion long wide n_a location $2. grade;
format date date9.;
datalines;
01001 VISIT01 10-Oct-18 1 5 2 . E4 2
01001 VISIT01 10-Oct-18 2 4 2 0 E4 2
01001 VISIT01 10-Oct-18 3 5 1 0 D4 2
01001 VISIT01 10-Oct-18 4 6 1 0 D4 3
01001 VISIT01 10-Oct-18 5 4 2 0 A4 2
01001 VISIT01 10-Oct-18 6 . . 1 .
01001 VISIT02 17-Oct-18 1 5 2 . 2
01001 VISIT02 17-Oct-18 2 4 2 0 2
01001 VISIT02 17-Oct-18 3 5 1 0 2
01001 VISIT02 17-Oct-18 4 6 1 0 3
01001 VISIT02 17-Oct-18 5 4 2 0 2
01001 VISIT02 17-Oct-18 6 . . 1 .
;
run;
proc transpose data=have out=have1 prefix=var;
by subject visit notsorted;
id lesion ;
var date long wide n_a location grade;
run;
proc transpose data=have1 out=have2 delimiter=_ name=lesion;
by subject notsorted;
id _name_ visit;
var var1-var6;
run;
data want (rename=(lesion1=lesion));
set have2;
lesion1=compress(lesion,'','kd');
drop lesion;
run;
Thank you singhsahab!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.