I have a dataset with a list of observations of wart assessments that is rather long and I'd like to set it up for a reviewer to see each subject across a fat dataset. Do I need to do a transpose for each variable to make this work?
Turning this:
subject | visit | date | lesion | long | wide | n/a | location | grade |
01/001 | VISIT01 | 10-Oct-18 | 1 | 5 | 2 | . | E4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 2 | 4 | 2 | 0 | E4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 3 | 5 | 1 | 0 | D4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 4 | 6 | 1 | 0 | D4 | 3 |
01/001 | VISIT01 | 10-Oct-18 | 5 | 4 | 2 | 0 | A4 | 2 |
01/001 | VISIT01 | 10-Oct-18 | 6 | . | . | 1 | . | |
01/001 | VISIT02 | 17-Oct-18 | 1 | 5 | 2 | . | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 2 | 4 | 2 | 0 | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 3 | 5 | 1 | 0 | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 4 | 6 | 1 | 0 | 3 | |
01/001 | VISIT02 | 17-Oct-18 | 5 | 4 | 2 | 0 | 2 | |
01/001 | VISIT02 | 17-Oct-18 | 6 | . | . | 1 | . | |
into this:
subject | visit | date | lesion | long | wide | n/a | location | grade | visit | date | lesion | long | wide | n/a | location | grade |
01/001 | VISIT01 | 10-Oct-18 | 1 | 5 | 2 | . | E4 | 2 | VISIT02 | 17-Oct-18 | 1 | 5 | 2 | . | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 2 | 4 | 2 | 0 | E4 | 2 | VISIT02 | 17-Oct-18 | 2 | 4 | 2 | 0 | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 3 | 5 | 1 | 0 | D4 | 2 | VISIT02 | 17-Oct-18 | 3 | 5 | 1 | 0 | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 4 | 6 | 1 | 0 | D4 | 3 | VISIT02 | 17-Oct-18 | 4 | 6 | 1 | 0 | 3 | |
01/001 | VISIT01 | 10-Oct-18 | 5 | 4 | 2 | 0 | A4 | 2 | VISIT02 | 17-Oct-18 | 5 | 4 | 2 | 0 | 2 | |
01/001 | VISIT01 | 10-Oct-18 | 6 | . | . | 1 | . | VISIT02 | 17-Oct-18 | 6 | . | . | 1 | . |
Post your data as a complete data step we can work with rather than a cut-and-paste job.
@PSU_Sudzi wrote:
I have a dataset with a list of observations of wart assessments that is rather long and I'd like to set it up for a reviewer to see each subject across a fat dataset. Do I need to do a transpose for each variable to make this work?
Yes, you need to flip each variable on it's own and then merge the results, similar to this:
Example 5 here
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/
Or you can flip it to a long format and then a wide format:
https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd
Or you can use a custom macro written by users here:
@PSU_Sudzi wrote:
I have a dataset with a list of observations of wart assessments that is rather long and I'd like to set it up for a reviewer to see each subject across a fat dataset. Do I need to do a transpose for each variable to make this work?
Turning this:
subject visit date lesion long wide n/a location grade 01/001 VISIT01 10-Oct-18 1 5 2 . E4 2 01/001 VISIT01 10-Oct-18 2 4 2 0 E4 2 01/001 VISIT01 10-Oct-18 3 5 1 0 D4 2 01/001 VISIT01 10-Oct-18 4 6 1 0 D4 3 01/001 VISIT01 10-Oct-18 5 4 2 0 A4 2 01/001 VISIT01 10-Oct-18 6 . . 1 . 01/001 VISIT02 17-Oct-18 1 5 2 . 2 01/001 VISIT02 17-Oct-18 2 4 2 0 2 01/001 VISIT02 17-Oct-18 3 5 1 0 2 01/001 VISIT02 17-Oct-18 4 6 1 0 3 01/001 VISIT02 17-Oct-18 5 4 2 0 2 01/001 VISIT02 17-Oct-18 6 . . 1 .
into this:
subject visit date lesion long wide n/a location grade visit date lesion long wide n/a location grade 01/001 VISIT01 10-Oct-18 1 5 2 . E4 2 VISIT02 17-Oct-18 1 5 2 . 2 01/001 VISIT01 10-Oct-18 2 4 2 0 E4 2 VISIT02 17-Oct-18 2 4 2 0 2 01/001 VISIT01 10-Oct-18 3 5 1 0 D4 2 VISIT02 17-Oct-18 3 5 1 0 2 01/001 VISIT01 10-Oct-18 4 6 1 0 D4 3 VISIT02 17-Oct-18 4 6 1 0 3 01/001 VISIT01 10-Oct-18 5 4 2 0 A4 2 VISIT02 17-Oct-18 5 4 2 0 2 01/001 VISIT01 10-Oct-18 6 . . 1 . VISIT02 17-Oct-18 6 . . 1 .
Thank you Reeza!
It can be simple to do if there are only two visits per subject/lesion:
proc sort data=have; by subject lesion date; run;
data want;
merge
have(
where=(visit="VISIT01")
rename=(date=date01))
have(
where=(visit="VISIT02")
rename=(date=date02 long=long2 wide=wide2 n_a=n_a2 grade=grade2)
drop=location);
by subject lesion;
drop visit;
run;
Thank you PGStats, I actually have up to 13 visits for each subject so I assume I could use your code below and just keep adding visits onto it?
Solution through proc transpose...
data have;
infile datalines truncover;
input
subject $ visit $ date date9. lesion long wide n_a location $2. grade;
format date date9.;
datalines;
01001 VISIT01 10-Oct-18 1 5 2 . E4 2
01001 VISIT01 10-Oct-18 2 4 2 0 E4 2
01001 VISIT01 10-Oct-18 3 5 1 0 D4 2
01001 VISIT01 10-Oct-18 4 6 1 0 D4 3
01001 VISIT01 10-Oct-18 5 4 2 0 A4 2
01001 VISIT01 10-Oct-18 6 . . 1 .
01001 VISIT02 17-Oct-18 1 5 2 . 2
01001 VISIT02 17-Oct-18 2 4 2 0 2
01001 VISIT02 17-Oct-18 3 5 1 0 2
01001 VISIT02 17-Oct-18 4 6 1 0 3
01001 VISIT02 17-Oct-18 5 4 2 0 2
01001 VISIT02 17-Oct-18 6 . . 1 .
;
run;
proc transpose data=have out=have1 prefix=var;
by subject visit notsorted;
id lesion ;
var date long wide n_a location grade;
run;
proc transpose data=have1 out=have2 delimiter=_ name=lesion;
by subject notsorted;
id _name_ visit;
var var1-var6;
run;
data want (rename=(lesion1=lesion));
set have2;
lesion1=compress(lesion,'','kd');
drop lesion;
run;
Thank you singhsahab!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.