Hi all,
I had a question about transposing a dataset to wide format when the dataset has missing data. Here is the dataset I created and the code I was using to try to transpose the data:
DATA WORK.data1;
INFILE DATALINES DELIMITER=',';
INPUT Type $ Year HtFt;
DATALINES;
;
PROC TRANSPOSE
DATA = WORK.data1
OUT = WORK.data2 (DROP = _NAME_
RENAME = (COL1 = Yr2009
COL2 = Yr2010
COL3 = Yr2011
COL4 = Yr2012) );
VAR HtFt;
BY Type;
RUN;
Here was the output:
The values for Maple should start at Yr2010 and the missing value should be for Yr2009 and I'm curious as to how I could remedy this?
Thank you so much,
Kaitlin
DATA WORK.Trees;
INFILE DATALINES DELIMITER=',';
INPUT Type $ Year HtFt;
DATALINES;
Aspen,2009,15
Aspen,2010,16
Maple,2010,6
Maple,2011,8
Maple,2012,10
Spruce,2009,22
Spruce,2010,23
Spruce,2011,24
Spruce,2012,25
;
proc transpose data=trees out=want prefix=YR_;
by type;
id year;
var htft;
run;
proc print noobs;run;
Type | _NAME_ | YR_2009 | YR_2010 | YR_2011 | YR_2012 |
---|---|---|---|---|---|
Aspen | HtFt | 15 | 16 | . | . |
Maple | HtFt | . | 6 | 8 | 10 |
Spruce | HtFt | 22 | 23 | 24 | 25 |
Use a PREFIX= option and an ID statement:
proc transpose ...... prefix=yr;
by type;
var htft;
id year;
run;
and no RENAME= dataset option.
Try this
proc transpose data=trees out=want(drop=_:) prefix=Yr;
by Type;
id Year;
var HtFt;
run;
Result
Type Yr2009 Yr2010 Yr2011 Yr2012 Aspen 15 16 . . Maple . 6 8 10 Spruce 22 23 24 25
Hi there,
The good news is you're close. Two things should help. One, the ID statement will help with properly itemizing the observations, so they don't all end up under the first variable. Two you may want to skip the RENAME statement, given that part of your variable name is already there, and you only need to add "Yr" to it, thus (based on your code):
PROC TRANSPOSE
DATA = WORK.Trees
OUT = WORK.TreesWide (DROP = _NAME_) PREFIX = Yr;
BY Type;
VAR HtFt;
ID Year;
RUN;
The output should look like this:
Obs Type Yr2009 Yr2010 Yr2011 Yr2012
1 Aspen 15 16 . .
2 Maple . 6 8 10
3 Spruce 22 23 24 25
Hi Kaitlin-
I found 2 ways that seemed to work.
Building the data set:
DATA lib.dataset;
INFILE DATALINES DELIMITER=' ';
INPUT Type $
Year
HtFt;
DATALINES;
Aspen 2009 15
Aspen 2010 16
Maple 2010 6
Maple 2011 8
Maple 2012 10
Spruce 2009 22
Spruce 2010 23
Spruce 2011 24
Spruce 2012 25
;
PROC PRINT DATA = lib.dataset;
RUN;
First way to transpose (With Proc Transpose):
PROC TRANSPOSE
DATA = lib.dataset
OUT = lib.datasetwide( DROP = _NAME_ )
PREFIX = Yr;
VAR HtFt;
BY Type;
ID Year;
RUN;
PROC PRINT DATA = lib.datasetwide;
RUN;
Output1:
Aspen | 15 | 16 | . | . |
Maple | . | 6 | 8 | 10 |
Spruce | 22 | 23 | 24 | 25 |
Second way to transpose (With Data Step):
DATA lib.datasetwide2;
SET lib.dataset;
BY Type;
ARRAY Years {2009:2012}Yr2009 - Yr2012 (. . . . );
IF First.type = 1 THEN CALL MISSING(OF Years{*});
Years{Year} = HtFt;
IF Last.Type = 1;
DROP Year HtFt;
RUN;
PROC PRINT Data= lib.datasetwide2;
RUN;
Output 2:
Aspen | 15 | 16 | . | . |
Maple | . | 6 | 8 | 10 |
Spruce | 22 | 23 | 24 | 25 |
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.