- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello Everyone,
I have a dataset in the long format that I would like to transpose to a wide dataset. I would also like
the months to remain in sequential order. The dataset is attached. The long data set is in this format:
FRUIT STATUS DATEMON DATEYR MONYR CASECOUNT
APPLE CLOSED 3 2020 Mar-20 1
APPLE OPEN 6 2020 Jun-20 1
GRAPES CLOSED 10 2020 Oct-20 1
GRAPES OPEN 5 2020 May-20 2
ORANGE CLOSED 8 2020 Aug-20 1
ORANGE OPEN 8 2020 Jul-20 2
PEAR CLOSED 9 2020 Sep-20 1
PEAR OPEN 11 2019 Nov-19 1
I would like the data set to look like below with the casecounts corresponding to the month they were opened or closed. For example, in March 2020, apple had one casecount therefore march 2020 would have 1 for apple with a closed status and so on.
FRUIT STATUS Nov-19 Mar-20 May-20 Jun-20 Jul-20 Aug-20 Sep-20 Oct-20
APPLE CLOSED 1
APPLE OPEN
GRAPES CLOSED
GRAPES OPEN
ORANGE CLOSED
ORANGE OPEN
PEAR CLOSED
PEAR OPEN
Any assistance would be great. Thank you.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
One trick for changing the order of variables in a dataset is to use some statement like RETAIN that sets the order of the variables without actually forcing the data step compiler to determine the other attributes of the variables.
data want;
retain id y1999 y2000 ;
set have;
run;
So in your case you can use an SQL query to get the list of names in the order you want.
proc sql noprint ;
select distinct translate(monyr,'_','-') into :varlist separated by ' '
from have
order by dateyr,datemon
;
Then you can transpose as normal
proc transpose data=have out=want(drop=_name_);
by fruit status ;
id monyr ;
var casecount;
run;
And run another step to change the order of the variables.
data want ;
retain fruit status &varlist;
set want;
run;
Results:
Obs fruit status Nov_19 Mar_20 May_20 Jun_20 Aug_20 Jul_20 Sep_20 Oct_20 1 APPLE CLOSED . 1 . . . . . . 2 APPLE OPEN . . . 1 . . . . 3 GRAPES CLOSED . . . . . . . 1 4 GRAPES OPEN . . 2 . . . . . 5 ORANGE CLOSED . . . . 1 . . . 6 ORANGE OPEN . . . . . 2 . . 7 PEAR CLOSED . . . . . . 1 . 8 PEAR OPEN 1 . . . . . . .
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
One trick for changing the order of variables in a dataset is to use some statement like RETAIN that sets the order of the variables without actually forcing the data step compiler to determine the other attributes of the variables.
data want;
retain id y1999 y2000 ;
set have;
run;
So in your case you can use an SQL query to get the list of names in the order you want.
proc sql noprint ;
select distinct translate(monyr,'_','-') into :varlist separated by ' '
from have
order by dateyr,datemon
;
Then you can transpose as normal
proc transpose data=have out=want(drop=_name_);
by fruit status ;
id monyr ;
var casecount;
run;
And run another step to change the order of the variables.
data want ;
retain fruit status &varlist;
set want;
run;
Results:
Obs fruit status Nov_19 Mar_20 May_20 Jun_20 Aug_20 Jul_20 Sep_20 Oct_20 1 APPLE CLOSED . 1 . . . . . . 2 APPLE OPEN . . . 1 . . . . 3 GRAPES CLOSED . . . . . . . 1 4 GRAPES OPEN . . 2 . . . . . 5 ORANGE CLOSED . . . . 1 . . . 6 ORANGE OPEN . . . . . 2 . . 7 PEAR CLOSED . . . . . . 1 . 8 PEAR OPEN 1 . . . . . . .
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The code works great! Also, I appreciate you providing a thorough explanation.
Thank you!