Hello,
My data is in format 1 and I'd like it to be in format 2. Any help you can provide on how to do it would be helpful:
Format 1:
Date | Jan-19 | Feb-19 | Mar-19 |
Firm1 | 10 | 20 | 30 |
Firm2 | 5 | 15 | 45 |
Firm3 | 8 | 20 | 55 |
Firm4 | 9 | 22 | |
Firm5 | 12 | 11 |
Format 2:
Jan-19 | Firm1 | 10 |
Jan-19 | Firm2 | 5 |
Jan-19 | Firm3 | 8 |
Jan-19 | Firm4 | 9 |
Jan-19 | Firm5 | 12 |
Feb-19 | Firm1 | 20 |
Feb-19 | Firm2 | 15 |
Feb-19 | Firm3 | 20 |
Feb-19 | Firm4 | 22 |
Feb-19 | Firm5 | 11 |
Mar-19 | Firm1 | 30 |
Mar-19 | Firm2 | 45 |
Mar-19 | Firm3 | 55 |
Mar-19 | Firm4 | |
Mar-19 | Firm5 |
I would still suggest to use transpose, and get the periods from the labels (as these contain the original column names from the Excel):
proc sort data=ret;
by name ticker;
run;
proc transpose
data=ret (where=(name ne ''))
out=trans (rename=(col1=return))
;
var _:;
by name ticker;
run;
data want;
retain period name ticker return;
set trans;
format period mmyys7.;
period = input(cats('01/',_label_),ddmmyy10.);
keep period name ticker return;
run;
proc sort data=want;
by descending period name ticker;
run;
proc print data=want (obs=10) noobs;
run;
Result:
period name ticker return 07/2019 361 Global Equity Absolute Return I AGRZX 0.81384 07/2019 361 Global Equity Absolute Return Inv AGRQX 0.81466 07/2019 361 Global Equity Absolute Return Y AGRWX 0.71138 07/2019 Calvert Absolute Return Bond A CUBAX 0.32992 07/2019 Calvert Absolute Return Bond C CUBCX 0.33157 07/2019 Calvert Absolute Return Bond I CUBIX 0.34951 07/2019 Causeway Global Absolute Return Instl CGAIX -2.01511 07/2019 Causeway Global Absolute Return Inv CGAVX -2.03822 07/2019 Eaton Vance Glbl Macr Absolute Return A EAGMX 1.44828 07/2019 Eaton Vance Glbl Macr Absolute Return C ECGMX 1.38488
data have;
infile cards expandtabs truncover;
input Date $ Jan_19 Feb_19 Mar_19;
cards;
Firm1 10 20 30
Firm2 5 15 45
Firm3 8 20 55
Firm4 9 22
Firm5 12 11
;
proc transpose data=have out=temp;
by date;
var Jan_19--Mar_19 ;
run;
proc sort data=temp out=want(rename=(_name_=month_year col1=value));
by _name_ date;
run;
Hi, I apologize but
This is the data I have: Already in a excel sheet ill be importing. This is a small subset of what i have.
Name | Ticker | 7/2019 | 6/2019 | 5/2019 | 4/2019 |
A | PSPTX | 1.41 | 7.47 | -6.68 | 4.26 |
B | PTOAX | 1.34 | 7.33 | -6.62 | 4.24 |
C | PSOCX | 1.39 | 7.23 | -6.72 | 4.16 |
D | EAGMX | 1.45 | 1.46 | 0.56 | 0.21 |
Id like it this way:
Name | Ticker | ||
7/2019 | A | PSPTX | 1.41 |
7/2019 | B | PTOAX | 1.34 |
7/2019 | C | PSOCX | 1.39 |
7/2019 | D | EAGMX | 1.45 |
6/2019 | A | PSPTX | 7.47 |
6/2019 | B | PTOAX | 7.33 |
6/2019 | C | PSOCX | 7.23 |
6/2019 | D | EAGMX | 1.46 |
5/2019 | A | PSPTX | -6.68 |
5/2019 | B | PTOAX | -6.62 |
5/2019 | C | PSOCX | -6.72 |
5/2019 | D | EAGMX | 0.56 |
4/2019 | A | PSPTX | 4.26 |
4/2019 | B | PTOAX | 4.24 |
4/2019 | C | PSOCX | 4.16 |
4/2019 | D | EAGMX | 0.21 |
The dates along the rows not have a variable name as I had shown in my original post, so i am not able to transpose them. Could you provide help for this situation?
Can you please post the SAS dataset that was import from excel. I would rather use the correct SAS naming convention for variable names.
Btw @pshah38 It is exactly the same code to use
I've attached the file. I tried the above code, but because the dates along the first row are being treated as variable name instead of observations. So i don't know how to transpose them
Thank you . Do you just want 4 i.e
7/2019 | 6/2019 | 5/2019 | 4/2019 |
to be considered. I see ton of missing values.
Okay, I am afraid you have lot of cleaning and validating work to do, which i will leave that up to you. But the process of transpose is simple and straight forward. Once the below is done, review the dataset with some cleaning and validation process for further analysis
Taking variable names as-is from metadata contents
proc contents data=tmp1.ret /*your imported dataset*/
run;
Then
data temp;
set tmp1.ret;
if not missing(name) and not missing(ticker);
array t __2019--__20021;
do over t;
grp=_I_;
month_year=vname(t);
value=t;
output;
end;
keep name ticker grp month_year value;
run;
proc sort data=temp out=want(drop=grp);
by grp month_year value;
run;
I would still suggest to use transpose, and get the periods from the labels (as these contain the original column names from the Excel):
proc sort data=ret;
by name ticker;
run;
proc transpose
data=ret (where=(name ne ''))
out=trans (rename=(col1=return))
;
var _:;
by name ticker;
run;
data want;
retain period name ticker return;
set trans;
format period mmyys7.;
period = input(cats('01/',_label_),ddmmyy10.);
keep period name ticker return;
run;
proc sort data=want;
by descending period name ticker;
run;
proc print data=want (obs=10) noobs;
run;
Result:
period name ticker return 07/2019 361 Global Equity Absolute Return I AGRZX 0.81384 07/2019 361 Global Equity Absolute Return Inv AGRQX 0.81466 07/2019 361 Global Equity Absolute Return Y AGRWX 0.71138 07/2019 Calvert Absolute Return Bond A CUBAX 0.32992 07/2019 Calvert Absolute Return Bond C CUBCX 0.33157 07/2019 Calvert Absolute Return Bond I CUBIX 0.34951 07/2019 Causeway Global Absolute Return Instl CGAIX -2.01511 07/2019 Causeway Global Absolute Return Inv CGAVX -2.03822 07/2019 Eaton Vance Glbl Macr Absolute Return A EAGMX 1.44828 07/2019 Eaton Vance Glbl Macr Absolute Return C ECGMX 1.38488
data temp;
set have;
array t Jan_19 Feb_19 Mar_19;
do over t;
grp=_I_;
month_year=vname(t);
value=t;
output;
end;
run;
proc sort data=temp out=want(keep=date month_year value);
by grp month_year date value;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.