I have data that is structured in a matrix format exactly like this (except with more rows and columns). I've copied the raw data as I am not even sure how to load this into SAS.
firm | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1001 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | 1003 | |
year | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | |
individual | event_year | |||||||||||||||||||
103307-86 | 2010 | 0.3 | 0.0 | 1.0 | 0.6 | 0.0 | 0.0 | 0.9 | 0.8 | 0.8 | 0.0 | 1.0 | 0.7 | 0.1 | 0.1 | 0.7 | 0.4 | 0.3 | 0.6 | 0.3 |
42185-26 | 2005 | 0.8 | 0.6 | 0.3 | 0.7 | 0.8 | 0.4 | 0.7 | 0.2 | 0.1 | 0.6 | 0.2 | 0.4 | 0.8 | 0.1 | 0.4 | 0.5 | 1.0 | 0.4 | 0.5 |
42185-26 | 2009 | 0.6 | 0.7 | 0.9 | 0.4 | 0.5 | 0.1 | 0.9 | 0.7 | 0.9 | 0.6 | 0.6 | 0.7 | 0.3 | 0.5 | 0.9 | 0.1 | 1.0 | 0.7 | 0.7 |
Across the columns are two headers, firm identifiers (e.g., 1001, 1003) and year identifiers for each firm (e.g., 2005 to 2014 for firm=1001 and 2010 to 2018 for firm=1003). Across the rows, I have identifiers for individuals and an event that occurs in a given year for that individual ("event_year"). What I'd like to do is transpose the data into the following format:
DATA want;
input individual & $char10. event_year firm year score;
length individual $10;
DATALINES;
103307-86 2010 1001 2005 0.3
103307-86 2010 1001 2006 0.0
103307-86 2010 1001 2007 1.0
103307-86 2010 1001 2008 0.6
103307-86 2010 1001 2009 0.0
103307-86 2010 1001 2010 0.0
103307-86 2010 1001 2011 0.9
103307-86 2010 1001 2012 0.8
103307-86 2010 1001 2013 0.8
103307-86 2010 1001 2014 0.0
103307-86 2010 1003 2010 1.0
103307-86 2010 1003 2011 0.7
103307-86 2010 1003 2012 0.1
103307-86 2010 1003 2013 0.1
103307-86 2010 1003 2014 0.7
103307-86 2010 1003 2015 0.4
103307-86 2010 1003 2016 0.3
103307-86 2010 1003 2017 0.6
103307-86 2010 1003 2018 0.3
42185-26 2005 1001 2005 0.8
42185-26 2005 1001 2006 0.6
42185-26 2005 1001 2007 0.3
42185-26 2005 1001 2008 0.7
42185-26 2005 1001 2009 0.8
42185-26 2005 1001 2010 0.4
42185-26 2005 1001 2011 0.7
42185-26 2005 1001 2012 0.2
42185-26 2005 1001 2013 0.1
42185-26 2005 1001 2014 0.6
42185-26 2005 1003 2010 0.2
42185-26 2005 1003 2011 0.4
42185-26 2005 1003 2012 0.8
42185-26 2005 1003 2013 0.1
42185-26 2005 1003 2014 0.4
42185-26 2005 1003 2015 0.5
42185-26 2005 1003 2016 1.0
42185-26 2005 1003 2017 0.4
42185-26 2005 1003 2018 0.5
42185-26 2009 1001 2005 0.6
42185-26 2009 1001 2006 0.7
42185-26 2009 1001 2007 0.9
42185-26 2009 1001 2008 0.4
42185-26 2009 1001 2009 0.5
42185-26 2009 1001 2010 0.1
42185-26 2009 1001 2011 0.9
42185-26 2009 1001 2012 0.7
42185-26 2009 1001 2013 0.9
42185-26 2009 1001 2014 0.6
42185-26 2009 1003 2010 0.6
42185-26 2009 1003 2011 0.7
42185-26 2009 1003 2012 0.3
42185-26 2009 1003 2013 0.5
42185-26 2009 1003 2014 0.9
42185-26 2009 1003 2015 0.1
42185-26 2009 1003 2016 1.0
42185-26 2009 1003 2017 0.7
42185-26 2009 1003 2018 0.7
;
RUN;
As you can see, I want to transpose the data into "vertical" format, where each row corresponds to a individual, event_year, firm, and year observation. I've created a variable called "score" that contains the value in the matrix. Thank you.
One simple way to solve the problem is as follows:
There could be more elegant methods but the approach used here is simple and basic - comprehensible with a basic knowledge of SAS.
/*Convert your source data into a tab separated file called your_datafile.
Give the full path and file name.
You can also use a csv file.
I have not used column headers. This keeps it simple.
Then import and create a dataset */
proc import datafile = your_datafile dbms=TAB out=A replace;
getnames=no;
run;
/*Separate the first two rows into a separate data a dataset*/
data b;
set a ;
if _n_<=2;
run;
/* Separate the next three rows into a separate dataset*/
data c;
set a;
if _n_ >2;
run;
/* transpose the first dataset from horizontal to vertical*/
proc transpose data=b out=bb(RENAME=(COL1=FIRM COL2=YEAR));
Var var2-Var21;
run;
/* transpose the second dataset from horizontal to vertical*/
proc transpose data=c out=cc(RENAME=(VAR1=INDIVIDUAL VAR2=EVENT_YEAR));
by Var1 Var2;
Var var3-Var21;
run;
/*sort them */
proc sort data=bb;
by _NAME_;
run;
proc sort data=cc;
by _NAME_;
run;
/* Get the data in the desired format into a dataset called desired
by merging bb and cc.
I am ordering the columns as shown in your question.
*/
data desired(drop=_NAME_);
retain INDIVIDUAl EVENT_YEAR FIRM YEAR SCORE);
merge cc bb;
by _NAME_;
run;
proc sort data=desired;
by INDIVIDUAL EVENT_YEAR;
run;
Everything in the code is simple and self explanatory.
One simple way to solve the problem is as follows:
There could be more elegant methods but the approach used here is simple and basic - comprehensible with a basic knowledge of SAS.
/*Convert your source data into a tab separated file called your_datafile.
Give the full path and file name.
You can also use a csv file.
I have not used column headers. This keeps it simple.
Then import and create a dataset */
proc import datafile = your_datafile dbms=TAB out=A replace;
getnames=no;
run;
/*Separate the first two rows into a separate data a dataset*/
data b;
set a ;
if _n_<=2;
run;
/* Separate the next three rows into a separate dataset*/
data c;
set a;
if _n_ >2;
run;
/* transpose the first dataset from horizontal to vertical*/
proc transpose data=b out=bb(RENAME=(COL1=FIRM COL2=YEAR));
Var var2-Var21;
run;
/* transpose the second dataset from horizontal to vertical*/
proc transpose data=c out=cc(RENAME=(VAR1=INDIVIDUAL VAR2=EVENT_YEAR));
by Var1 Var2;
Var var3-Var21;
run;
/*sort them */
proc sort data=bb;
by _NAME_;
run;
proc sort data=cc;
by _NAME_;
run;
/* Get the data in the desired format into a dataset called desired
by merging bb and cc.
I am ordering the columns as shown in your question.
*/
data desired(drop=_NAME_);
retain INDIVIDUAl EVENT_YEAR FIRM YEAR SCORE);
merge cc bb;
by _NAME_;
run;
proc sort data=desired;
by INDIVIDUAL EVENT_YEAR;
run;
Everything in the code is simple and self explanatory.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.