BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
elbarto
Obsidian | Level 7

I have data that is structured in a matrix format exactly like this (except with more rows and columns). I've copied the raw data as I am not even sure how to load this into SAS.

 

firm   1001 1001 1001 1001 1001 1001 1001 1001 1001 1001 1003 1003 1003 1003 1003 1003 1003 1003 1003
year   2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2010 2011 2012 2013 2014 2015 2016 2017 2018
individual event_year                                      
103307-86 2010 0.3 0.0 1.0 0.6 0.0 0.0 0.9 0.8 0.8 0.0 1.0 0.7 0.1 0.1 0.7 0.4 0.3 0.6 0.3
42185-26 2005 0.8 0.6 0.3 0.7 0.8 0.4 0.7 0.2 0.1 0.6 0.2 0.4 0.8 0.1 0.4 0.5 1.0 0.4 0.5
42185-26 2009 0.6 0.7 0.9 0.4 0.5 0.1 0.9 0.7 0.9 0.6 0.6 0.7 0.3 0.5 0.9 0.1 1.0 0.7 0.7

 

Across the columns are two headers, firm identifiers (e.g., 1001, 1003) and year identifiers for each firm (e.g., 2005 to 2014 for firm=1001 and 2010 to 2018 for firm=1003). Across the rows, I have identifiers for individuals and an event that occurs in a given year for that individual ("event_year"). What I'd like to do is transpose the data into the following format:

DATA want;
  input individual & $char10. event_year firm year score;
  length individual $10;
DATALINES;
103307-86	2010	1001	2005	0.3
103307-86	2010	1001	2006	0.0
103307-86	2010	1001	2007	1.0
103307-86	2010	1001	2008	0.6
103307-86	2010	1001	2009	0.0
103307-86	2010	1001	2010	0.0
103307-86	2010	1001	2011	0.9
103307-86	2010	1001	2012	0.8
103307-86	2010	1001	2013	0.8
103307-86	2010	1001	2014	0.0
103307-86	2010	1003	2010	1.0
103307-86	2010	1003	2011	0.7
103307-86	2010	1003	2012	0.1
103307-86	2010	1003	2013	0.1
103307-86	2010	1003	2014	0.7
103307-86	2010	1003	2015	0.4
103307-86	2010	1003	2016	0.3
103307-86	2010	1003	2017	0.6
103307-86	2010	1003	2018	0.3
42185-26	2005	1001	2005	0.8
42185-26	2005	1001	2006	0.6
42185-26	2005	1001	2007	0.3
42185-26	2005	1001	2008	0.7
42185-26	2005	1001	2009	0.8
42185-26	2005	1001	2010	0.4
42185-26	2005	1001	2011	0.7
42185-26	2005	1001	2012	0.2
42185-26	2005	1001	2013	0.1
42185-26	2005	1001	2014	0.6
42185-26	2005	1003	2010	0.2
42185-26	2005	1003	2011	0.4
42185-26	2005	1003	2012	0.8
42185-26	2005	1003	2013	0.1
42185-26	2005	1003	2014	0.4
42185-26	2005	1003	2015	0.5
42185-26	2005	1003	2016	1.0
42185-26	2005	1003	2017	0.4
42185-26	2005	1003	2018	0.5
42185-26	2009	1001	2005	0.6
42185-26	2009	1001	2006	0.7
42185-26	2009	1001	2007	0.9
42185-26	2009	1001	2008	0.4
42185-26	2009	1001	2009	0.5
42185-26	2009	1001	2010	0.1
42185-26	2009	1001	2011	0.9
42185-26	2009	1001	2012	0.7
42185-26	2009	1001	2013	0.9
42185-26	2009	1001	2014	0.6
42185-26	2009	1003	2010	0.6
42185-26	2009	1003	2011	0.7
42185-26	2009	1003	2012	0.3
42185-26	2009	1003	2013	0.5
42185-26	2009	1003	2014	0.9
42185-26	2009	1003	2015	0.1
42185-26	2009	1003	2016	1.0
42185-26	2009	1003	2017	0.7
42185-26	2009	1003	2018	0.7
;
RUN;

As you can see, I want to transpose the data into "vertical" format, where each row corresponds to a individual, event_year, firm, and year observation. I've created a variable called "score" that contains the value in the matrix. Thank you.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Sajid01
Meteorite | Level 14

One simple way to solve the problem is as follows:
There could be more elegant methods but the approach used here  is simple and basic - comprehensible with a basic knowledge of SAS.

/*Convert your source data into a tab separated file called your_datafile.
Give the full path and file name.
You can also use a csv file.
I have not used column headers. This keeps it simple.
Then import and create a dataset */
proc import datafile = your_datafile dbms=TAB out=A replace;
getnames=no;
run;
/*Separate the first two rows into a separate data a dataset*/
data b;
set a ;
if _n_<=2;
run;
/* Separate the next three rows into a separate dataset*/
data c;
set a;
if _n_ >2;
run;
/* transpose the first dataset from horizontal to vertical*/
proc transpose data=b out=bb(RENAME=(COL1=FIRM COL2=YEAR));
Var var2-Var21;
run;
/* transpose the second dataset from horizontal to vertical*/
proc transpose data=c out=cc(RENAME=(VAR1=INDIVIDUAL VAR2=EVENT_YEAR));
by Var1 Var2;
Var var3-Var21;
run;
/*sort them */
proc sort data=bb;
by _NAME_;
run;
proc sort data=cc;
by _NAME_;
run;
/* Get the data in the desired format into a dataset  called desired
by merging bb and cc. I am ordering the columns as shown in your question. */ data desired(drop=_NAME_); retain INDIVIDUAl EVENT_YEAR FIRM YEAR SCORE); merge cc bb; by _NAME_; run; proc sort data=desired; by INDIVIDUAL EVENT_YEAR; run;

Everything in the code is simple and self explanatory.

View solution in original post

1 REPLY 1
Sajid01
Meteorite | Level 14

One simple way to solve the problem is as follows:
There could be more elegant methods but the approach used here  is simple and basic - comprehensible with a basic knowledge of SAS.

/*Convert your source data into a tab separated file called your_datafile.
Give the full path and file name.
You can also use a csv file.
I have not used column headers. This keeps it simple.
Then import and create a dataset */
proc import datafile = your_datafile dbms=TAB out=A replace;
getnames=no;
run;
/*Separate the first two rows into a separate data a dataset*/
data b;
set a ;
if _n_<=2;
run;
/* Separate the next three rows into a separate dataset*/
data c;
set a;
if _n_ >2;
run;
/* transpose the first dataset from horizontal to vertical*/
proc transpose data=b out=bb(RENAME=(COL1=FIRM COL2=YEAR));
Var var2-Var21;
run;
/* transpose the second dataset from horizontal to vertical*/
proc transpose data=c out=cc(RENAME=(VAR1=INDIVIDUAL VAR2=EVENT_YEAR));
by Var1 Var2;
Var var3-Var21;
run;
/*sort them */
proc sort data=bb;
by _NAME_;
run;
proc sort data=cc;
by _NAME_;
run;
/* Get the data in the desired format into a dataset  called desired
by merging bb and cc. I am ordering the columns as shown in your question. */ data desired(drop=_NAME_); retain INDIVIDUAl EVENT_YEAR FIRM YEAR SCORE); merge cc bb; by _NAME_; run; proc sort data=desired; by INDIVIDUAL EVENT_YEAR; run;

Everything in the code is simple and self explanatory.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 516 views
  • 0 likes
  • 2 in conversation