I have the data below I'm inputting into SAS from Excel, in spoiler below:
2019 | 2018 | 2017 | 2016 | 2015 | 2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | 2007 | 2006 | 2005 | 2004 | 2003 | 2002 | 2001 | 2000 | 1999 | 1998 | 1997 | 1996 | |
AL | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
AK | 3 | 3 | 3 | 2 | 1 | 1 | 1 | 4 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | . | . | . | . | . | . | . |
AZ | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | . | . | . | . | . | . |
AR | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
CA | 4 | 4 | 4 | 4 | 4 | 6 | 6 | 7 | 7 | 7 | 6 | 5 | 5 | 5 | 6 | 6 | 9 | 6 | 5 | 5 | 5 | 5 | 5 | 5 |
CO | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 3 |
CT | 6 | 6 | 5 | 4 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
DE | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
FL | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
GA | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
HI | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
ID | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
IL | 10 | 10 | 10 | 9 | 7 | 7 | 7 | 6 | 5 | 5 | 5 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
IN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
IA | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
KS | 4 | 4 | 4 | 4 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
KY | 6 | 6 | 5 | 5 | 5 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 3 |
LA | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 5 | 6 | 6 | 6 | 5 | 5 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |
ME | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
MD | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
MA | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
MI | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 3 | 3 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
MN | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
MS | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
MO | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
MT | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
NE | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
NV | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
NH | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
NJ | 7 | 7 | 7 | 6 | 6 | 5 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 |
NM | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
NY | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
NC | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
ND | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
OH | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
OK | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
OR | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
PA | 5 | 5 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
RI | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
SC | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
SD | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | . | . | . | . | . | . | . | . | . | . |
TN | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
TX | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
UT | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
VT | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
VA | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
WA | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
WV | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
WI | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 | 4 | 4 | 3 | 3 | 3 | 3 | 3 | 3 |
WY | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
I've tried running the following code:
proc import datafile='C:\Users\jjsin\Desktop\PhD\STCR.xlsx' dbms=xlsx out=have replace;
getnames=no;
run;
proc transpose data=have(obs=2 drop=a) out=apple(drop=_:);
var _all_;
run;
data apple;
set apple;
length year $ 80;
retain year;
if not missing(col1) then year=col1;
drop col1;
run;
data have;
set have;
length state $ 2;
retain state;
if not missing(col1) then pai=col1;
drop col1;
run;
proc transpose data=have(obs=1) out=banana;
var _all_;
run;
data label;
merge banana(firstobs=2) apple(in=inapple);
if inapple;
run;
proc sql noprint;
select _name_ into : names separated by ' ' from label;
select cats(_name_,'="', state,'_',year,'"') into : label separated by ' ' from label;
quit;
proc datasets library=work nodetails nolist;
modify have;
label &label ;
quit;
proc transpose data=have out=temp1;
by a notsorted;
var &names;
run;
data temp1;
set temp1;
id=scan(_label_,1,'_');
year=scan(_label_,-1,'_');
run;
data temp2;
set temp1(firstobs=25);
length a $ 2;
retain a;
rename a=state;
run;
proc transpose data=temp2 out=want(drop=_name_);
by state year notsorted;
id id;
var col1;
run;
proc sort data=want; by year state; run;
I get a result with 1200 rows and 26 columns: the state abbreviations are in the 1st column--where they should be, but with many of them having only one letter. The years are all correctly showing in the 2nd column. Then the data values--instead of all being in a 3rd column, are spread out between the 3rd and 26th columns--and some of them are definitely out of order. I can't even verify them for sure, because I can't verify the state abbreviations. The total number of rows is correct at 1200--50 states * 24 years. But I want a result like this (minus the row numbers column):
You have an apparent State column without a name in the "spoiler" .
I would be very tempted in the first transpose (and likely late places) BY STATE (or whatever variable name you get).
Personally, I save the XLSX to CSV and write a data step to read the data as I want.
Something similar to:
data example; input state $ @; do year=2019 to 2000 by -1; input value @; output; end; input; datalines; AL 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 AK 3 3 3 2 1 1 1 4 2 2 2 2 3 3 3 3 3 . . . . . . . AZ 3 3 3 3 3 4 4 4 4 4 4 3 3 3 3 3 4 4 . . . . . . ;
Your code would have an infile statement pointing to the created CSV with dlm=',' . The above only reads 2019 to 2000 because the "spoiler" cuts off and I can't tell how many more columns might be. But this should get the point and it should be easy to modify unless at some point your columns are no longer in descending order.
The @ in the input statement means "hold the line for reading".
This long form is likely the most flexible for most uses.
Import has lots of quirks, one of which is you may not get the variable type you want.
I really don't see the need for all the transposing back and forth as done but that likely just comes from too many years of experience and getting ahead of the curve by reading the data as I want it..
@Reeza Thanks for pointing that out. I just now noticed when I happened to be looking at the question again that the small table at the end had somehow been deleted when I originally posted the question. I've added it back in now. It's the same table I had posted in a separate spoiler a long time ago. But it makes the context of the original post make a lot more sense when added back in.
You have an apparent State column without a name in the "spoiler" .
I would be very tempted in the first transpose (and likely late places) BY STATE (or whatever variable name you get).
Personally, I save the XLSX to CSV and write a data step to read the data as I want.
Something similar to:
data example; input state $ @; do year=2019 to 2000 by -1; input value @; output; end; input; datalines; AL 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 AK 3 3 3 2 1 1 1 4 2 2 2 2 3 3 3 3 3 . . . . . . . AZ 3 3 3 3 3 4 4 4 4 4 4 3 3 3 3 3 4 4 . . . . . . ;
Your code would have an infile statement pointing to the created CSV with dlm=',' . The above only reads 2019 to 2000 because the "spoiler" cuts off and I can't tell how many more columns might be. But this should get the point and it should be easy to modify unless at some point your columns are no longer in descending order.
The @ in the input statement means "hold the line for reading".
This long form is likely the most flexible for most uses.
Import has lots of quirks, one of which is you may not get the variable type you want.
I really don't see the need for all the transposing back and forth as done but that likely just comes from too many years of experience and getting ahead of the curve by reading the data as I want it..
1 | AL | 1996 | 3 |
2 | AK | 1996 | . |
3 | AZ | 1996 | . |
4 | AR | 1996 | 3 |
5 | CA | 1996 | 5 |
6 | CO | 1996 | 3 |
7 | CT | 1996 | 3 |
… | … | … | … |
1196 | VA | 2019 | 1 |
1197 | WA | 2019 | 2 |
1198 | WV | 2019 | 4 |
1199 | WI | 2019 | 3 |
1200 | WY | 2019 | 2 |
I'm not familiar with Spoilers but I just posted what I'm looking for as a result (minus the row numbers column) in the Spoiler just above. I hope that helps.
My final code:
data alignment;
input row @;
input state $ @;
do year=2019 to 1996 by -1;
input pai @;
output;
end;
input;
datalines;
1 AL 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
2 AK 3 3 3 2 1 1 1 4 2 2 2 2 3 3 3 3 3 . . . . . . .
3 AZ 3 3 3 3 3 4 4 4 4 4 4 3 3 3 3 3 4 4 . . . . . .
4 AR 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
5 CA 4 4 4 4 4 6 6 7 7 7 6 5 5 5 6 6 9 6 5 5 5 5 5 5
6 CO 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 3 3 3 3 3 3
7 CT 6 6 5 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
8 DE 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
9 FL 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2
10 GA 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
11 HI 2 2 2 2 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4
12 ID 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
13 IL 10 10 10 9 7 7 7 6 5 5 5 3 3 3 3 3 3 3 3 3 3 3 3 3
14 IN 1 1 1 1 1 1 1 1 1 1 1 1 2 2 3 3 2 2 2 2 2 2 2 2
15 IA 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
16 KS 4 4 4 4 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
17 KY 6 6 5 5 5 4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3
18 LA 4 4 4 3 3 3 3 3 3 4 4 5 6 6 6 5 5 6 6 6 6 6 6 6
19 ME 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 3 2 2 2 2 2 2 2 2
20 MD 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
21 MA 3 3 3 2 2 2 2 2 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4
22 MI 3 3 4 4 4 4 4 4 4 4 4 4 4 3 3 2 2 1 1 1 1 1 1 1
23 MN 1 1 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
24 MS 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
25 MO 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
26 MT 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4
27 NE 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
28 NV 2 3 3 3 3 3 3 3 3 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
29 NH 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2
30 NJ 7 7 7 6 6 5 4 4 4 3 3 3 3 3 3 4 3 3 2 2 2 2 2 2
31 NM 3 3 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
32 NY 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
33 NC 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
34 ND 2 2 2 2 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 4 4 4 4 4
35 OH 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
36 OK 3 3 3 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3
37 OR 2 2 2 2 2 2 2 2 2 3 3 3 3 4 4 4 4 3 3 3 3 3 3 3
38 PA 5 5 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
39 RI 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4
40 SC 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1
41 SD 1 1 1 1 1 2 2 2 2 3 3 3 3 3 . . . . . . . . . .
42 TN 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
43 TX 1 1 1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3
44 UT 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
45 VT 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
46 VA 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
47 WA 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 2 2 2 2 2 2 2 2
48 WV 4 4 4 4 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4
49 WI 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 3 3 3 3 3 3
50 WY 2 2 2 1 1 1 1 1 1 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3
;
proc sort data=alignment out=alignment;
by year row;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.