Have: Population data with one row per county, where each year's population is stored as a separate variable. (I'm new and couldn't figure out how to put a table in, sorry.)
Want: A table with three columns, County, Year, and population.
I tried using proc transpose and got a table with the county listed multiple times and the populations for each year, but it doesn't list what year the population data belongs to. Is there a way to output the name of the variable an observation was transposed from?
Thank you!
proc sort data=pwid.popdata;
by county;
run;
proc transpose data=pwid.popdata name=county let out=work.test;
by county;
run;
proc print data=test; run;
You will always get a variable _name_ from proc transpose that contains the name of the transposed variable.
If you want to extract the year from the variable name, add a step:
data have;
input county $ pop2014 pop2015 pop2016 pop2017 pop2018;
datalines;
A 100 200 300 400 500
B 150 250 350 450 550
;
proc transpose data=have out=trans;
by county;
var pop:;
run;
data want;
set trans;
rename col1=population;
year = input(compress(_name_,'','kd'),best.);
drop _name_;
run;
proc print data=want noobs;
run;
Result:
county population year A 100 2014 A 200 2015 A 300 2016 A 400 2017 A 500 2018 B 150 2014 B 250 2015 B 350 2016 B 450 2017 B 550 2018
Please provide an example of your input dataset.
To give you a blueprint:
data have;
input county $ pop_2000 pop_2001 pop_2002;
datalines;
A 100 150 200
B 200 199 300
C 140 150 160
;
proc transpose data=have out=want;
by county;
var pop:;
run;
data have;
input county $ pop2014 pop2015 pop2016 pop2017 pop2018;
datalines;
A 100 200 300 400 500
B 150 250 350 450 550
etc
;
My data is structured pretty much exactly how you wrote it. Running the transpose code didn't give me years/variable names though.
You will always get a variable _name_ from proc transpose that contains the name of the transposed variable.
If you want to extract the year from the variable name, add a step:
data have;
input county $ pop2014 pop2015 pop2016 pop2017 pop2018;
datalines;
A 100 200 300 400 500
B 150 250 350 450 550
;
proc transpose data=have out=trans;
by county;
var pop:;
run;
data want;
set trans;
rename col1=population;
year = input(compress(_name_,'','kd'),best.);
drop _name_;
run;
proc print data=want noobs;
run;
Result:
county population year A 100 2014 A 200 2015 A 300 2016 A 400 2017 A 500 2018 B 150 2014 B 250 2015 B 350 2016 B 450 2017 B 550 2018
It certainly works for me.
data have;
input county $ pop2014 pop2015 pop2016 pop2017 pop2018;
datalines;
A 100 200 300 400 500
B 150 250 350 450 550
;
proc transpose data=have out=want;
by county;
var pop:;
run;
proc print data=want;
run;
Result:
Obs county _NAME_ COL1 1 A pop2014 100 2 A pop2015 200 3 A pop2016 300 4 A pop2017 400 5 A pop2018 500 6 B pop2014 150 7 B pop2015 250 8 B pop2016 350 9 B pop2017 450 10 B pop2018 550
You might want to post process it to convert the _NAME_ variable into a YEAR variable.
year=input(substr(_name_,4),4.);
You could use a RENAME statement or the RENAME= dataset option to rename the COL1 variable into POPULATION or whatever name you want.
You're right, I missed that it didn't have the let statement that I had in my code. For some reason that was suppressing output of the variable names. Thank you!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.