Have: Population data with one row per county, where each year's population is stored as a separate variable. (I'm new and couldn't figure out how to put a table in, sorry.)
Want: A table with three columns, County, Year, and population.
I tried using proc transpose and got a table with the county listed multiple times and the populations for each year, but it doesn't list what year the population data belongs to. Is there a way to output the name of the variable an observation was transposed from?
Thank you!
proc sort data=pwid.popdata;
by county;
run;
proc transpose data=pwid.popdata name=county let out=work.test;
by county;
run;
proc print data=test; run;
You will always get a variable _name_ from proc transpose that contains the name of the transposed variable.
If you want to extract the year from the variable name, add a step:
data have;
input county $ pop2014 pop2015 pop2016 pop2017 pop2018;
datalines;
A 100 200 300 400 500
B 150 250 350 450 550
;
proc transpose data=have out=trans;
by county;
var pop:;
run;
data want;
set trans;
rename col1=population;
year = input(compress(_name_,'','kd'),best.);
drop _name_;
run;
proc print data=want noobs;
run;
Result:
county population year A 100 2014 A 200 2015 A 300 2016 A 400 2017 A 500 2018 B 150 2014 B 250 2015 B 350 2016 B 450 2017 B 550 2018
Please provide an example of your input dataset.
To give you a blueprint:
data have;
input county $ pop_2000 pop_2001 pop_2002;
datalines;
A 100 150 200
B 200 199 300
C 140 150 160
;
proc transpose data=have out=want;
by county;
var pop:;
run;
data have;
input county $ pop2014 pop2015 pop2016 pop2017 pop2018;
datalines;
A 100 200 300 400 500
B 150 250 350 450 550
etc
;My data is structured pretty much exactly how you wrote it. Running the transpose code didn't give me years/variable names though.
You will always get a variable _name_ from proc transpose that contains the name of the transposed variable.
If you want to extract the year from the variable name, add a step:
data have;
input county $ pop2014 pop2015 pop2016 pop2017 pop2018;
datalines;
A 100 200 300 400 500
B 150 250 350 450 550
;
proc transpose data=have out=trans;
by county;
var pop:;
run;
data want;
set trans;
rename col1=population;
year = input(compress(_name_,'','kd'),best.);
drop _name_;
run;
proc print data=want noobs;
run;
Result:
county population year A 100 2014 A 200 2015 A 300 2016 A 400 2017 A 500 2018 B 150 2014 B 250 2015 B 350 2016 B 450 2017 B 550 2018
It certainly works for me.
data have;
input county $ pop2014 pop2015 pop2016 pop2017 pop2018;
datalines;
A 100 200 300 400 500
B 150 250 350 450 550
;
proc transpose data=have out=want;
by county;
var pop:;
run;
proc print data=want;
run;
Result:
Obs county _NAME_ COL1 1 A pop2014 100 2 A pop2015 200 3 A pop2016 300 4 A pop2017 400 5 A pop2018 500 6 B pop2014 150 7 B pop2015 250 8 B pop2016 350 9 B pop2017 450 10 B pop2018 550
You might want to post process it to convert the _NAME_ variable into a YEAR variable.
year=input(substr(_name_,4),4.);
You could use a RENAME statement or the RENAME= dataset option to rename the COL1 variable into POPULATION or whatever name you want.
You're right, I missed that it didn't have the let statement that I had in my code. For some reason that was suppressing output of the variable names. Thank you!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.