SAS University Edition
So I'm trying to import some data into SAS from an excel file. (You can see the data in the xls file I've attached) SAS is unfortunately not using the years like 1975,1976 etc as column headers which is what I want - it is using B C D E like excel sheets have by default.
I used this code to get the data into SAS:
PROC IMPORT DATAFILE='/folders/myfolders/Historic Mortality data/Historic Mortality data/deathsbysingleyearofage.xls'
DBMS=XLS
OUT=WORK.DeathsByAgeWomenUK;
GETNAMES=no;
SHEET="Table 2";
RANGE="MyRange2";
NAMEROW=4;
RUN;
As previously mentioned - my data is meant to have years such as "1975" and "1976" as the column headers but it isn't, it's using "Age" yes but then jumping straight to using B,C,D etc. from excel.
I was thinking of doing this:
DATA DeathsByAgeWomenUKEDIT;
SET WORK.DeathsByAgeWomenUK;
RENAME B="1974" C="1975" D="1976";
RUN;
But obviously for every single column, which would probably take too much time.
I was thinking there's gotta be a way to do this easier - but I cannot find anything anywhere on how to do it.
Any help with this would be hugely appreciated. Also feel free to ask any question if I haven't made this entirely clear 🙂
If (1)you can save the workbook as an Excel workbook (i.e., .xlsx) and (2) variable names like _1984 and _1985 etc. will suffice, then you could get what you want with something like:
PROC IMPORT DATAFILE='/folders/myfolders/Historic Mortality data/Historic Mortality data/deathsbysingleyearofage.xlsx'
DBMS=XLSx replace
OUT=WORK.DeathsByAgeWomenUK;
RANGE="MyRange2";
RUN;
Of course, if you use "options validvarname=any;", the above import would provide the variables as 1985, 1985, etc. but then, like @Tom said, you would have to refer to them as named literals.
Art, CEO, AnalystFinder.com
The problem is that your file does not have valid variable names as its column headers. You cannot name a variable 1974. How would SAS (or anyone) interpret the code using such names?
average = mean(1974,1975);
You could try setting the options VALIDVARNAME to ANY.
But then you would need to use name literals in your code to refer to these strangely named columns.
average = mean("1974"n,'1975'n);
Treat the years as data, not variable names.
Here is a method to get your table into a tall format where the value of YEAR is actual data instead of being trapped in metadata.
PROC IMPORT DATAFILE="&path/deathsbysingleyearofage.xls"
DBMS=XLS
OUT=test1;
SHEET="Table 2";
GETNAMES=no;
RUN;
proc transpose data=test1(obs=1) out=dates ;
by age notsorted;
run;
proc transpose data=test1(firstobs=2) out=values ;
by age notsorted;
run;
proc sql ;
create table want as
select a.col1 as year
, input(b.age,?3.) as age
, b.age as age_label
, b.col1 as deaths
from dates a full join values b
on a._name_ = b._name_
order by 1,2
;
quit;
proc print data=want (obs=20);
where age in (0,1,104,105);
run;
Obs year age age_label deaths 1 1974 0 0 5172 2 1974 1 1 383 105 1974 104 104 48 106 1974 105 105+ 33 107 1975 0 0 4798 108 1975 1 1 346 211 1975 104 104 40 212 1975 105 105+ 31 213 1976 0 0 4070 214 1976 1 1 298 317 1976 104 104 38 318 1976 105 105+ 49 319 1977 0 0 3933 320 1977 1 1 263 423 1977 104 104 37 424 1977 105 105+ 38 425 1978 0 0 3908 426 1978 1 1 287 529 1978 104 104 58 530 1978 105 105+ 47
You can reproduce your source table as a report.
proc report data=want ;
where age in (0,1,104,105) and year in (1974:1980);
columns age age_label deaths,year ;
define age / group noprint;
define age_label / group ;
define year / across ;
run;
If (1)you can save the workbook as an Excel workbook (i.e., .xlsx) and (2) variable names like _1984 and _1985 etc. will suffice, then you could get what you want with something like:
PROC IMPORT DATAFILE='/folders/myfolders/Historic Mortality data/Historic Mortality data/deathsbysingleyearofage.xlsx'
DBMS=XLSx replace
OUT=WORK.DeathsByAgeWomenUK;
RANGE="MyRange2";
RUN;
Of course, if you use "options validvarname=any;", the above import would provide the variables as 1985, 1985, etc. but then, like @Tom said, you would have to refer to them as named literals.
Art, CEO, AnalystFinder.com
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.