BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JadeTheFirst
Calcite | Level 5

SAS University Edition

So I'm trying to import some data into SAS from an excel file. (You can see the data in the xls file I've attached) SAS is unfortunately not using the years like 1975,1976 etc as column headers which is what I want - it is using B C D E like excel sheets have by default. 

I used this code to get the data into SAS:

PROC IMPORT DATAFILE='/folders/myfolders/Historic Mortality data/Historic Mortality data/deathsbysingleyearofage.xls'
	DBMS=XLS
	OUT=WORK.DeathsByAgeWomenUK;
	GETNAMES=no;
	SHEET="Table 2";
	RANGE="MyRange2";
	NAMEROW=4;
RUN;

As previously mentioned - my data is meant to have years such as "1975" and "1976" as the column headers but it isn't, it's using "Age" yes but then jumping straight to using B,C,D etc. from excel. 

I was thinking of doing this:

DATA DeathsByAgeWomenUKEDIT;
	SET WORK.DeathsByAgeWomenUK; 
	RENAME 	B="1974" C="1975" D="1976";
RUN;

But obviously for every single column, which would probably take too much time. 
I was thinking there's gotta be a way to do this easier - but I cannot find anything anywhere on how to do it. 

Any help with this would be hugely appreciated. Also feel free to ask any question if I haven't made this entirely clear 🙂

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

If (1)you can save the workbook as an Excel workbook (i.e., .xlsx) and (2) variable names like _1984 and _1985 etc. will suffice, then you could get what you want with something like:

PROC IMPORT DATAFILE='/folders/myfolders/Historic Mortality data/Historic Mortality data/deathsbysingleyearofage.xlsx'
    DBMS=XLSx replace
    OUT=WORK.DeathsByAgeWomenUK;
  RANGE="MyRange2";
RUN;

Of course, if you use "options validvarname=any;", the above import would provide the variables as 1985, 1985, etc. but then, like @Tom said, you would have to refer to them as named literals.

 

Art, CEO, AnalystFinder.com

 

View solution in original post

4 REPLIES 4
Tom
Super User Tom
Super User

The problem is that your file does not have valid variable names as its column headers.  You cannot name a variable 1974.  How would SAS (or anyone) interpret the code using such names?

average = mean(1974,1975);

You could try setting the options VALIDVARNAME to ANY.

But then you would need to use name literals in your code to refer to these strangely named columns.

average = mean("1974"n,'1975'n);

 

Tom
Super User Tom
Super User

Treat the years as data, not variable names.

Here is a method to get your table into a tall format where the value of YEAR is actual data instead of being trapped in metadata.

PROC IMPORT DATAFILE="&path/deathsbysingleyearofage.xls"
	DBMS=XLS
	OUT=test1;
	SHEET="Table 2";
	GETNAMES=no;
RUN;

proc transpose data=test1(obs=1) out=dates ;
  by age notsorted;
run;
proc transpose data=test1(firstobs=2) out=values ;
  by age notsorted;
run;

proc sql ;
  create table want as 
    select a.col1 as year
         , input(b.age,?3.) as age
         , b.age as age_label
         , b.col1 as deaths
    from dates a full join values b
    on a._name_ = b._name_ 
    order by 1,2
  ;
quit;

proc print data=want (obs=20); 
  where age in (0,1,104,105);
run;
 Obs    year    age    age_label       deaths

   1    1974      0    0                 5172
   2    1974      1    1                  383
 105    1974    104    104                 48
 106    1974    105    105+                33
 107    1975      0    0                 4798
 108    1975      1    1                  346
 211    1975    104    104                 40
 212    1975    105    105+                31
 213    1976      0    0                 4070
 214    1976      1    1                  298
 317    1976    104    104                 38
 318    1976    105    105+                49
 319    1977      0    0                 3933
 320    1977      1    1                  263
 423    1977    104    104                 37
 424    1977    105    105+                38
 425    1978      0    0                 3908
 426    1978      1    1                  287
 529    1978    104    104                 58
 530    1978    105    105+                47

You can reproduce your source table as a report.

proc report data=want ;
 where age in (0,1,104,105) and year in (1974:1980);
 columns age age_label deaths,year ;
 define age / group noprint;
 define age_label / group ;
 define year / across ;
run;

image.png

JadeTheFirst
Calcite | Level 5
I'm having an issue with the year column using your solution, they aren't the years for me anymore, they have changed to numbers such as "1488" onwards... any idea as to why? 🙂
art297
Opal | Level 21

If (1)you can save the workbook as an Excel workbook (i.e., .xlsx) and (2) variable names like _1984 and _1985 etc. will suffice, then you could get what you want with something like:

PROC IMPORT DATAFILE='/folders/myfolders/Historic Mortality data/Historic Mortality data/deathsbysingleyearofage.xlsx'
    DBMS=XLSx replace
    OUT=WORK.DeathsByAgeWomenUK;
  RANGE="MyRange2";
RUN;

Of course, if you use "options validvarname=any;", the above import would provide the variables as 1985, 1985, etc. but then, like @Tom said, you would have to refer to them as named literals.

 

Art, CEO, AnalystFinder.com

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2258 views
  • 1 like
  • 3 in conversation