DATA Step, Macro, Functions and more

Import a xlsx file with years in the 8th row

Reply
Occasional Contributor
Posts: 13

Import a xlsx file with years in the 8th row

Hi there,

 

I need to import a xlsx data into SAS. The file has data for different industries and different years, and useful data begins in the 8th row. The below is an example. 

 

Line  19481949195019511952
1      Compensation of employeesA033RC0144,470144,512158,465185,927201,341
2Domestic industriesA4002C0144,384144,454158,387185,913201,339
3  Private industriesA4003C0121,529119,169131,318150,726161,201
4    Agriculture, forestry, and fishingA4004BC03,3183,1423,1833,3393,313
5      FarmsB4005BC03,0002,8172,8212,9322,868
6      Agricultural services, forestry, and fishingJ4006C0319326363408446
7    MiningA4007BC03,5963,1813,5023,9874,056

 

When I try to import it with code:

 

proc import out = mydataset
datafile = "&path\Section6All_Hist"
dbms = xlsx replace;
range = 'C8:X92';
getnames=yes;
run;

 

SAS reports:

 

NOTE: Variable Name Change. 1948 -> _1948

NOTE: Variable Name Change. 1949 -> _1949
NOTE: Variable Name Change. 1950 -> _1950

...

 

Moreover, I find it hard to transpose the dataset into normal shape, i.e., years on the first column and industry names as variable names. I attached a sample of the xlsx file. 

 

Esteemed Advisor
Posts: 5,401

Re: Import a xlsx file with years in the 8th row

1948 is not a valid SAS variable name, thats why it is given a "_" prefix. The commands to get a transposed version should look like:

 

proc transpose data=myDataset out=myNormalShape  name=yearStr;
id industry;
var _: ;
run;

data myNormalShapeDataset;
set myNormalShape;
year = input(substr(yearStr,2), best.);
drop yearStr;
run; 
PG
Super Contributor
Posts: 500

Re: Import a xlsx file with years in the 8th row

Depending on what you want to do with the data after transposing, i would almost always keep the in the industry names as values. This makes group-processing easier.

SAS Super FREQ
Posts: 343

Re: Import a xlsx file with years in the 8th row

Use the VALIDVARNAME and VALIDMEMNAME options to support non-SAS names:

 

options validvarname=any validmemname=extend;

%let PATH=path-to-your-file;

proc import out=work.mydataset
  datafile="&PATH\Sample.xlsx"
  dbms=xlsx
  replace;
  range='Sheet1$c8:y100';
  getnames=yes;
run; quit;

 

Vince DelGobbo

SAS R&D

 

Ask a Question
Discussion stats
  • 3 replies
  • 180 views
  • 0 likes
  • 4 in conversation