Hi there,
I need to import a xlsx data into SAS. The file has data for different industries and different years, and useful data begins in the 8th row. The below is an example.
Line | 1948 | 1949 | 1950 | 1951 | 1952 | ||
1 | Compensation of employees | A033RC0 | 144,470 | 144,512 | 158,465 | 185,927 | 201,341 |
2 | Domestic industries | A4002C0 | 144,384 | 144,454 | 158,387 | 185,913 | 201,339 |
3 | Private industries | A4003C0 | 121,529 | 119,169 | 131,318 | 150,726 | 161,201 |
4 | Agriculture, forestry, and fishing | A4004BC0 | 3,318 | 3,142 | 3,183 | 3,339 | 3,313 |
5 | Farms | B4005BC0 | 3,000 | 2,817 | 2,821 | 2,932 | 2,868 |
6 | Agricultural services, forestry, and fishing | J4006C0 | 319 | 326 | 363 | 408 | 446 |
7 | Mining | A4007BC0 | 3,596 | 3,181 | 3,502 | 3,987 | 4,056 |
When I try to import it with code:
proc import out = mydataset
datafile = "&path\Section6All_Hist"
dbms = xlsx replace;
range = 'C8:X92';
getnames=yes;
run;
SAS reports:
NOTE: Variable Name Change. 1948 -> _1948
NOTE: Variable Name Change. 1949 -> _1949
NOTE: Variable Name Change. 1950 -> _1950
...
Moreover, I find it hard to transpose the dataset into normal shape, i.e., years on the first column and industry names as variable names. I attached a sample of the xlsx file.
1948 is not a valid SAS variable name, thats why it is given a "_" prefix. The commands to get a transposed version should look like:
proc transpose data=myDataset out=myNormalShape name=yearStr;
id industry;
var _: ;
run;
data myNormalShapeDataset;
set myNormalShape;
year = input(substr(yearStr,2), best.);
drop yearStr;
run;
Depending on what you want to do with the data after transposing, i would almost always keep the in the industry names as values. This makes group-processing easier.
Use the VALIDVARNAME and VALIDMEMNAME options to support non-SAS names:
options validvarname=any validmemname=extend;
%let PATH=path-to-your-file;
proc import out=work.mydataset
datafile="&PATH\Sample.xlsx"
dbms=xlsx
replace;
range='Sheet1$c8:y100';
getnames=yes;
run; quit;
Vince DelGobbo
SAS R&D
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.