10-28-2017 01:03 AM
I need to import a xlsx data into SAS. The file has data for different industries and different years, and useful data begins in the 8th row. The below is an example.
|1||Compensation of employees||A033RC0||144,470||144,512||158,465||185,927||201,341|
|4||Agriculture, forestry, and fishing||A4004BC0||3,318||3,142||3,183||3,339||3,313|
|6||Agricultural services, forestry, and fishing||J4006C0||319||326||363||408||446|
When I try to import it with code:
proc import out = mydataset
datafile = "&path\Section6All_Hist"
dbms = xlsx replace;
range = 'C8:X92';
NOTE: Variable Name Change. 1948 -> _1948
NOTE: Variable Name Change. 1949 -> _1949
NOTE: Variable Name Change. 1950 -> _1950
Moreover, I find it hard to transpose the dataset into normal shape, i.e., years on the first column and industry names as variable names. I attached a sample of the xlsx file.
10-28-2017 01:19 AM
1948 is not a valid SAS variable name, thats why it is given a "_" prefix. The commands to get a transposed version should look like:
proc transpose data=myDataset out=myNormalShape name=yearStr; id industry; var _: ; run; data myNormalShapeDataset; set myNormalShape; year = input(substr(yearStr,2), best.); drop yearStr; run;
10-28-2017 05:51 AM
Depending on what you want to do with the data after transposing, i would almost always keep the in the industry names as values. This makes group-processing easier.
10-29-2017 03:35 PM
Use the VALIDVARNAME and VALIDMEMNAME options to support non-SAS names:
options validvarname=any validmemname=extend; %let PATH=path-to-your-file; proc import out=work.mydataset datafile="&PATH\Sample.xlsx" dbms=xlsx replace; range='Sheet1$c8:y100'; getnames=yes; run; quit;