07-27-2011 12:20 PM
I am importing )Proc Import) ~50 spreadsheets to SAS. There are couple of challanges.
(1) variable names are in Row 9 and data begins in row 10. I tried to use datarow= option with getnames=. My variable names appear as first row instead of header in SAS. If I use datarow=10 and use getnames=yes, I get the data without any variable names
(2) Data has footnotes. How to delete them?
Since I am importing multiple workbooks, the row that footnote appears varies. E.g One.xlx has footnote in row 15 and two.xls has footnote in row 30. But the variable name and the first row of data alwys are in row 9 and 10. Below is a snap shot of my spreadsheet
V1 V2 V3 V4
I would like to use a macro instead of using 50 proc import statement. Any suggestions would help.
07-27-2011 12:33 PM
50 workbooks(xls files) or 50 sheets in a single workbook. Either way I think NAMED RANGES will be the key. How you define them is another question, manually or automated.
Post example workbook(s).
07-27-2011 01:08 PM
If you always have the same variables in each sheet, then the task can be simplified. See if something like the following might work (Note: the range statement includes the row with the variable names):
PROC IMPORT OUT= WORK.TESTxl
if nmiss(of V1-V4) then stop;
07-27-2011 01:30 PM
Thanks. But it didn't work. With the code I get additional columns other than the variable name I need and there is no data.
I used if nmiss(of name age sex) then stop; Since my variables are not named as v1-v4. I used them for e.g. purposed. But the variables in all the worksheets are same.
I partly solved the problem with below code.
Proc import out=x
namerow=9; /*this helped to use the values as variables names*/
With this code my data looks
Name Age Sex
I am yet to figure out how to remove the footnote.
Also since there are several datasets that I am importing and the row with footnote is not consistent I am not sure how to put them in macro.
07-27-2011 02:35 PM
Doe the footnote always only take up one cell? And, are name and sex, or name and age always present? If so, in the datastep you might be able to use if nmiss(name sex) ge 1 then stop;
07-27-2011 03:35 PM
Yes, Footnote always takes up one cell.
Yes, Name, age, and sex all are always present.
Still doesn't work. If I use
if nmiss(name sex age) ge 1 then stop;
I get Error 388-185: expecting an arithmetic operator
If I use
if nmiss(name) ge 1 then stop; or if nmiss(name, age, sex) ge 1 then stop;
I get NOTE: variable name is uninitialized and no data
I am replacing the variable names, I am not sure that is causing these errors or something else that I am not seeing.
Thanks once again.
07-27-2011 03:47 PM
Try the cmiss function. e.g.,
input (name sex) ($) age;
John M 20
Mary F 25
Harry M 40
if cmiss(name,sex,age) gt 1 then stop;
07-28-2011 03:46 PM
Thanks. Still did not work. I find a way around. I used the namerow and datarow options in Proc Import. The data had footnote. Then in dataset I retained obs for the id variables that were not missing. this helped to get the desired result.
thanks a lot again for all who responded
07-29-2011 08:21 AM
Sure. Below is my code.
Proc import out=x;
Where variable1 ne ' ';
For my case footnotes in all the 50 spreadsheets starts in column2 instead of 1. If my footnotes were to start in column1 I was planning to see if I could use 'nodigit' function since my column1 has numbers.