Thanks for your interesting code. But maybe I did not present my issue clear enough. What I am looking for is to elminating all the varaibles(not variable value) from a dataset when ALL their values are blank or missing. For example, If I am reading from an excel spreadsheet with
I want to elminate all the excel blank columns in my dataset file3.
Another possible method would be to use PROC TRANSPOSE.
First read all the rows and eliminate all blank rows which can be easily achieved, as suggested by Chuck.
The you can use PROC TRASNPOSE which will convert the columns into rows, so again you have the same problem of eliminating blank rows. That being done you can transpose again to get back your original data.
A bit of attention should be paid while handling the header rows, that is all.
Eliminating blank rows is easiest and can be done in the initial read.
Test if each column is missing/blank, if any column is not missing/blank, then output that row.
You should know in advance which columns are of interest so that you don't have to test all 255 Excel columns.
SAS can easily control which columns are in a table through the use of keep and drop (both statements and dataset options). But, programmatically determining which to keep or drop is non-trivial. I expect this will require some MACRO programming to create a KEEP or DROP statement to use in a DATA step or a SELECT statement to use with PROC SQL; CREATE TABLE _________ as ...;
Thanks for your reply. I just hoping for some easier way to get this done but I guess I have to settle to converting the dataset variables into an macor array and then examine them one by one.in a macro do loop.
One more thing, even I define to read in 255 column and 64000 rows, SAS do eliminated a lot of blank columns and rows. In this case I was ended up with 11 columns and 300 rows in the dataset file3. However, out of the 11 columns there are about 5 blank colums (those columns SAS assign variables name such F2,F3..) and about 75 blank rows. The excel spread sheet actual has 6 column and 225 rows of data. I do not know how SAS can eliminae most rows and columns and not the others. It must have something to do with the way excel was set up that SAS think it is not an empty columns/row. Do you have any experience in readin excel in ?
> Yes, we do read in Excel spreadsheets.
> If your code is correct, it will drop all the columns
> and all the rows properly.
I find that within a spreadsheet if all the cell values in a column/row was delete without deleting the column/row, then SAS will read in the coulmn as a valid SAS variable and the row as valid obs even they are both total empty. So if I was handed a spreadsheet to read into SAS without knowing some of the column/row were previous non blank. How can I instruct SAS not to read those empty but previously non-empty column/row ?
Can't, you'll have to filter those out after the fact.
That is, you will have to filter the empty rows while reading in the data, and then filter out empty columns later.
We use Excel spreadsheets as a simple mechanism to hold configuration information. It is easy to edit and print the information. The first row contains the field names, and all non-used rows and columns are "deleted" or "empty" so that SAS considers them "missing" and doesn't read them. Since the first row contains the field names, SAS automatically assigns variables (variable names and labels) to those columns, and all subsequent rows are read in as data. Our code then assumes specific column names. If you are being handed amorphous Excel spreadsheets, then your job is going to be a lot harder, but I would also question as to "why?" what is the purpose? The point of SAS is to do data analysis against either massive amounts of data, or in very sophisticated manners, perhaps applying highly sophisticated (and preprogrammed in "proc"s) analyses to the data. If the input are unknown Excel spreadsheets, then how is the system going to know what to do with the data?
Message was edited by: Chuck
I have two suggestions, one manual and one hopefully automated.
When you get an Excel file and find these empty columns occurring in your SAS import, then open up the Excel file in MS Excel and delete the empty columns (don't just delete the cells again, actually delete the column). Resave the file and SAS should work better.
From one of the earlier posts, it was mentioned that SAS assigns it's own column names (e.g. F2, F3, . . . ) to the blank columns. Maybe you can just drop all the column names that fit this format. For example.
infile cards missover;
input FirstCol SecondCol F1 F2 F3;
drop F1 - F256; *Drop any columns that start with 'F' followed by a number up to 256;
Be careful not to drop valid column names that start with 'F' in this example. You might have to tweak your column names to handle this properly.