Hi all, first time poster here. I did try to searching Google and this website in particular for something related to my issue which I imagine must be basic but I couldn't find anything. Essentially I have a dataset that I cleaned in Excel and uploaded to SAS. When I run Summary Statistics it says 4186 observations (as it should) but then it shows over a million missing values which are the blanks underneath the data. Is there some setting that I have wrong?
This is my summary statistics:
Run a data step after the import with a WHERE condition that filters the empty values out.
You did nothing wrong. The designers of Excel did, at least in my opinion.
Excel files often have what I call "phantom" data. At some time the file likely had some values that were deleted. These could be entire rows or columns. However Excel considered these cells "used" and the engines used to import this data creates missing values in SAS.
I have received data with upwards of 100,000 rows of such garbage because of repeated manual edits of the files.
Options are either to Save the data as CSV, write a data step to read the data and either edit the CSV to remove the rows of commas that result or add a statement like this after the input.
If _infile_ =: ',,,,,' then delete;
Or take a pass at the data with a data step to keep records where the first few variables are all missing (use your variable names of course)
If you get a bunch of variables from reading the data as Excel with names like Var15 Var16 Var17 (or similar) that can be an indication of columns with missing values and after checking the data you can drop those variables.
data have; set have; if missing(var1) and Missing(var2) and missing(var3) then delete; run;
If you want to test this follow these simple steps:
Create a new Excel file.
Enter data in the first column for 5 to 10 rows. First row should be column heading like a variable name.
Delete all the data except the first two rows.
Import that file.
See if the number of observations matches your initial rows with all but one missing values.
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.