Dear all,
I need to work with PISA data. This is a large data set with many observations and variables.
I want to delete observations (rows) with missing values, using listwise deletion.
I tried: proc corr data = pisastu2015vla nomiss;
But when I do this I get: ERROR: The NOMISS option is specified, but all variables have at least one missing value for all observations in the input data set.
I don't understand how this is possible, because my teacher told me to use listwise deletion and delete rows with missing values.
Can somebody help me?
So where did you delete the observations? Show the code you used to delete the observations.
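The NOMISS option on the PROC CORR statement excludes any observation that has a missing value in any analysis variable. Since you did not list any variables on a VAR statement, all numeric variables were treated as analysis variables. With this many variables, apparently every observation has a missing value in at least one of them, so no observations are left to analyze, which is what the error is telling you.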
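To see how the missing values are spread across the variables, a quick count per variable can help. This is only a sketch: it assumes the data set name from the question and relies on PROC MEANS analyzing all numeric variables when no VAR statement is given.

```sas
/* Count non-missing (N) and missing (NMISS) values for every numeric variable. */
/* Without a VAR statement, PROC MEANS uses all numeric variables.              */
proc means data=pisastu2015vla n nmiss;
run;
```

Variables with a very large NMISS are likely the ones that leave no complete rows when NOMISS is applied to everything at once.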
I think you should describe what you intend by 'listwise deletion'.
Typically "delete" means to remove records from a data set, as in this pseudo code:
data new;
set old;
if <some condition is met> then delete;
run;
and then you use the NEW data set.
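As a rough illustration of that pattern for listwise deletion on a chosen set of variables, something like the following might work. The variable names PV1MATH, PV1SCIE and ESCS and the output data set name pisa_complete are assumptions for illustration only; substitute the variables your analysis actually needs.

```sas
data pisa_complete;                 /* hypothetical output data set name */
    set pisastu2015vla;
    /* CMISS counts missing values among the listed variables            */
    /* (numeric or character). Variable names here are assumptions.      */
    if cmiss(of PV1MATH PV1SCIE ESCS) > 0 then delete;
run;
```

Restricting the check to the variables you analyze keeps far more rows than checking every numeric variable in the file.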
If your interest is to run PROC CORR, then either do not use the NOMISS option (at least with this data set), or use data set options and/or a WHERE statement to restrict the analysis to records where specific variables are not missing.
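A minimal sketch of both options, again with assumed variable names (PV1MATH, PV1SCIE, ESCS) that you would replace with your own:

```sas
/* Option 1: NOMISS applied only to the variables named on VAR,          */
/* so listwise deletion considers just these three (assumed) variables.  */
proc corr data=pisastu2015vla nomiss;
    var PV1MATH PV1SCIE ESCS;
run;

/* Option 2: no NOMISS; a WHERE statement keeps only records where       */
/* selected variables are present. Other missing values are then handled */
/* pairwise, which is the PROC CORR default.                             */
proc corr data=pisastu2015vla;
    where not missing(PV1MATH) and not missing(ESCS);
    var PV1MATH PV1SCIE ESCS;
run;
```

Neither option changes the input data set; the rows are only excluded from that one analysis.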
BTW, instead of taking the time to make an image of your log or editor, you can copy directly from the LOG or the Editor and paste into a code box opened using either the {I} or "running man" icon found in the forum's message window menu.
That way we can copy and modify the text to indicate corrections or highlight items to discuss, or even run data steps using inline data to create data sets to manipulate.
Try
data test;
set pisastu2015vla;
if not cmiss(_numeric_);
run;
and see what comes out.
I think the cmiss call is missing an "OF": it should be cmiss(of _numeric_).
Based on the error described in the question, the data presumably look something like this, with at least one missing value in every row:
data junk;
input x y z;
datalines;
. 2 3
1 . 3
1 2 .
;
run;
data test;
set junk;
if not cmiss(of _numeric_);
run;
which yields:
NOTE: There were 3 observations read from the data set USER.JUNK.
NOTE: The data set USER.TEST has 0 observations and 3 variables.