deleted_user
Not applicable
Hello All,

I want to remove an observation from the data. Suppose that I'm predicting Y from A, B and C, and that I initially removed, say, D from the analysis. Now, if I do not want observation 50, say, in my analysis, which variable must I use in the WHERE condition?

The code I used is :-

proc reg data="c:\sasreg\crime";
model Y=A B C;
where D ne "50";
run;
quit;

My question is: since D is not in my model, will that still delete observation 50 from the model I'm interested in?

Kind Regards,
markc
Cynthia_sas
SAS Super FREQ
If variable D holds the observation number, then your logic would be correct. However, if D was originally one of your analysis variables, I find it hard to understand what purpose would have been served by having the observation number in your model.

Generally, to use WHERE logic to remove an observation from being used by a SAS procedure, you use criteria other than observation number, because the observation number could change if you performed a sort on the dataset. What was observation 50 in one sorted order could become observation 214 in another sorted order. So, for example, if I did this:
[pre]
proc reg data=sashelp.class;
model age = weight;
where name ne 'Alfred';
run;
quit;
[/pre]

Then the observations where name was not equal to "Alfred" would be the only ones passed to PROC REG. As you can see, if you run the program, even though the NAME variable is not in the model, the WHERE statement will do its job and PROC REG will only get 18 observations to process. SASHELP.CLASS has 19 observations -- so with the exclusion of the observation for Alfred, only 18 observations get passed to PROC REG.

You might have to find another way to exclude observation 50, such as finding a unique combination of conditions that would identify that observation and only that observation. So the answer to your question is 1) the variable in your WHERE statement does not have to be in your MODEL statement in order to exclude observations; however 2) your particular WHERE statement will only delete observation #50 if variable D is holding the observation number or ID number that uniquely identifies observation #50 and only observation #50.
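If no existing variable identifies the observation uniquely, one possible approach is to store the current row position in a new variable before any sorting, and then filter on that variable. The sketch below is only an illustration: it reuses SASHELP.CLASS from the example above, and WORK.CLASS_ID and OBS_ID are hypothetical names chosen for this example. Because SASHELP.CLASS has only 19 rows, it drops row 5 rather than row 50.

[pre]
/* Assumption: work.class_id and obs_id are illustrative names only. */
data work.class_id;
  set sashelp.class;
  obs_id = _n_;        /* save the current row position as a numeric ID */
run;

proc reg data=work.class_id;
  model age = weight;
  where obs_id ne 5;   /* numeric comparison, so no quotes around 5 */
run;
quit;
[/pre]

Since OBS_ID is stored in the dataset, it stays attached to the same row even if the dataset is sorted later. Note that it is numeric here, whereas the original WHERE statement compares D to the character value "50".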

You are the only one who can answer #2 -- it's your data -- does variable D uniquely identify observation 50 and only observation 50? What does variable D represent? What is the range of values in variable D?
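One way to check this might look like the following sketch. It assumes the dataset is the CRIME file in c:\sasreg from the original post; the libref SASREG is a hypothetical name introduced here.

[pre]
/* Assumption: sasreg is an illustrative libref for the folder in the original post. */
libname sasreg "c:\sasreg";

/* Is D numeric or character? */
proc contents data=sasreg.crime;
run;

/* Which values does D take, and how often does each occur? */
proc freq data=sasreg.crime;
  tables D / missing;
run;

/* Print only the row currently in position 50 to see its value of D. */
proc print data=sasreg.crime(firstobs=50 obs=50);
run;
[/pre]

If the PROC FREQ output shows the value of D for that row occurring more than once, a WHERE statement on D alone would exclude the other rows with that value as well.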

cynthia
