RahulM
Fluorite | Level 6

Hi There,

 

I am running a t-test on a set of observations. There is one observation that is a massive outlier and I wanted to rerun the test without this observation. However, I don't want to go through the rigmarole of completely deleting the data point. Is there a way to get SAS to run the t-test without a certain observation? 

 

Cheers!

1 ACCEPTED SOLUTION

Accepted Solutions
RahulM
Fluorite | Level 6

For anyone wondering, I just managed to do this by adding the following to my T-test procedure. 

 

WHERE pt_no NE 10; 


6 REPLIES 6
Reeza
Super User

If you have some way to uniquely identify the observation then yes, use a WHERE statement in your proc ttest. 

Typically a WHERE statement can be included in almost all Procs. 

 

Where obs_id ne 10;
RahulM
Fluorite | Level 6

So do I need to add another statement after the WHERE statement to ask SAS to remove the observation from the analysis? Or do I simply include the WHERE statement in the proc and it will run without observation 10?

 

PROC TTEST data=ctpeat3;
CLASS CTPPP;
var eatv;
RUN;

 

Thanks Reeza, you're a beast. 

Reeza
Super User

Do you have observation IDs?

 

You include the WHERE statement in your proc, and SAS will automatically exclude that observation from the analysis.

 


PROC TTEST data=ctpeat3;
  WHERE obs ne 10; /* keep every observation except the one with ID 10 */
  CLASS CTPPP;
  VAR eatv;
RUN;
RahulM
Fluorite | Level 6

Hey Reeza,

 

Can you exclude multiple observations using this method? It seems where statements can't be combined with an OR statement, is that right? 

 

Cheers

FreelanceReinh
Jade | Level 19

If I may step in here, logical operators such as AND, OR, NOT are commonly used in WHERE statements. As always in programming, you have to use the correct syntax, though.

 

Example: Let's say you want to exclude a patient if they have patient number 10 or 13. In SAS, the OR operator cannot be placed between the numbers 10 and 13, as is possible in natural language. It must be placed between two expressions that each evaluate to true or false:

where not (pt_no=10 or pt_no=13);

This is logically equivalent to:

where pt_no ne 10 and pt_no ne 13;

But typically, this would be written (again equivalently) using the IN operator:

where pt_no not in (10, 13);

Or even shorter:

where pt_no ~in (10 13);

(The abbreviation of NOT as a tilde (~) might not be available on all keyboards, but you can use the caret (^) instead. The comma is optional in lists used with the IN operator. These lists may contain more than two values.)
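
As a rough sketch of how one of these conditions fits into the procedure posted earlier in the thread, here is the IN form placed inside PROC TTEST. The data set and variable names (ctpeat3, CTPPP, eatv, pt_no) are taken from the posts above; patient numbers 10 and 13 are just the example values used here.

```sas
/* Rerun the t-test without patients 10 and 13.            */
/* The WHERE statement filters rows for this step only;    */
/* the data set ctpeat3 itself is not modified.            */
proc ttest data=ctpeat3;
    where pt_no not in (10, 13);
    class CTPPP;
    var eatv;
run;
```

Removing or commenting out the WHERE statement brings the excluded patients back into the analysis, so nothing has to be deleted from the data.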

 

You wrote that the observation to be excluded was a "massive outlier." In this case it could be an alternative to replace the "hard-coded" condition on PT_NO by a condition on the measurement variable(s) which characterize the data point as an extreme outlier (e.g. where 10 <= bmi <= 60;).
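
A minimal sketch of that idea applied to the t-test from this thread, assuming the outlier shows up in the analysis variable eatv. The limits 0 and 100 are purely hypothetical placeholders, not values from the original data, and would need to be replaced with bounds that make sense for the actual measurement.

```sas
/* Exclude implausible values by range instead of by patient number. */
/* 0 and 100 are hypothetical limits chosen only for illustration.   */
proc ttest data=ctpeat3;
    where 0 <= eatv <= 100;
    class CTPPP;
    var eatv;
run;
```

Note that a range condition like this also filters out missing values of eatv, because SAS treats a numeric missing value as smaller than any number.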
