Proc Corr not excluding rows with zeroes

Firstly, I just started using SAS two weeks ago. Apologies if this question is basic. Secondly, I am using Enterprise Guide 7.1. Thirdly, I am not getting errors.

 

I am trying to correlate four independent variables (IV) against a dependent variable (DV). I know that the independent variables contain zeroes, and I don't want to include those values when the Pearson correlation coefficients are calculated.

 

The dependent variable contains only non-negative rational numbers, and so do the independent variables (which do include zeroes, though).

 

This is what I am running:

 

ODS GRAPHICS ON;

PROC SORT DATA=WORK.mf15126(KEEP= DV IV1 IV2 IV3 IV4 ProductCode)
	OUT=WORK.SORTTempTableSorted
	;
	BY ProductCode;
RUN;

PROC CORR DATA=WORK.SORTTempTableSorted
	PLOTS=SCATTER
	PEARSON
	EXCLNPWGT
	VARDEF=DF
	;
	BY ProductCode;
	WHERE ProductCode eq "blah";
	VAR DV;
	WITH IV1 IV2 IV3 IV4;
RUN;

That produces some values. But when I run the same code without EXCLNPWGT, the correlation coefficients are the same. Also, the scatterplot shows that SAS is including the zeroes in the IVs when plotting.

 

I then went to R, separated all my DV, IV pairs into different data frames, dropped the rows with zeroes, and ran the cor function, and the results were different.

 

Can someone please tell me if I am doing something wrong in the code (or if I am not using the proc corr the way I am supposed to)?

 

Thank you!


Accepted Solutions

Re: Proc Corr not excluding rows with zeroes

To exclude values from the calculations you would need to assign them a missing value instead of 0. Alternatively, you could exclude the rows with zero for the offending variable. If you use this approach you would want to do one variable at a time:

PROC CORR DATA=WORK.SORTTempTableSorted
	PLOTS=SCATTER
	PEARSON
	EXCLNPWGT
	VARDEF=DF
	;
	WHERE ProductCode eq "blah" and IV1>0;
	VAR DV;
	WITH IV1 ;
RUN;

If you filter on two variables at once, you would likely remove rows that are still valid for one or the other variable in the Corr calculations.
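
The first option (zeroes recoded to missing) can be sketched roughly as below. This is only a sketch under some assumptions: WORK.mf15126_nz is a made-up output name, and it relies on PROC CORR's default handling of missing values, which uses all nonmissing pairs for each VAR/WITH pair as long as NOMISS is not specified. That way a zero in IV1 does not throw away the row for IV2.

```sas
ODS GRAPHICS ON;

/* Recode zeroes in the IVs to missing.                         */
/* WORK.mf15126_nz is a hypothetical name for the recoded copy. */
DATA WORK.mf15126_nz;
	SET WORK.mf15126(KEEP= DV IV1 IV2 IV3 IV4 ProductCode);
	ARRAY ivs {*} IV1 IV2 IV3 IV4;
	DO i = 1 TO DIM(ivs);
		IF ivs{i} = 0 THEN ivs{i} = .;
	END;
	DROP i;
RUN;

/* Without NOMISS, each DV/IV pair uses only rows where both are nonmissing. */
PROC CORR DATA=WORK.mf15126_nz
	PLOTS=SCATTER
	PEARSON
	;
	WHERE ProductCode eq "blah";
	VAR DV;
	WITH IV1 IV2 IV3 IV4;
RUN;
```

EXCLNPWGT is left out here on purpose. As far as I know it only excludes observations with nonpositive weights when a WEIGHT statement is used, so it has nothing to do with zeroes in the analysis variables. That would also explain why the coefficients did not change with or without it.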

 

Note that if you have a Where ProductCode = 'blah' then the BY ProductCode is meaningless, because only one BY group is left. BY is really only useful with 2 or more levels of the BY variable (especially in procs).
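
For comparison, a rough sketch of where BY does earn its keep: drop the ProductCode filter and let PROC CORR produce one set of coefficients per product. It assumes the same data set and variable names as in the question and keeps only IV1 for brevity.

```sas
/* BY-group processing needs the data sorted by the BY variable. */
PROC SORT DATA=WORK.mf15126(KEEP= DV IV1 ProductCode)
	OUT=WORK.SORTTempTableSorted
	;
	BY ProductCode;
RUN;

/* One correlation table per ProductCode; rows with IV1 = 0 are filtered out. */
PROC CORR DATA=WORK.SORTTempTableSorted
	PEARSON
	;
	BY ProductCode;
	WHERE IV1>0;
	VAR DV;
	WITH IV1 ;
RUN;
```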

 

 

Without seeing actual data, actual output and desired output it is hard to say what else may be going on.

 

To work an example by hand, you likely don't want many rows of data and may only need one of the IV variables.
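
Something along these lines would be small enough to check by hand. The values are invented purely for illustration (they are not from the poster's data), and WORK.tiny is a made-up name. The three rows with IV1 > 0 lie exactly on the line DV = IV1 / 2, so the filtered Pearson coefficient should come out as 1, while the unfiltered run also counts the two zero rows and should give something different.

```sas
/* Made-up toy data: five rows, one IV. */
DATA WORK.tiny;
	INPUT DV IV1;
	DATALINES;
1 2
2 0
3 6
4 8
5 0
;
RUN;

/* All five rows, zeroes included. */
PROC CORR DATA=WORK.tiny PEARSON;
	VAR DV;
	WITH IV1 ;
RUN;

/* Only the three rows with a nonzero IV1. */
PROC CORR DATA=WORK.tiny PEARSON;
	WHERE IV1>0;
	VAR DV;
	WITH IV1 ;
RUN;
```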

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.



All Replies
Re: Proc Corr not excluding rows with zeroes

 

Thanks for that explanation. That's exactly what I did in R. I was hoping SAS had a more automatic way of doing that but I guess not. 

 

I was trying to graph different variables as part of my EDA to verify some sort of correlation before inputting the dataframe into the algorithm. 

 

Also, thanks for the tip with the By statement!

 

Thank you!
