Re: Question on proc corr output

Amali6 · Posted 05-22-2020 02:54 PM

Hi all,

I have tried PROC CORR for the first time to find the relationship between two variables in my dataset. I used the following code and got this output. I don't understand whether it shows an increasing or a decreasing trend, which is mainly what I wanted to check.

ods graphics on;
title 'To find relation between variables';
proc corr data=hotel.Hotel_bookings plots(MAXPOINTS=NONE)=all;
var lead_time booking_changes;
run;
ods graphics off;

Is this the right output, please?

Can someone please explain whether these are the right plots for these variables, and whether this is how they should look?

Thank you

PGStats · Posted 05-22-2020 03:06 PM

What this statistic is telling you is that there is a very weak, and not statistically significant, tendency of booking changes to increase with lead time.

The scatter graph might be more visually informative if you added some jitter to the discrete booking change values and made the aspect ratio of the graph closer to a square.
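A minimal sketch of that suggestion, assuming the dataset and variable names from the original post. The output dataset `work.bookings_jit`, the variable `booking_changes_jit`, the jitter width of 0.6, the seed, and the 5in plot size are all illustrative choices, not anything taken from the thread:

```sas
/* Sketch only: jitter width (0.6) and seed are arbitrary choices */
data work.bookings_jit;
   set hotel.Hotel_bookings (keep=lead_time booking_changes);
   if _n_ = 1 then call streaminit(12345);
   /* shift each discrete count by a small uniform offset within +/- 0.3 */
   booking_changes_jit = booking_changes + (rand('uniform') - 0.5) * 0.6;
run;

/* equal width and height give a square plot area */
ods graphics / width=5in height=5in;
proc sgplot data=work.bookings_jit;
   scatter x=lead_time y=booking_changes_jit /
           markerattrs=(symbol=circlefilled size=3pt);
   yaxis label='Booking changes (jittered)';
run;
ods graphics / reset;
```

The jittered variable is for plotting only. Any correlation should still be computed on the original `booking_changes`.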

PG

Reeza · Posted 05-22-2020 03:58 PM

I suspect it's only statistically significant because you have a large number of observations, 119330 obs.

However, there is no clear linear relationship present between lead time and booking changes given the graph shown.

Amali6 · Posted 05-22-2020 04:48 PM

Thank you very much for the response. I am trying to understand correlation. In this scatter graph the p-value is 0.9590, so how is the relationship very weak? Sorry, I can't tell what to look at to say whether a correlation plot shows a strong or a weak relationship. I tried some other variables in my dataset, but the output looks like a line running between the two variables. Also, could you please explain why the minimum value is 0?

Thank you

lvm · Posted 05-22-2020 04:56 PM

There is no observable relationship between the two variables, and the correlation is not significant.

But be careful: with so many observations, it is easy to find significant correlations even when there is no meaningful or predictable relationship.

Amali6 · Posted 05-22-2020 05:03 PM

Thank you. Since mine is a big dataset, the points are plotted very close together and I can't read the plot clearly. Could you please explain how to tell whether there is a relationship or not?

Thanks.

Reeza · Posted 05-22-2020 05:03 PM

There is no linear relationship between your two variables.
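The word "linear" matters here: the Pearson coefficient only measures straight-line association. As a small illustration on made-up data (the dataset `work.curve` and its variables are hypothetical and have nothing to do with the hotel data), a perfect curved relationship can still give a Pearson correlation of essentially zero:

```sas
/* Made-up data: y depends entirely on x, but along a symmetric curve */
data work.curve;
   do x = -3 to 3 by 0.5;
      y = x**2;
      output;
   end;
run;

/* Pearson r is essentially 0 because the rising and falling halves cancel */
proc corr data=work.curve pearson;
   var x y;
run;
```

So a near-zero Pearson correlation rules out a straight-line trend, not every possible pattern, which is why looking at the plot is still worthwhile.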

Reeza · Posted 05-22-2020 05:00 PM

Your correlation is 0.00015, so nearly zero, and your p-value is 0.9590.

This means you have no linear correlation and it's not statistically significant.
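To see how those two numbers relate, here is a rough sketch of the usual t test for a Pearson correlation, t = r * sqrt((n - 2) / (1 - r**2)) with n - 2 degrees of freedom. It assumes n = 119330 with no missing values (the count quoted earlier in the thread). The dataset `work.r_check` and the second value, r = 0.01, are hypothetical and included only for contrast:

```sas
data work.r_check;
   n = 119330;               /* observation count quoted in this thread */
   do r = 0.00015, 0.01;     /* reported r, then a hypothetical r for contrast */
      t = r * sqrt((n - 2) / (1 - r**2));   /* t statistic for H0: rho = 0 */
      p = 2 * (1 - probt(abs(t), n - 2));   /* two-sided p-value */
      output;
   end;
   format r 8.5 t 8.3 p pvalue6.4;
run;

proc print data=work.r_check noobs;
run;
```

For the reported r, the p-value should come out near the 0.9590 shown by PROC CORR (small differences are expected because r is rounded). The hypothetical r = 0.01 is still a negligible correlation, yet with this many observations its p-value falls well below 0.05, which is the caution raised earlier about large samples.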

Amali6 · Posted 05-22-2020 05:23 PM

Thank you very much for explaining. I tried other variables and got a line running from the left side of the graph to the right. Can you help me with what kind of relationship that indicates, please?

Thanks

Reeza · Posted 05-22-2020 05:36 PM

@Amali6 wrote:
Thank you very much for explaining. I tried other variables and got a line running from the left side of the graph to the right. Can you help me with what kind of relationship that indicates, please?

Thanks

Not without seeing the graphs and numbers.

Amali6 · Posted 05-22-2020 05:42 PM

Sorry this is what i got.

Thanks

ballardw · Posted 05-22-2020 05:43 PM

You might try SPEARMAN correlation which is more concerned with direction of change than magnitude.

A Spearman correlation close to 1 means that as one variable increases in value so does the other; a value close to -1 means that when one variable increases the other decreases.

You can see if this is interesting by adding the option SPEARMAN to the Proc Corr statement.
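For example, using the dataset and variables from the original post, the call might look like the sketch below. PEARSON is listed explicitly because, once another correlation option such as SPEARMAN is given, the Pearson table is no longer produced by default:

```sas
/* Request both the Pearson and the Spearman (rank) correlation tables */
proc corr data=hotel.Hotel_bookings pearson spearman;
   var lead_time booking_changes;
run;
```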

Many times the plots the statistics procs generate are not the clearest.

Try this with your data

proc sgplot data=hotel.hotel_bookings;
   scatter x=lead_time y=booking_changes / 
                      markerattrs=(symbol=circlefilled size=3pt) 
                      transparency=.9 
   ;
run;

The transparency setting close to 1 means that the markers will be very faint. But when multiples are drawn in the same location the color density gets stronger. So sometimes you can see underlying patterns inside the data.

Amali6 · Posted 05-22-2020 06:07 PM

Thank you. Can you explain how to judge from the correlation whether two variables are correlated or not, please?

Reeza · Posted 05-22-2020 06:33 PM

The free SAS Stats course likely covers this topic, as does Khan Academy or a variety of free resources online.

https://www.khanacademy.org/math/statistics-probability/describing-relationships-quantitative-data/i...

https://stats.idre.ucla.edu/sas/output/proc-corr/

Amali6 · Posted 05-22-2020 06:52 PM

Thank you very much this helps a lot!!
