scarico
Calcite | Level 5

Dear SAS-Community!

I'm struggling with counting significant p-values separately for positive and negative correlation coefficients.

After running "proc corr" with "ods output PearsonCorr" the following stylized table will be generated:

Variable     x1      x2      x3      p1      p2      p3
x1            1     0.8   -0.65       -    0.05   0.023
x2          0.8       1    -0.4    0.05       -    0.12
x3        -0.65    -0.4       1   0.023    0.12       -
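
For reference, a table of this shape could come from a call like the sketch below. The input data set name "mydata" is hypothetical, and "corout" is just the name used for the output table later in this thread. As far as I know, the real PearsonCorr table names the p-value columns by prefixing "P" to each variable name (Px1, Px2, Px3), so p1-p3 above is a simplification.

```sas
/* Sketch only: "mydata" is a hypothetical input data set. */
proc corr data=mydata;
   var x1-x3;
   /* Save the correlation matrix and its p-values as a data set. */
   ods output PearsonCorr=corout;
run;
```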

I have an idea of how to count all significant p-values, just by keeping the variables p1, p2, p3, but I have no clue how to tell whether each one belongs to a positive or a negative correlation coefficient.

I'm thankful for every idea/code.

Best

Holger

1 ACCEPTED SOLUTION

Accepted Solutions
steve468
Calcite | Level 5

Not sure how fancy you want to be, but the simple way would be something like using arrays.

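data x;
   set corout;
   array ax{*} x1-x3;   /* correlation coefficients */
   array px{*} p1-p3;   /* matching p-values */
   row+1;               /* current row number */
   /* start right of the diagonal so each pair is checked once */
   do j=row+1 to dim(px);
      /* a missing p-value compares as less than 0.05, so exclude it */
      if not missing(px{j}) and px{j}<0.05 then do;
         if ax{j}<0 then sumneg+1;
         if ax{j}>0 then sumpos+1;
      end;
   end;
run;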

Since I am at home and don't have SAS, this is just a guess at the syntax. Hopefully you know how to use arrays.

What you want to do is, in the first row, compare x1 to x2 and x1 to x3, count the significant p-values, then check the correlations to see whether they are negative or positive.

In the second record you want to start with x2, compare it with x3, then check for significance and for the negative or positive sign.
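
To try the idea on the stylized table from the question, something like the following sketch should work. The data set names "corout" and "counts" are just placeholders, and the "-" entries on the diagonal are entered as missing values (".") here, which is an assumption about how the real table stores them.

```sas
/* Stylized table from the question; diagonal p-values are missing. */
data corout;
   input Variable $ x1 x2 x3 p1 p2 p3;
   datalines;
x1  1     0.8   -0.65  .      0.05   0.023
x2  0.8   1     -0.4   0.05   .      0.12
x3 -0.65 -0.4    1     0.023  0.12   .
;
run;

data counts;
   set corout end=last;
   array ax{*} x1-x3;   /* correlation coefficients */
   array px{*} p1-p3;   /* matching p-values */
   row + 1;
   /* Walk only the cells right of the diagonal. */
   do j = row + 1 to dim(px);
      if not missing(px{j}) and px{j} < 0.05 then do;
         if ax{j} < 0 then sumneg + 1;
         else if ax{j} > 0 then sumpos + 1;
      end;
   end;
   /* Keep only the final totals. */
   if last then output;
   keep sumneg sumpos;
run;

proc print data=counts;
run;
```

With these numbers only the x1/x3 pair (p = 0.023, r = -0.65) is below 0.05, so the result should be sumneg = 1 and sumpos = 0. The x1/x2 pair has p = 0.05 exactly, which the strict "<" does not count.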

Hope this helps.


8 REPLIES 8
scarico
Calcite | Level 5

Hi steve!

Thank you for the reply! I coded it and it works with the stylized table.

I'm going to run lots of simulations, so the number of variables and their names change in every loop.

I think the problem with the code below is that the array px consists of all variables beginning with "_" and "P", but it should only include the "P..." variables.

data B;
set A (keep = _:);
array ax{*} _numeric_;

set A (keep = P:);
array px{*} _numeric_;
row + 1;

retain sumneg . sumpos .;

do j = row to dim(px);
        if not missing(px{j}) and px{j} < 0.05 then do;
                if ax{j} < 0 then sumneg + 1;
                if ax{j} > 0 then sumpos + 1;
        end;
end;
run;

Reeza
Super User

You have two set statements and keeps? Is that correct?

scarico
Calcite | Level 5

No, I have only the output table of proc corr with correlation coefficients and the corresponding p-values. The output looks like the table in my first post. (for three variables).

Steve468 suggested to code:

array ax{*} x1-x3;

array px{*}  p1-p3;

This is correct for this little example.

The number of variables in a real data set varies between 1200 and 2000. Simulation example: _1, _2, _12, _13, ..., _1213, P_1, P_2, P_12, P_13, ..., P_1213

It is important to load the variables beginning with "_" and "P" in the two different arrays.

Reeza
Super User

I don't understand. You have two "set A" statements in your code above. Why are you doing that?

scarico
Calcite | Level 5

"set A" holds all the data I need. A is the correlation output. I think it would be right to set it only once (as Steve468 did). But I don't know how to assign all the variables beginning with "_" and "P" to the two different arrays.

The procedure should work for a varying number of "_" and "P" variables. I tried to set A twice to isolate the variables beginning with "_" and with "P", but that doesn't work.

My code is definitely wrong at this point.

Is there a way to select all variables with a common prefix for an array?

scarico
Calcite | Level 5

Well, I figured it out...

Data B;
set A;
array px{*} P:;
array ax{*} _:;
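
Put together with the counting loop from earlier in the thread, the whole step might look like the sketch below. It assumes the naming from my simulation (correlations in _1, _2, ... and p-values in P_1, P_2, ...) and that both groups appear in the same order in A. The output name "counts" and the size check are my own additions.

```sas
data counts;
   set A end=last;
   /* Name prefix lists pick up variables in data set order. */
   array ax{*} _:;     /* correlation coefficients: _1, _2, ... */
   array px{*} P_:;    /* p-values: P_1, P_2, ... */

   /* Guard: both arrays must describe the same variables. */
   if _n_ = 1 and dim(ax) ne dim(px) then do;
      put 'ERROR: ax and px differ in size: ' dim(ax)= dim(px)=;
      stop;
   end;

   row + 1;
   /* Only look right of the diagonal so each pair is counted once. */
   do j = row + 1 to dim(px);
      if not missing(px{j}) and px{j} < 0.05 then do;
         if ax{j} < 0 then sumneg + 1;
         else if ax{j} > 0 then sumpos + 1;
      end;
   end;

   /* Write one row holding the final totals. */
   if last then output;
   keep sumneg sumpos;
run;
```

Using "P_:" instead of "P:" is a small precaution so that an ordinary variable whose name happens to start with "P" does not end up in the p-value array.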

Thanks everybody!

Rick_SAS
SAS Super FREQ

Maybe I'm missing something, but I think there is an easier way.

Turn on the FISHER option and use ODS output to save the FisherPearsonCorr table.

The FisherPearsonCorr table is stored in "long" format instead of "wide" format. Thus it is trivial to count the significant p-values and to know which are associated with positive and negative correlations. No arrays required.

proc corr data=sashelp.cars outp=out fisher;
   ods output FisherPearsonCorr=fisher;
run;


proc print data=Fisher(obs=5);
var Var WithVar Corr pValue;
run;
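
As a rough sketch of the counting step on the long table, one option is a single PROC SQL query. It relies on the Corr and pValue columns printed above, and the 0.05 cutoff is taken from the earlier posts. Note that these p-values come from Fisher's z transformation, so they may differ slightly from the ones in the PearsonCorr table.

```sas
proc sql;
   /* A logical expression evaluates to 1 or 0, so SUM counts the matches. */
   select sum(not missing(pValue) and pValue < 0.05 and Corr > 0) as sumpos,
          sum(not missing(pValue) and pValue < 0.05 and Corr < 0) as sumneg
   from fisher;
quit;
```

The explicit missing() check is there because SAS treats a missing value as smaller than any number, so a missing p-value would otherwise pass the "< 0.05" test.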
