BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JAR
Obsidian | Level 7 JAR
Obsidian | Level 7

Hi,

I have two datasets

Original
Date x y
15-08-2021 1 7
16-08-2021 2 4
17-08-2021 3 3
18-08-2021 4 3
     
     
Final
Date x y
15-08-2021 8 6
16-08-2021 7 3
17-08-2021 6 2
18-08-2021 4 1

 

I would like to find the correlation between x of original and x of final. Similarly for each corresponding pair. 

Please advise me how to do this. 

 

Jijil Ramakrishnan

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

You can use a double transpose to put a prefix in front of the variables if your variable names are not so long that additional characters won't make the names too long.

Example adding Final_ as prefix:

data junk;
  input a $ x y;
datalines;
a  1 2
b  11 22
c  111 222
;

Proc transpose data=junk  out=trans
   ;
   by a;
run;

proc transpose data=trans out=finalwant (drop=_name_)
      Prefix=Final_;
   by a;
   id  _name_;
   var col1;
run;

This will only rename numeric variables which for Proc Corr is sufficient because you cannot calculate correlation for character valued variables.

You could do the same thing, obviously using a different output name for the second Proc Transpose with a different prefix for the "original" data.

The merge would not need the rename step now.

In the Proc Corr you could use a prefix variable list in the Var and With statements. Assuming you used "Old_" as the prefix for the original data you would use

 

Proc corr data=merged;
   var Old_: ;
   with Final_: ;
run;

The construct Old_:  with the colon immediately after a prefix tells SAS to use all the variables that start with that prefix almost any place a list of variables can appear.

 

 

 

 

View solution in original post

5 REPLIES 5
ballardw
Super User

@JAR wrote:

Hi,

I have two datasets

Original
Date x y
15-08-2021 1 7
16-08-2021 2 4
17-08-2021 3 3
18-08-2021 4 3
     
     
Final
Date x y
15-08-2021 8 6
16-08-2021 7 3
17-08-2021 6 2
18-08-2021 4 1

 

I would like to find the correlation between x of original and x of final. Similarly for each corresponding pair. 

Please advise me how to do this. 

 

Jijil Ramakrishnan


I think you are looking for something like:

/* assumes 1) no duplicates of date
   2) both the data sets are sorted by date
*/
data need;
   merge original
         final (rename=(x=xfinal y=yfinal))
   ;
   by date;
run;

proc corr data=need;
   var x y;
   with xfinal yfinal;
run;

However if you have multiples for the date variable in both data sets you need to provide a more complete example of both data sets.

JAR
Obsidian | Level 7 JAR
Obsidian | Level 7
Thanks this helps. However, I have over 100 variables. Is there any shortcut to
1) Rename them all
2) Pair them up in proc corr
ballardw
Super User

You can use a double transpose to put a prefix in front of the variables if your variable names are not so long that additional characters won't make the names too long.

Example adding Final_ as prefix:

data junk;
  input a $ x y;
datalines;
a  1 2
b  11 22
c  111 222
;

Proc transpose data=junk  out=trans
   ;
   by a;
run;

proc transpose data=trans out=finalwant (drop=_name_)
      Prefix=Final_;
   by a;
   id  _name_;
   var col1;
run;

This will only rename numeric variables which for Proc Corr is sufficient because you cannot calculate correlation for character valued variables.

You could do the same thing, obviously using a different output name for the second Proc Transpose with a different prefix for the "original" data.

The merge would not need the rename step now.

In the Proc Corr you could use a prefix variable list in the Var and With statements. Assuming you used "Old_" as the prefix for the original data you would use

 

Proc corr data=merged;
   var Old_: ;
   with Final_: ;
run;

The construct Old_:  with the colon immediately after a prefix tells SAS to use all the variables that start with that prefix almost any place a list of variables can appear.

 

 

 

 

Tom
Super User Tom
Super User

Do you really want to "correlate" them?  Or just check for differences?

If the later look at PROC COMPARE

proc compare data=original compare=final;
  id date;
run;
JAR
Obsidian | Level 7 JAR
Obsidian | Level 7
I do want to do statistical correlation. Thanks for your help.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1466 views
  • 3 likes
  • 3 in conversation