BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JAR
Obsidian | Level 7 JAR
Obsidian | Level 7

Hi,

I have two datasets

Original
Date x y
15-08-2021 1 7
16-08-2021 2 4
17-08-2021 3 3
18-08-2021 4 3
     
     
Final
Date x y
15-08-2021 8 6
16-08-2021 7 3
17-08-2021 6 2
18-08-2021 4 1

 

I would like to find the correlation between x of original and x of final. Similarly for each corresponding pair. 

Please advise me how to do this. 

 

Jijil Ramakrishnan

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

You can use a double transpose to put a prefix in front of the variables if your variable names are not so long that additional characters won't make the names too long.

Example adding Final_ as prefix:

data junk;
  input a $ x y;
datalines;
a  1 2
b  11 22
c  111 222
;

Proc transpose data=junk  out=trans
   ;
   by a;
run;

proc transpose data=trans out=finalwant (drop=_name_)
      Prefix=Final_;
   by a;
   id  _name_;
   var col1;
run;

This will only rename numeric variables which for Proc Corr is sufficient because you cannot calculate correlation for character valued variables.

You could do the same thing, obviously using a different output name for the second Proc Transpose with a different prefix for the "original" data.

The merge would not need the rename step now.

In the Proc Corr you could use a prefix variable list in the Var and With statements. Assuming you used "Old_" as the prefix for the original data you would use

 

Proc corr data=merged;
   var Old_: ;
   with Final_: ;
run;

The construct Old_:  with the colon immediately after a prefix tells SAS to use all the variables that start with that prefix almost any place a list of variables can appear.

 

 

 

 

View solution in original post

5 REPLIES 5
ballardw
Super User

@JAR wrote:

Hi,

I have two datasets

Original
Date x y
15-08-2021 1 7
16-08-2021 2 4
17-08-2021 3 3
18-08-2021 4 3
     
     
Final
Date x y
15-08-2021 8 6
16-08-2021 7 3
17-08-2021 6 2
18-08-2021 4 1

 

I would like to find the correlation between x of original and x of final. Similarly for each corresponding pair. 

Please advise me how to do this. 

 

Jijil Ramakrishnan


I think you are looking for something like:

/* assumes 1) no duplicates of date
   2) both the data sets are sorted by date
*/
data need;
   merge original
         final (rename=(x=xfinal y=yfinal))
   ;
   by date;
run;

proc corr data=need;
   var x y;
   with xfinal yfinal;
run;

However if you have multiples for the date variable in both data sets you need to provide a more complete example of both data sets.

JAR
Obsidian | Level 7 JAR
Obsidian | Level 7
Thanks this helps. However, I have over 100 variables. Is there any shortcut to
1) Rename them all
2) Pair them up in proc corr
ballardw
Super User

You can use a double transpose to put a prefix in front of the variables if your variable names are not so long that additional characters won't make the names too long.

Example adding Final_ as prefix:

data junk;
  input a $ x y;
datalines;
a  1 2
b  11 22
c  111 222
;

Proc transpose data=junk  out=trans
   ;
   by a;
run;

proc transpose data=trans out=finalwant (drop=_name_)
      Prefix=Final_;
   by a;
   id  _name_;
   var col1;
run;

This will only rename numeric variables which for Proc Corr is sufficient because you cannot calculate correlation for character valued variables.

You could do the same thing, obviously using a different output name for the second Proc Transpose with a different prefix for the "original" data.

The merge would not need the rename step now.

In the Proc Corr you could use a prefix variable list in the Var and With statements. Assuming you used "Old_" as the prefix for the original data you would use

 

Proc corr data=merged;
   var Old_: ;
   with Final_: ;
run;

The construct Old_:  with the colon immediately after a prefix tells SAS to use all the variables that start with that prefix almost any place a list of variables can appear.

 

 

 

 

Tom
Super User Tom
Super User

Do you really want to "correlate" them?  Or just check for differences?

If the later look at PROC COMPARE

proc compare data=original compare=final;
  id date;
run;
JAR
Obsidian | Level 7 JAR
Obsidian | Level 7
I do want to do statistical correlation. Thanks for your help.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1435 views
  • 3 likes
  • 3 in conversation