Hi,
I have two datasets
Original | ||
Date | x | y |
15-08-2021 | 1 | 7 |
16-08-2021 | 2 | 4 |
17-08-2021 | 3 | 3 |
18-08-2021 | 4 | 3 |
Final | ||
Date | x | y |
15-08-2021 | 8 | 6 |
16-08-2021 | 7 | 3 |
17-08-2021 | 6 | 2 |
18-08-2021 | 4 | 1 |
I would like to find the correlation between x of original and x of final. Similarly for each corresponding pair.
Please advise me how to do this.
Jijil Ramakrishnan
You can use a double transpose to put a prefix in front of the variables if your variable names are not so long that additional characters won't make the names too long.
Example adding Final_ as prefix:
data junk; input a $ x y; datalines; a 1 2 b 11 22 c 111 222 ; Proc transpose data=junk out=trans ; by a; run; proc transpose data=trans out=finalwant (drop=_name_) Prefix=Final_; by a; id _name_; var col1; run;
This will only rename numeric variables which for Proc Corr is sufficient because you cannot calculate correlation for character valued variables.
You could do the same thing, obviously using a different output name for the second Proc Transpose with a different prefix for the "original" data.
The merge would not need the rename step now.
In the Proc Corr you could use a prefix variable list in the Var and With statements. Assuming you used "Old_" as the prefix for the original data you would use
Proc corr data=merged; var Old_: ; with Final_: ; run;
The construct Old_: with the colon immediately after a prefix tells SAS to use all the variables that start with that prefix almost any place a list of variables can appear.
@JAR wrote:
Hi,
I have two datasets
Original Date x y 15-08-2021 1 7 16-08-2021 2 4 17-08-2021 3 3 18-08-2021 4 3 Final Date x y 15-08-2021 8 6 16-08-2021 7 3 17-08-2021 6 2 18-08-2021 4 1
I would like to find the correlation between x of original and x of final. Similarly for each corresponding pair.
Please advise me how to do this.
Jijil Ramakrishnan
I think you are looking for something like:
/* assumes 1) no duplicates of date 2) both the data sets are sorted by date */ data need; merge original final (rename=(x=xfinal y=yfinal)) ; by date; run; proc corr data=need; var x y; with xfinal yfinal; run;
However if you have multiples for the date variable in both data sets you need to provide a more complete example of both data sets.
You can use a double transpose to put a prefix in front of the variables if your variable names are not so long that additional characters won't make the names too long.
Example adding Final_ as prefix:
data junk; input a $ x y; datalines; a 1 2 b 11 22 c 111 222 ; Proc transpose data=junk out=trans ; by a; run; proc transpose data=trans out=finalwant (drop=_name_) Prefix=Final_; by a; id _name_; var col1; run;
This will only rename numeric variables which for Proc Corr is sufficient because you cannot calculate correlation for character valued variables.
You could do the same thing, obviously using a different output name for the second Proc Transpose with a different prefix for the "original" data.
The merge would not need the rename step now.
In the Proc Corr you could use a prefix variable list in the Var and With statements. Assuming you used "Old_" as the prefix for the original data you would use
Proc corr data=merged; var Old_: ; with Final_: ; run;
The construct Old_: with the colon immediately after a prefix tells SAS to use all the variables that start with that prefix almost any place a list of variables can appear.
Do you really want to "correlate" them? Or just check for differences?
If the later look at PROC COMPARE
proc compare data=original compare=final;
id date;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.