Hi,
I have two datasets
Original | ||
Date | x | y |
15-08-2021 | 1 | 7 |
16-08-2021 | 2 | 4 |
17-08-2021 | 3 | 3 |
18-08-2021 | 4 | 3 |
Final | ||
Date | x | y |
15-08-2021 | 8 | 6 |
16-08-2021 | 7 | 3 |
17-08-2021 | 6 | 2 |
18-08-2021 | 4 | 1 |
I would like to find the correlation between x of original and x of final. Similarly for each corresponding pair.
Please advise me how to do this.
Jijil Ramakrishnan
You can use a double transpose to put a prefix in front of the variables if your variable names are not so long that additional characters won't make the names too long.
Example adding Final_ as prefix:
data junk; input a $ x y; datalines; a 1 2 b 11 22 c 111 222 ; Proc transpose data=junk out=trans ; by a; run; proc transpose data=trans out=finalwant (drop=_name_) Prefix=Final_; by a; id _name_; var col1; run;
This will only rename numeric variables which for Proc Corr is sufficient because you cannot calculate correlation for character valued variables.
You could do the same thing, obviously using a different output name for the second Proc Transpose with a different prefix for the "original" data.
The merge would not need the rename step now.
In the Proc Corr you could use a prefix variable list in the Var and With statements. Assuming you used "Old_" as the prefix for the original data you would use
Proc corr data=merged; var Old_: ; with Final_: ; run;
The construct Old_: with the colon immediately after a prefix tells SAS to use all the variables that start with that prefix almost any place a list of variables can appear.
@JAR wrote:
Hi,
I have two datasets
Original Date x y 15-08-2021 1 7 16-08-2021 2 4 17-08-2021 3 3 18-08-2021 4 3 Final Date x y 15-08-2021 8 6 16-08-2021 7 3 17-08-2021 6 2 18-08-2021 4 1
I would like to find the correlation between x of original and x of final. Similarly for each corresponding pair.
Please advise me how to do this.
Jijil Ramakrishnan
I think you are looking for something like:
/* assumes 1) no duplicates of date 2) both the data sets are sorted by date */ data need; merge original final (rename=(x=xfinal y=yfinal)) ; by date; run; proc corr data=need; var x y; with xfinal yfinal; run;
However if you have multiples for the date variable in both data sets you need to provide a more complete example of both data sets.
You can use a double transpose to put a prefix in front of the variables if your variable names are not so long that additional characters won't make the names too long.
Example adding Final_ as prefix:
data junk; input a $ x y; datalines; a 1 2 b 11 22 c 111 222 ; Proc transpose data=junk out=trans ; by a; run; proc transpose data=trans out=finalwant (drop=_name_) Prefix=Final_; by a; id _name_; var col1; run;
This will only rename numeric variables which for Proc Corr is sufficient because you cannot calculate correlation for character valued variables.
You could do the same thing, obviously using a different output name for the second Proc Transpose with a different prefix for the "original" data.
The merge would not need the rename step now.
In the Proc Corr you could use a prefix variable list in the Var and With statements. Assuming you used "Old_" as the prefix for the original data you would use
Proc corr data=merged; var Old_: ; with Final_: ; run;
The construct Old_: with the colon immediately after a prefix tells SAS to use all the variables that start with that prefix almost any place a list of variables can appear.
Do you really want to "correlate" them? Or just check for differences?
If the later look at PROC COMPARE
proc compare data=original compare=final;
id date;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.