BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
xyxu
Quartz | Level 8

I have a data set of daily prices for each ID that looks like the following example. 

data have;
	input id $1. date :yymmdd10. price 5.;
	format date yymmdd10.;
	datalines;
A 2000-01-02 3
A 2000-01-03 4
A 2000-01-04 6
A 2000-01-05 4
A 2000-01-06 7
A 2000-01-07 8
A 2000-01-08 9
A 2000-01-09 6
A 2000-01-10 7
A 2000-01-11 5
A 2000-01-12 6
A 2000-01-13 5
B 2000-01-02 4
B 2000-01-03 6
B 2000-01-04 7
B 2000-01-05 8
B 2000-01-06 9
B 2000-01-07 10
B 2000-01-08 12
B 2000-01-09 11
B 2000-01-10 9
B 2000-01-11 8
B 2000-01-12 9
B 2000-01-13 7
C 2000-01-02 3
C 2000-01-03 2
C 2000-01-04 4
C 2000-01-05 3
C 2000-01-06 4
C 2000-01-07 5
C 2000-01-08 4
C 2000-01-09 3
C 2000-01-10 4
C 2000-01-11 6
C 2000-01-12 6
C 2000-01-13 5
;
run;

data have; set have; week = intnx('week', date, 0, 'e'); format week yymmdd10.; run;

I want to calculate correlation between every pair of IDs' prices: i.e., correlation between prices of (A,B), (B,C), (A,C), over each week. The result I want looks like the following:

data want;
	input id1 $ id2 $ week :yymmdd10. corr 5.;
	format week yymmdd10.;
	datalines;
A B 2000-01-08 0.3
A C 2000-01-08 0.5
B C 2000-01-08 0.4
A B 2000-01-15 0.2
A C 2000-01-15 0.4
B C 2000-01-15 0.5
;
run;

Is there a way to do this efficiently, especially when I have A, B, C, ..., Z and many weeks?

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

Ok, let's give this a try. You will have to change this code in the third row, so instead of a--c you have a valid SAS list of variable names for your problem, whatever they are, and that they are also in alphabetical order for the code to work. Naturally, we can program them to be in alphabetical order if that's not an easy thing for you to do manually. EDIT: ignore the alphabetical order comment, the code works even on variable lists where the variables are NOT in alphabetical order.

 

data want;
    set corr;
    array x(*) a--c;
    do j=1 to dim(x);
        thisvar=vname(x(j));
        if thisvar>_name_ then do;
             corr=vvaluex(thisvar);
             output;
        end;
    end;
    keep week _name_ thisvar corr;
run;

 

--
Paige Miller

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

Correlations in SAS are defined between variables, not between rows of the data set. So you will have to transpose the data so that you have variables A B and C, and not rows that represent A B and C.

 

proc sort data=have;
	by date;
run;
proc transpose data=have out=have_t;
	by date;
	var price;
	id id;
run;
data have_t;
	set have_t;
	week = intnx('week', date, 0, 'e');
	format week mmddyy10.;
run;
proc corr data=have_t;
	var a b c;
	by week;
run;
--
Paige Miller
xyxu
Quartz | Level 8

When I do

proc corr data = have_t out = corr (where = (_type_ = 'CORR')) noprint;
	var a b c;
	by week;
run;

I get a correlation matrix. Is there a simple way to conver the matrix to what I want?

PaigeMiller
Diamond | Level 26

@xyxu wrote:

 

proc corr data = have_t out = corr (where = (_type_ = 'CORR')) noprint;
	var a b c;
	by week;
run;

I get a correlation matrix. Is there a simple way to conver the matrix to what I want?


I don't know what you mean. I don't know what you want, or why the above code doesn't meet your needs.

--
Paige Miller
xyxu
Quartz | Level 8

Sorry if my question wasn't clear. In my original question, there is a dataset "want", which describes my goal. I was wondering how to convert the correlation matrix to that dataset.

PaigeMiller
Diamond | Level 26

It has to be that form, and no other form?

 

Let me ask another real-world question ... are there only three variables A B C or does the real-world problem you are working on have many more than three variables?

--
Paige Miller
xyxu
Quartz | Level 8

Yeah I am trying to define a pair-level identifier (e.g., "AB", "AC", "BC") and use the pairwise correlation for later steps. In the real-world problem, there are around 20 IDs.

PaigeMiller
Diamond | Level 26

Ok, let's give this a try. You will have to change this code in the third row, so instead of a--c you have a valid SAS list of variable names for your problem, whatever they are, and that they are also in alphabetical order for the code to work. Naturally, we can program them to be in alphabetical order if that's not an easy thing for you to do manually. EDIT: ignore the alphabetical order comment, the code works even on variable lists where the variables are NOT in alphabetical order.

 

data want;
    set corr;
    array x(*) a--c;
    do j=1 to dim(x);
        thisvar=vname(x(j));
        if thisvar>_name_ then do;
             corr=vvaluex(thisvar);
             output;
        end;
    end;
    keep week _name_ thisvar corr;
run;

 

--
Paige Miller
xyxu
Quartz | Level 8
This works amazingly well, despite that I am not 100% sure why it works.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2359 views
  • 3 likes
  • 2 in conversation