I have 2 years of stock market time series data consisting of stock returns. I average the returns on a monthly basis and then use K-means clustering to get clusters for each month, so for the first year I perform the cluster analysis 12 times. However, because stock performance is volatile, stocks get assigned to different clusters from month to month. How can I identify all the stocks that remain together in one cluster through the whole 12-month period?
Could you please recommend the best method to find the most stable cluster over a period of 12 months?
Regards,
Amol Deshmukh
Assuming you have a SAS/OR licence, you can do this:
/* Random assignment of 20 stocks to 4 clusters over 3 months */
data have;
    call streaminit(56456);
    do stock = 1 to 20;
        do month = 1 to 3;
            clust = rand("integer", 4);
            output;
        end;
    end;
run;

/* List the stocks that share the same cluster each month */
proc sql;
    create table pairs as
    select
        a.month,
        a.stock as stock1,
        b.stock as stock2
    from
        have as a inner join
        have as b on a.month=b.month and a.clust=b.clust and a.stock < b.stock;

    /* Keep only the pairs that share a cluster in every month */
    create table stablePairs as
    select
        stock1, stock2
    from pairs
    group by stock1, stock2
    having count(*) = (select count(distinct month) from have);
quit;

/* Find the groups of stocks that are always found together */
/* Note: proc optnet is part of SAS/OR */
proc optnet data_links=stablePairs direction=undirected;
    data_links_var from=stock1 to=stock2;
    clique out=stableGroups(rename=(clique=group node=stock));
run;

/* For example, stocks 4, 7 and 10 are always found in the same cluster */
proc print noobs data=stableGroups; run;
group  stock
1      3
1      6
2      4
2      7
2      10
3      5
3      15
4      17
4      18
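If SAS/OR is not available, the same groups can probably be found with Base SAS alone. Two stocks are together in every month exactly when their full sequences of monthly cluster labels are identical, so it should be enough to build one "signature" string per stock and group on it. The following is only a sketch under stated assumptions: it reuses the HAVE data set from the example above, it assumes one row per stock and month, and the names haveSorted, signatures, signature and stableGroups2 are made up for illustration.

```sas
/* Sketch: group stocks by their full sequence of monthly cluster labels. */
/* Assumes HAVE (stock, month, clust) with one row per stock and month.   */
proc sort data=have out=haveSorted;
    by stock month;
run;

/* Build one signature per stock, e.g. "1:3 2:1 3:4" (month:cluster) */
data signatures;
    length signature $400;   /* assumed long enough to hold all months */
    retain signature;
    set haveSorted;
    by stock;
    if first.stock then signature = "";
    signature = catx(" ", signature, catx(":", month, clust));
    if last.stock then output;
    keep stock signature;
run;

/* Stocks sharing a signature were in the same cluster every month; */
/* keep only signatures shared by at least two stocks               */
proc sql;
    create table stableGroups2 as
    select signature, stock
    from signatures
    group by signature
    having count(*) > 1
    order by signature, stock;
quit;

proc print noobs data=stableGroups2; run;
```

The comparison is always made within a month, so it does not matter that K-means cluster numbers are arbitrary from one month to the next. PROC SQL will write a note about remerging summary statistics, which is expected here. If some stocks are missing in some months, their signatures will not match those of stocks with complete data, so they would need separate handling.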