Hi,
I have a table (matrix) like this:
| id | year | var1 | var2 | var3 | …. | var300 |
| 1 | 1997 | 3 | 4 | 5 | 6 | |
| 1 | 1998 | 5 | 2 | 1 | 3 | |
| …… | …… | …… | …… | …… | …… | …… |
| 1 | 2007 | 5 | 3 | 6 | 2 | |
| 2 | 1997 | 1 | 1 | 2 | 0 | |
| …… | …… | …… | …… | …… | …… | …… |
| 2 | 2007 | 3 | 1 | 6 | 0 | |
| 3 | 1997 | 2 | 4 | 5 | 4 | |
| …… | …… | …… | …… | …… | …… | …… |
| 3 | 2006 | 0 | 4 | 3 | 4 | |
| …… | …… | …… | …… | …… | …… | …… |
| 5000 | 1997 | 0 | 0 | 2 | 6 | |
| …… | …… | …… | …… | …… | …… | …… |
| 5000 | 2006 | 3 | 1 | 2 | 6 | |
That is, I have many observations (ids running up to 5000, each with one row per year from 1997 to 2006 or 2007) and 300 variables.
Ideally, I want to calculate the pairwise cosine similarity between every pair of ids within the same year and output the result like this:
| id1 | id2 | year | distance |
| 1 | 2 | 1997 | xx |
| 1 | 3 | 1997 | xx |
| … | … | … | |
| 1 | 5000 | 2006 | xx |
| 2 | 1 | 1997 | xx |
| … | … | … | |
| 2 | 5000 | 2006 | xx |
| … | … | … | … |
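To pin down the calculation behind the output table: the cosine similarity of two rows x and y is dot(x, y) / (|x| * |y|), computed for every ordered pair of distinct ids that share a year. The following is a minimal sketch in Python, not SAS, chosen only to make the arithmetic explicit. The function names `cosine_similarity` and `pairwise_by_year` are hypothetical, the toy input uses just var1-var3 of the 1997 rows for ids 1, 2 and 3 from the sample table, and returning `None` for an all-zero row is an assumption about how undefined cases should be handled.

```python
import math
from collections import defaultdict
from itertools import permutations


def cosine_similarity(x, y):
    """Return dot(x, y) / (|x| * |y|), or None if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return None  # assumption: treat the undefined case as missing
    return dot / (norm_x * norm_y)


def pairwise_by_year(rows):
    """rows: iterable of (id, year, values) -> list of (id1, id2, year, similarity)."""
    by_year = defaultdict(list)
    for id_, year, values in rows:
        by_year[year].append((id_, values))

    out = []
    for year, members in by_year.items():
        # permutations(..., 2) yields both (1, 2) and (2, 1), as in the desired output
        for (id1, x), (id2, y) in permutations(members, 2):
            out.append((id1, id2, year, cosine_similarity(x, y)))

    # Order by id1, then year, then id2, following the layout of the desired output
    out.sort(key=lambda r: (r[0], r[2], r[1]))
    return out


# Toy input: var1-var3 only, taken from the 1997 rows of the sample table
rows = [
    (1, 1997, [3, 4, 5]),
    (2, 1997, [1, 1, 2]),
    (3, 1997, [2, 4, 5]),
]

result = pairwise_by_year(rows)
for id1, id2, year, sim in result:
    print(id1, id2, year, round(sim, 4))
```

Because both (id1, id2) and (id2, id1) are listed, each similarity appears twice. With about 5,000 ids that is 5000 * 4999 = 24,995,000 output rows per year, so keeping only the pairs with id1 < id2 would halve the output if the duplicates are not needed.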
I have been exploring PROC DISTANCE and PROC IML but have not figured it out yet. I would appreciate it very much if someone could help me out here.
Thanks!
Calling @Rick_SAS