Krishnam
Calcite | Level 5
I compute a Jaccard score in R to measure the similarity/dissimilarity of two strings.
I tried to replicate this in SAS but could not.
Is there a function or some other way to get the Jaccard score in SAS for
comparing the two strings "Krishna" and "Krishna Reddy"?

I tried PROC DISTANCE in SAS, but with no luck.

In R:
library(stringdist)
stringdist('krishna', 'krishna reddy', method='jaccard')

The result is 0.3636.
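For anyone checking the number by hand: the value 0.3636 is consistent with a Jaccard distance taken over the sets of distinct single characters in each string (stringdist's default q-gram size is 1, as far as I know). The following is a minimal Python sketch of that calculation, not a SAS solution and not stringdist's actual implementation. The function name char_jaccard_distance is made up for illustration, and treating two empty strings as distance 0 is an assumption.

```python
def char_jaccard_distance(a, b):
    """Jaccard distance between the sets of distinct characters in a and b."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Both strings empty: treated as identical here (an assumption).
        return 0.0
    return 1 - len(set_a & set_b) / len(union)


# 'krishna' has 7 distinct characters; 'krishna reddy' adds ' ', 'e', 'd', 'y',
# so the intersection has 7 elements and the union has 11.
print(round(char_jaccard_distance("krishna", "krishna reddy"), 4))  # 0.3636
```

The same logic (collect distinct characters, count intersection and union) should be portable to a SAS DATA step, although the handling of blanks would need care there.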

I am specifically looking for the Jaccard distance only.

Appreciate any help!
PGStats
Opal | Level 21

What is the meaning of the Jaccard distance between strings in R? Is it based on the presence/absence of letters, words, or sounds in the strings?

SAS has many specialized functions for computing the distance between strings: COMPGED, COMPLEV, SOUNDEX, SPEDIS, as well as CALL COMPCOST.

PG
Krishnam
Calcite | Level 5
Hi,

Here is an illustration with an example.
Say you have two strings, 'abcde' and 'abdcde'. I split each of them into two-character combinations (spaces included), in order, and flag whether each combination occurs in string v1 (abcde) and in string v2 (abdcde).

     ab  bc  cd  de  dc  bd
V1    1   1   1   1   0   0
V2    1   0   1   1   1   1

v1 intersection v2 = 3
v1 union v2 = 6, so
my score is 1 - 3/6 = 0.5
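The two-character illustration above can be sketched in a few lines of Python. This is only a sketch of the arithmetic under stated assumptions, not SAS code: the helper names qgrams and qgram_jaccard_distance are hypothetical, each two-character combination is counted once per string (presence/absence, as in the table), and two strings with no combinations at all are given distance 0.

```python
def qgrams(s, q=2):
    """Set of distinct q-character substrings of s, spaces included."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def qgram_jaccard_distance(a, b, q=2):
    """1 - |intersection| / |union| over the q-gram sets of a and b."""
    grams_a, grams_b = qgrams(a, q), qgrams(b, q)
    union = grams_a | grams_b
    if not union:
        # Neither string is long enough to form a q-gram (an assumption: distance 0).
        return 0.0
    return 1 - len(grams_a & grams_b) / len(union)


# v1 = {ab, bc, cd, de}, v2 = {ab, bd, dc, cd, de}
# intersection = {ab, cd, de} (3), union has 6 elements
print(qgram_jaccard_distance("abcde", "abdcde"))  # 0.5
```

Calling the same function with q=1 gives the single-character variant, so the choice of q is the main thing to pin down before porting the calculation to SAS.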
AnnaBrown
Community Manager

Hi Krishnam,

 

I've moved your post to the Text Analytics Community so that more experts may be able to help out.

 

Anna

