Krishnam
Calcite | Level 5
I have computed a Jaccard score in R to measure the similarity/dissimilarity between two strings.
I tried to replicate this in SAS but couldn't.
Is there a function, or some other way, to get the Jaccard score in SAS for
comparing the two strings "Krishna" and "Krishna Reddy"?

I tried PROC DISTANCE in SAS, but had no luck.

In R:
library(stringdist)
stringdist('krishna', 'krishna reddy', method='jaccard')

The result is 0.3636.

I am specifically looking for the Jaccard distance only.

Appreciate any help!
3 REPLIES
PGStats
Opal | Level 21

What is the meaning of the Jaccard distance between strings in R? Is it based on the presence/absence of letters, words, or sounds in the strings?

SAS has many specialized functions for computing the distance between strings: COMPGED, COMPLEV, SOUNDEX, SPEDIS, as well as CALL COMPCOST.

PG
Krishnam
Calcite | Level 5
Hi,

Here is an illustration with an example.
Say you have two strings, 'abcde' and 'abdcde'. I split each one into overlapping two-character combinations (spaces included), in order, and flag whether each combination occurs in string v1 (abcde) and in string v2 (abdcde).

      ab  bc  cd  de  dc  bd
V1     1   1   1   1   0   0
V2     1   0   1   1   1   1

v1 intersection v2 = 3 (ab, cd, de)
v1 union v2 = 6 (ab, bc, cd, de, dc, bd)
So my score is 1 - 3/6 = 0.5.
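
The calculation above can be sketched in a few lines of code. This is only an illustration, written in Python rather than SAS or R so that the logic is easy to check. The function names `qgrams` and `jaccard_distance` are made up for this example and are not part of any SAS or R library. It assumes the score is computed on the sets of distinct q-character substrings, and that two strings with no q-grams at all get a distance of 0. With q = 2 it gives the 0.5 from the table. With q = 1 (single characters, which I believe is the default `q` in stringdist) it gives the 0.3636 from the original question.

```python
def qgrams(s, q):
    """Return the set of distinct length-q substrings of s (spaces included)."""
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def jaccard_distance(a, b, q=1):
    """Jaccard distance 1 - |A & B| / |A | B| over the q-gram sets of a and b."""
    grams_a = qgrams(a, q)
    grams_b = qgrams(b, q)
    union = grams_a | grams_b
    if not union:
        # Assumption: if neither string has any q-grams, treat them as identical.
        return 0.0
    return 1 - len(grams_a & grams_b) / len(union)


# Two-character example from the table: intersection 3, union 6.
print(jaccard_distance("abcde", "abdcde", q=2))                      # 0.5

# Single-character version of the original question: intersection 7, union 11.
print(round(jaccard_distance("krishna", "krishna reddy", q=1), 4))   # 0.3636
```

Note that the comparison is case-sensitive, so "Krishna" and "krishna" would produce different character sets. The same steps (build the two sets, count the intersection and the union) are what a SAS DATA step would need to reproduce.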
AnnaBrown
Community Manager

Hi Krishnam,

I've moved your post to the Text Analytics Community so that more experts may be able to help out.

Anna