Krishnam
Calcite | Level 5
I have computed a Jaccard score in R to check the similarity/dissimilarity of two strings.
I tried to replicate the same in SAS but couldn't achieve it.
Can you please let me know if there is a function or another way to get the Jaccard score in SAS for
comparing the two strings "Krishna" and "Krishna Reddy"?

I tried to replicate it in SAS with PROC DISTANCE but had no luck.

In R:
library(stringdist)
stringdist('krishna', 'krishna reddy', method='jaccard')

The result is 0.3636.

I am specifically looking for the Jaccard distance only.

Appreciate any help!
3 REPLIES
PGStats
Opal | Level 21

What is the meaning of the Jaccard distance between strings in R? Is it based on the presence/absence of letters, words, or sounds in the strings?

SAS has many specialized functions for computing the distance between strings: COMPGED, COMPLEV, SOUNDEX, SPEDIS, as well as CALL COMPCOST.

PG
Krishnam
Calcite | Level 5
Hi,

Here is an illustration with an example.
Say you have two strings, 'abcde' and 'abdcde'. I split each string into two-character combinations (including spaces), taken in order, and flag whether each combination occurs in string v1 ('abcde') and in string v2 ('abdcde').

     ab  bc  cd  de  dc  bd
V1    1   1   1   1   0   0
V2    1   0   1   1   1   1

v1 intersection v2 = 3
v1 union v2 = 6
so my score is 1 - 3/6 = 0.5
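
The calculation above can be sketched in a few lines of Python, as a language-neutral reference that could be ported to a SAS DATA step. This is only a sketch under stated assumptions: the helper names `qgrams` and `jaccard_distance` are hypothetical, the score is taken over the sets of distinct q-character substrings, and two strings with no q-grams at all are treated as identical. With q = 2 it reproduces the 0.5 from the table, and with q = 1 (single characters) it gives the 0.3636 reported for 'krishna' vs 'krishna reddy', which suggests the R call was comparing single characters rather than pairs.

```python
def qgrams(text, q):
    """Return the set of distinct substrings of length q in text."""
    return {text[i:i + q] for i in range(len(text) - q + 1)}


def jaccard_distance(a, b, q=2):
    """Jaccard distance between the q-gram sets of two strings."""
    grams_a, grams_b = qgrams(a, q), qgrams(b, q)
    union = grams_a | grams_b
    if not union:
        # Assumption: strings too short to yield any q-gram count as identical.
        return 0.0
    return 1 - len(grams_a & grams_b) / len(union)


# Two-character example from the table: 3 shared pairs out of 6 distinct pairs.
print(jaccard_distance("abcde", "abdcde", q=2))  # 0.5

# Single characters: 7 shared out of 11 distinct characters (space included).
print(round(jaccard_distance("krishna", "krishna reddy", q=1), 4))  # 0.3636
```

The choice of q matters: the same pair of strings gives a different distance for single characters than for pairs, so any SAS port needs to fix q to match the R result being replicated.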
AnnaBrown
Community Manager

Hi Krishnam,

I've moved your post to the Text Analytics Community so that more experts may be able to help out.

Anna

