Rajaram
Obsidian | Level 7

I am trying to measure the similarity between two SAS programs (i.e., how much of one was copied from the other), in the way that MOSS (Measure of Software Similarity), JPlag, and CodeMatch do for other languages. None of these tools supports the SAS programming language. I have found three papers on the internet: the first is based on Python (very few details are available: Ensuring Programming Integrity with Python - PharmaSUG), the second is written in SAS (Upholding Ethics and Integrity: A macro-based approach to detect plagiarism in programming) but gives little information, and the third uses PROC GROOVY (Be wise, plagiarize

 

I have also tried combining the SOUNDEX and COMPGED functions, but that did not help (Reference).

 

Is there any code available that I can use, or can someone help me with the steps to identify similarity?

 

I need to get the following details:

  1. Similarity scores
  2. Matched code lines (exact match: COMPARE function, or the diff command in UNIX)
  3. Matched code lines (fuzzy match / lexical similarity)
  4. Matched blocks (exact match: COMPARE function, or the diff command in UNIX)
  5. Matched blocks (fuzzy match / lexical similarity)
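
For illustration only, here is a minimal sketch of how the five outputs listed above could be produced. It is written in Python with the standard-library difflib module, not in SAS, and it is not the algorithm used by MOSS, JPlag, or any of the papers mentioned. The function names (prepare, group_runs, compare_programs), the 0.8 fuzzy cutoff, and the two sample programs are assumptions made for this example. Lines are normalized first (lowercased, whitespace collapsed, /* ... */ comments and blank lines dropped), so the reported indices refer to the normalized lines, not to the original line numbers.

```python
import difflib
import re


def prepare(source):
    """Strip /* ... */ comments, lowercase, collapse whitespace, drop blank lines.

    Simplification (assumption): '* ...;' statement comments and comment
    markers inside quoted strings are not handled.
    """
    no_comments = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    lines = (re.sub(r"\s+", " ", ln.strip().lower()) for ln in no_comments.splitlines())
    return [ln for ln in lines if ln]


def group_runs(pairs):
    """Merge (i, j) line pairs that are consecutive in both programs into (i, j, size) blocks."""
    runs = []
    for i, j in sorted(pairs):
        if runs and runs[-1][0] + runs[-1][2] == i and runs[-1][1] + runs[-1][2] == j:
            runs[-1] = (runs[-1][0], runs[-1][1], runs[-1][2] + 1)
        else:
            runs.append((i, j, 1))
    return runs


def compare_programs(src_a, src_b, fuzzy_cutoff=0.8):
    a, b = prepare(src_a), prepare(src_b)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)

    # Items 4 and 2: exact blocks are runs of identical lines; exact lines are their members.
    exact_blocks = [(m.a, m.b, m.size) for m in matcher.get_matching_blocks() if m.size]
    exact_lines = [(i + k, j + k) for i, j, n in exact_blocks for k in range(n)]

    # Item 3: pair each unmatched line of A with its most similar unmatched line of B.
    matched_a = {i for i, _ in exact_lines}
    matched_b = {j for _, j in exact_lines}
    free_b = [j for j in range(len(b)) if j not in matched_b]
    fuzzy_lines = []
    for i in range(len(a)):
        if i in matched_a or not free_b:
            continue
        ratio, j = max(
            (difflib.SequenceMatcher(None, a[i], b[j], autojunk=False).ratio(), j)
            for j in free_b
        )
        if ratio >= fuzzy_cutoff:
            fuzzy_lines.append((i, j, round(ratio, 3)))

    # Item 5: fuzzy blocks are runs built from exact and fuzzy line pairs together.
    fuzzy_blocks = group_runs(exact_lines + [(i, j) for i, j, _ in fuzzy_lines])

    return {
        "score": matcher.ratio(),  # Item 1: 2 * matched lines / total lines
        "exact_lines": exact_lines,
        "fuzzy_lines": fuzzy_lines,
        "exact_blocks": exact_blocks,
        "fuzzy_blocks": fuzzy_blocks,
    }


# Hypothetical sample programs: B is A with one variable renamed and cosmetic changes.
PROG_A = """
data work.out;   /* copy input */
  set work.in;
  total = price * qty;
run;
proc print data=work.out;
run;
"""

PROG_B = """
DATA work.out;
  SET work.in;
  total_amt = price * qty;
run;

proc print data=work.out;
run;
"""

if __name__ == "__main__":
    for key, value in compare_programs(PROG_A, PROG_B).items():
        print(key, value)
```

In this sample the renamed variable splits the exact matches into two blocks, while the fuzzy pass joins them back into a single block covering the whole program. The cutoff would need tuning on real code, and a more robust approach would probably compare token streams (with identifiers replaced by placeholders) instead of raw lines.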

 

 

2 REPLIES
Rajaram
Obsidian | Level 7

Thank you @Ksharp. I had already looked at the links you provided, as well as GitHub and https://lexjansen.com/, before posting to this forum. Today I received an email from PharmaSUG saying that they have published the 2020 papers. I will look into those papers as well.
