ChrisHemedinger
Community Manager

[AI-generated Proc Studley]

Here's the situation: The Adventures of Proc Studley -- a popular sci-fi/fantasy book series -- is under threat from a string of counterfeit versions that have hit the marketplace. Your task is to find the fakes. All of the authentic books have valid 10-digit ISBN numbers, which is how libraries worldwide track books. The counterfeit books have invalid ISBN values.

 

(This challenge first premiered in SAS Analytics Explorers, a special group for customers who want to do more with their SAS learning and earn rewards in the process!)

 

The algorithm for validating ISBN-10 values uses a checksum approach. Here are the steps:

  • Multiply each of the first 9 digits by a number in the descending sequence from 10 to 2, and sum the results.
  • Divide the sum by 11.
  • Subtract the remainder (not the quotient) from 11.
  • For the 10th digit, use the difference from the previous step. If that difference is 11, use the number 0; if 10, use the letter X.
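
As a sanity check of the steps above, here is a minimal sketch (not a golfed entry) that walks through them for a single value from the list further down, 6032522768. The variable names total, diff, expected, and valid are made up for this illustration.

```sas
/* Worked check of one ISBN, following the four steps literally. */
data _null_;
  length expected $ 1;
  isbn = '6032522768';
  total = 0;
  do i = 1 to 9;
    /* weights run 10, 9, ..., 2 for positions 1..9, i.e. 11 - i */
    total + (11 - i) * input(char(isbn, i), 1.);
  end;
  diff = 11 - mod(total, 11);          /* subtract the remainder from 11 */
  if diff = 11 then expected = '0';    /* difference 11 -> check digit 0 */
  else if diff = 10 then expected = 'X';
  else expected = put(diff, 1.);
  valid = (expected = char(isbn, 10)); /* 1 if the 10th character matches */
  put total= diff= expected= valid=;
run;
```

Working by hand, the weighted sum is 60+0+24+14+30+10+8+21+12 = 179, the remainder after dividing by 11 is 3, and 11 - 3 = 8, which matches the last digit, so this one should come out as valid.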

Here are the book titles and their purported ISBN values. Write the shortest possible SAS program that reads the book list, validates each ISBN value, and creates a report of the real and fake books. Include your code and the output in your response!

 

(Note: obviously all of these books are made up, but the ISBN number scheme and algorithm is a real thing! You can check your work ad hoc with the ISBN Checker.)

 

Here's the data, all ready to run in SAS.

 

 

data isbn;
  infile datalines dsd;
  length title $ 70 isbn $ 10;
  input title isbn;
  datalines;
Proc Studley and the Starship of Destiny,0434488665
The Chronicles of Proc Studley: The Lost Realm,2018166516
Proc Studley and the Quantum Key,9405643837
The Legend of Proc Studley: The Celestial Quest,6032522768
Proc Studley and the Enchanted Nebula,4394205952
The Adventures of Proc Studley: The Galactic Rift,2353276079
Proc Studley and the Time Crystal,6493135591
The Saga of Proc Studley: The Forbidden Planet,6776994355
Proc Studley and the Alien Alliance,2227835451
The Epic of Proc Studley: The Cosmic War,8018735913
Proc Studley and the Dragon of Andromeda,0841779538
The Odyssey of Proc Studley: The Stellar Siege,8730652341
Proc Studley and the Phoenix Star,1594122350
The Journey of Proc Studley: The Nebula Nexus,224320418X
Proc Studley and the Shadow Realm,6857923406
The Quest of Proc Studley: The Celestial Citadel,3967006111
Proc Studley and the Interstellar Insurrection,9537581977
The Trials of Proc Studley: The Quantum Paradox,1283514257
Proc Studley and the Martian Rebellion,566485052X
The Legacy of Proc Studley: The Galactic Guardians,6994588902
Proc Studley and the Eternal Eclipse,9236137644
The Chronicles of Proc Studley: The Andromeda Enigma,7649918275
Proc Studley and the Infinite Horizon,458574645X
The Adventures of Proc Studley: The Alien Dawn,7601111520
Proc Studley and the Black Hole Conspiracy,1911465988
The Legend of Proc Studley: The Cosmic Code,266671036X
Proc Studley and the Celestial Shadows,4287030303
The Saga of Proc Studley: The Stellar Saga,2561407012
Proc Studley and the Quantum Quest,6933010252
The Epic of Proc Studley: The Celestial Conflict,4278675463
Proc Studley and the Galactic Gambit,351049735X
The Odyssey of Proc Studley: The Nebula Knights,0566910450
Proc Studley and the Cosmic Crusade,9802436077
The Journey of Proc Studley: The Andromeda Ascension,195432331X
Proc Studley and the Stellar Struggle,8421179217
The Quest of Proc Studley: The Alien Artifact,0979272564
Proc Studley and the Celestial Saga,3584795834
The Trials of Proc Studley: The Galactic Genesis,0713565068
Proc Studley and the Quantum Conundrum,9074601407
The Legacy of Proc Studley: The Cosmic Odyssey,0168786583
;
run;

 

 

 

 

 

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.
A_Kh
Lapis Lazuli | Level 10
data validation;
	set isbn;
	array _nine [9] (10, 9, 8, 7, 6, 5, 4, 3, 2);
	array nine [9];
	do i=1 to 9;
		nine{i}= input(substr(isbn,i, 1), best.)*_nine{i};
	end; 
	_sum=sum(of nine:);
	_remainder= mod(_sum, 11);
	result= 11-_remainder;
	if result eq 10 then isbn_validated= substr(isbn, 1, 9)||'X';
	else if result eq 11 then isbn_validated= substr(isbn, 1, 9)||'0';
	if isbn=isbn_validated;
	keep title isbn:;
run; 
proc print; run; 

Real ones:
[Screenshot: Capture.PNG]
The rest are fake.

ChrisHemedinger
Community Manager

Good attempt @A_Kh , but check your work! I think you've allowed only those that end with '0' or 'X'. There are actually several more valid ISBNs in the collection 😉

A_Kh
Lapis Lazuli | Level 10

@ChrisHemedinger wrote:


The algorithm for validating ISBN-10 values uses a checksum approach. Here are the steps:
....

  • For the 10th digit, if the summed result is 11, use the number 0; if 10, use the letter X.

 


What would the 10th digit be if the summed result is neither 10 nor 11?

ChrisHemedinger
Community Manager

My fault for not making it clear! The final checksum digit is the difference calculated in the previous step. If it's a 2-digit difference: for 11, use 0; for 10, use X. I edited the problem for clarity.

A_Kh
Lapis Lazuli | Level 10

Edited based on the clarification:

data validation;
	set isbn;
	array _nine [9] (10, 9, 8, 7, 6, 5, 4, 3, 2);
	array nine [9];
	do i=1 to 9;
		nine{i}= input(substr(isbn,i, 1), best.)*_nine{i};
	end; 
	_sum=sum(of nine:);
	_remainder= mod(_sum, 11);
	result= 11-_remainder;
	if result eq 10 then checker= substr(isbn, 1, 9)||'X';
	else if result eq 11 then checker= substr(isbn, 1, 9)||'0';
	else checker= cats(substr(isbn, 1, 9), result);
	if isbn=checker then isbn_validated= 'Yes';
	else isbn_validated='No'; 
	if isbn_validated='No'; 
	keep title isbn:;
run; 

proc print; run; 

Fake books (N=19):
[Screenshot: Capture.PNG]

FreelanceReinh
Jade | Level 19

Here's another suggestion:

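data;                          /* no name given: SAS generates one (DATAn) */
set isbn;
s=0;
do j=1 to 9;
  s+j*input(char(isbn,j),1.);  /* weighted sum with weights 1..9 */
end;
m=mod(s,11);                   /* expected check value, 0..10 */
/* j is 10 after the loop; special missing .X prints as "X" with format 1. */
c=put(ifn(m=j,.X,m),1.)=char(isbn,j);
put isbn c;
run;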

The report is only suitable for internal purposes as it contains just the ISBNs and a "correctness flag" c, i.e., c=0 indicates fake ISBNs, c=1 correct ones:

0434488665 0
2018166516 0
9405643837 0
6032522768 1
4394205952 0
2353276079 0
6493135591 1
6776994355 0
2227835451 1
8018735913 1
0841779538 1
8730652341 0
1594122350 1
224320418X 0
6857923406 0
3967006111 0
9537581977 1
1283514257 1
566485052X 0
6994588902 0
9236137644 1
7649918275 0
458574645X 0
7601111520 1
1911465988 1
266671036X 1
4287030303 1
2561407012 1
6933010252 1
4278675463 0
351049735X 1
0566910450 0
9802436077 0
195432331X 1
8421179217 1
0979272564 1
3584795834 1
0713565068 0
9074601407 0
0168786583 1

I used a bit of algebra to simplify the formulas.
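
The algebra presumably runs along these lines (one reading, not spelled out in the post). Write $d_1,\dots,d_9$ for the first nine digits and $c$ for the check value, with X standing for 10. The weight for position $i$ in the original recipe is $11-i$, and $11-i \equiv -i \pmod{11}$, so subtracting from 11 cancels the sign:

```latex
\begin{aligned}
c &\equiv 11 - \sum_{i=1}^{9} (11-i)\,d_i \pmod{11} \\
  &\equiv -\sum_{i=1}^{9} (11-i)\,d_i \pmod{11} \\
  &\equiv \sum_{i=1}^{9} i\,d_i \pmod{11}
\end{aligned}
```

So the check value is the sum of position times digit, taken mod 11, which is what the loop accumulates in s. A remainder of 0 corresponds to the "difference is 11, use 0" rule, so that case needs no special handling.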

 

Of course, the code could be shortened further. For example:

  • At the cost of an even uglier report (but now including the book titles) one could replace "isbn c" by "_all_" in the final PUT statement.
  • An ugly log could be achieved by omitting the INPUT function in the sum statement.
  • The name of the input dataset could also be omitted if the code is submitted directly after the DATA step creating dataset ISBN (risky!).
  • Sacrificing code readability, one could write all statements on the same line, without separating blanks, where possible.

However, the resulting DATA step (containing 108 characters) violates so many coding standards that it must be hidden behind a spoiler.

Spoiler
data;set;s=0;do j=1to 9;s+j*char(isbn,j);end;m=mod(s,11);c=put(ifn(m=j,.X,m),1.)=char(isbn,j);put _all_;run;
Quentin
Super User

I know @yabwon is a fan of SAS Code Golf.  And he gave a couple solutions in the Analytics Explorers thread.  Maybe he'll take a swing at golfing this one.

The Boston Area SAS Users Group is hosting free webinars!
Next up: Joe Madden & Joseph Henry present Putting Power into the Hands of the Programmer with SAS Viya Workbench on Wednesday Nov 6.
Register now at https://www.basug.org/events.
yabwon
Onyx | Level 15

@Quentin thanks for calling me out.

 

@FreelanceReinh thanks for the algebra, I forgot that we are in Z11 🙂

 

If we are already in the realm of "breaking good programming practice", your already "skinny" solution can be boiled down by a few more bytes, to 92, like this:

data;set;s=0;do j=1to 9;s+j*char(isbn,j);end;c=min(char(isbn,j),10)=mod(s,11);put _all_;run;

Of course, the quiet assumption is that the ISBN dataset was created directly before that step.

In the first test I was hoping to use >< operator ( char(isbn,j)><10 ) but it's not "missing-value-proof"...
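
A small sketch of that difference, assuming the 'X' check character has already become a numeric missing value through the implicit character-to-numeric conversion (the variable names x, f, and o are only for illustration):

```sas
/* MIN function versus the >< (MIN) operator when one operand is missing. */
data _null_;
  x = .;            /* stands in for a converted 'X' check character */
  f = min(x, 10);   /* MIN function ignores missing arguments */
  o = x >< 10;      /* MIN operator treats missing as the smallest value */
  put f= o=;
run;
```

The log should show f=10 o=. because the function skips the missing argument while the operator returns it, which is why only the function form maps X to 10.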

 

User friendly version:

data test;
  set isbn;

  s=0;
  do j=1to 9;
    s+j*char(isbn,j);
  end;
  c=min(char(isbn,j),10)=mod(s,11);
  put _all_;
run;

The version I started with was using CALL POKELONG to split ISBN variable into an array of values, but all I could do was 116 bytes

data;set;array i[10]$1;CALL POKELONG(ISBN,ADDRLONG(i1),10);
s=0;do r=1to 9;s+i[r]*r;end;b=min(i[r],10)=mod(s,11);run;

and that was with no "put _all_;" in the code; with PUT it's +10.

User friendly version:

 

data test2; 
  set isbn;

  array i[10]$1;
  CALL POKELONG(ISBN,ADDRLONG(i1),10);

  s=0;
  do r=1to 9;
    s+i[r]*r;
  end;
  b=min(i[r],10)=mod(s,11);
run;

For 32-bit SAS it could be 8 bytes shorter, because we could use CALL POKE() and ADDR().

 

Will try to golf a bit more, but that 92 looks really hard.

 

Bart




FreelanceReinh
Jade | Level 19

@yabwon: Very clever, as usual! This is a substantial shortening (which comes at the price of not recognizing an "ISBN" like "266671036Y" as incorrect, but this was not a requirement). Other than the trivial replacement of the literal "10" by variable "j" (→ 91 characters) I don't see any room for further "improvement" at the moment.
