Hi,
I am trying to match 2 tables and generate exceptions.
For example :
Table1.Name : DAVID,RYAN,JOHN,..... (20k unique names)
Table2.Article : THE VISA WAIVER PROGRAM VWP ALLOWS CITIZENS,... (10k of article crawled from google)
I want to generate exceptions where Article like '%NAME%' which is the 20k of unique names from table1.
Appreciate someone can assist on it.
Like this?
select * from NAMES, ARTICLES where ARTICLE contains NAME;
Note this will evaluate 10k*20k = 200m rows. It might run slowly on a small server.
Please provide some sample data. Makes it much easier to provide a usable code answer 🙂
Hi ,
Thanks for the support.
Please find the sample data as below :
DATA Name;
INPUT id $ Name $ ;
DATALINES;
a1 Ryan
a2 Jack
a3 Larry
;
RUN;
Data Article;
length id article $5000;
infile cards dsd dlm='|' truncover ;
Input id Article $5000. ;
cards;
1|"California, I'm not sure you know it, but if you didn't know it you should," the president said Monday evening in Long Beach."
2|Malaysia, I'm not sure Ryan know it, but if you didn't know it you should," Jack said Monday evening in Long Beach. "The eyes of the nation, this is not hyperbole'
3|The front-running Republican who would likely replace Newsom if he's recalled is Larry Elder, a conservative talk show host and vocal supporter of the former president, as well as a proponent of the so-called "big lie'
;
run;
My expected result will be
ID | Article | Name |
2 | |Malaysia, I'm not sure Ryan know it, but if you didn't know it you should," Jack said Monday evening in Long Beach. "The eyes of the nation, this is not hyperbole' | Ryan |
2 | |Malaysia, I'm not sure Ryan know it, but if you didn't know it you should," Jack said Monday evening in Long Beach. "The eyes of the nation, this is not hyperbole' | Jack |
3 | The front-running Republican who would likely replace Newsom if he's recalled is Larry Elder, a conservative talk show host and vocal supporter of the former | Larry |
Like this?
select * from NAMES, ARTICLES where ARTICLE contains NAME;
Note this will evaluate 10k*20k = 200m rows. It might run slowly on a small server.
@RyanWong0510 wrote:
Thanks Chris,
By referring to the sample data I provided, it does not turn out any output.
No output? Post any log in a code box. Copy from the log the entire procedure code and any messages, open a text box on the forum using the </> icon above the message window and paste the copied text.
Likely need to remove trailing blanks in the Name variable:
proc sql; select * from NAME, ARTICLE where ARTICLE contains strip(name) ; quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.