Hi,
I am trying to match 2 tables and generate exceptions.
For example :
Table1.Name : DAVID,RYAN,JOHN,..... (20k unique names)
Table2.Article : THE VISA WAIVER PROGRAM VWP ALLOWS CITIZENS,... (10k of article crawled from google)
I want to generate exceptions where Article like '%NAME%' which is the 20k of unique names from table1.
Appreciate someone can assist on it.
Like this?
select * from NAMES, ARTICLES where ARTICLE contains NAME;
Note this will evaluate 10k*20k = 200m rows. It might run slowly on a small server.
Please provide some sample data. Makes it much easier to provide a usable code answer 🙂
Hi ,
Thanks for the support.
Please find the sample data as below :
DATA Name;
INPUT id $ Name $ ;
DATALINES;
a1 Ryan
a2 Jack
a3 Larry
;
RUN;
Data Article;
length id article $5000;
infile cards dsd dlm='|' truncover ;
Input id Article $5000. ;
cards;
1|"California, I'm not sure you know it, but if you didn't know it you should," the president said Monday evening in Long Beach."
2|Malaysia, I'm not sure Ryan know it, but if you didn't know it you should," Jack said Monday evening in Long Beach. "The eyes of the nation, this is not hyperbole'
3|The front-running Republican who would likely replace Newsom if he's recalled is Larry Elder, a conservative talk show host and vocal supporter of the former president, as well as a proponent of the so-called "big lie'
;
run;
My expected result will be
ID | Article | Name |
2 | |Malaysia, I'm not sure Ryan know it, but if you didn't know it you should," Jack said Monday evening in Long Beach. "The eyes of the nation, this is not hyperbole' | Ryan |
2 | |Malaysia, I'm not sure Ryan know it, but if you didn't know it you should," Jack said Monday evening in Long Beach. "The eyes of the nation, this is not hyperbole' | Jack |
3 | The front-running Republican who would likely replace Newsom if he's recalled is Larry Elder, a conservative talk show host and vocal supporter of the former | Larry |
Like this?
select * from NAMES, ARTICLES where ARTICLE contains NAME;
Note this will evaluate 10k*20k = 200m rows. It might run slowly on a small server.
@RyanWong0510 wrote:
Thanks Chris,
By referring to the sample data I provided, it does not turn out any output.
No output? Post any log in a code box. Copy from the log the entire procedure code and any messages, open a text box on the forum using the </> icon above the message window and paste the copied text.
Likely need to remove trailing blanks in the Name variable:
proc sql; select * from NAME, ARTICLE where ARTICLE contains strip(name) ; quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.