Hello everyone,
I have a question about Text Mining.
We have one historical data set that contains all job applications with candidate information, and another that contains all job postings with employer information. These data sets will become live data in the future, but for now we are working from a snapshot taken at a specific reporting date.
For job postings we have two fields:
Employer ID
Job Description
We created a single column called Job Description by combining the following values:
Gender
Location
Military Status
Educational Status
Social Status (Disabled, Veteran, etc.)
Marital Status
Driver's License
Foreign Language
For job applications we have two fields:
Employee ID
Candidate Information
We created a single column called Candidate Information by combining the following values:
Gender
Location
Military Status
Educational Status
Social Status (Disabled, Veteran, etc.)
Marital Status
Driver's License
Foreign Language
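To make the construction of the two text columns concrete, here is a minimal sketch in plain Python (not Enterprise Miner code). The field names, the space separator, and the sample record are all assumptions made up for illustration; the real column names and values in our tables may differ.

```python
# Hypothetical field names; the real column names in the source tables may differ.
FIELDS = [
    "gender", "location", "military_status", "educational_status",
    "social_status", "marital_status", "drivers_license", "foreign_language",
]

def build_text_column(record, fields=FIELDS, sep=" "):
    """Join the non-empty attribute values of one record into a single text string."""
    parts = []
    for name in fields:
        value = record.get(name)
        if value is None:
            continue  # skip missing attributes
        text = str(value).strip()
        if text:
            parts.append(text)
    return sep.join(parts)

# Made-up sample record, for illustration only.
candidate = {
    "gender": "Female",
    "location": "Ankara",
    "military_status": None,
    "educational_status": "Bachelor",
    "social_status": "",
    "marital_status": "Single",
    "drivers_license": "B",
    "foreign_language": "English",
}
candidate_information = build_text_column(candidate)
```

The same function would be applied to the job posting attributes to produce the Job Description column, so that both sides use the same vocabulary.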
As you know, the nodes available in Text Miner on Enterprise Miner are as follows:
Import, Parsing, Filter, Topic, Cluster, Profile and Rule Builder
The goals of the text mining work are as follows:
- Find out how well the Candidate Information matches the Job Description.
- Assign a score according to the match rate.
- Keep the top 30 candidates with the best scores for each job posting, without duplicate rows.
As a result, we want to get output similar to the table below.
KEY | RANK | Employer_ID | JOB_DESCRIPTION (TEXT) | Employee_ID | CANDIDATE_INFORMATION (TEXT) | MATCH SCORE |
1 | 1 | 1 | A B c d e | 10 | A B C D E F | 100 |
2 | 2 | 1 | A B c d e | 20 | A B C D E | 100 |
3 | 3 | 1 | A B c d e | 30 | A B C | 60 |
4 | 4 | 1 | A B c d e | 70 | C UK | 20 |
: | : | 1 | : | : | : | : |
30 | 30 | 1 | A B c d e | 40 | A | 20 |
31 | 1 | 2 | TR uK usA ITA | 90 | TR UK USA ITA | 100 |
32 | 2 | 2 | TR uK usA ITA | 50 | TR UK | 50 |
33 | 3 | 2 | TR uK usA ITA | 60 | TR USA | 50 |
34 | 4 | 2 | TR uK usA ITA | 80 | TR | 25 |
35 | : | 2 | : | : | : | : |
: | 30 | 2 | TR uK usA ITA | 70 | C UK | 25 |
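The scores in the table are consistent with one simple rule: the share of distinct Job Description terms that also appear in the Candidate Information, compared case-insensitively and expressed as a percentage. The sketch below shows that rule and the per-posting ranking in plain Python, outside Text Miner. The function names, the whitespace tokenization, the tie-break on Employee ID, and the assumption that every candidate is compared with every posting are all mine, not something Text Miner provides.

```python
def tokenize(text):
    """Split on whitespace and upper-case, so 'uK' and 'UK' count as the same term."""
    return set(text.upper().split())

def match_score(job_description, candidate_information):
    """Percentage of distinct job terms that also occur in the candidate text."""
    job_terms = tokenize(job_description)
    if not job_terms:
        return 0.0
    candidate_terms = tokenize(candidate_information)
    return 100.0 * len(job_terms & candidate_terms) / len(job_terms)

def top_candidates(jobs, candidates, top_n=30):
    """Rank candidates per posting; dict keys keep each ID unique, so no duplicate rows."""
    rows = []
    key = 0
    for employer_id, job_text in jobs.items():
        scored = [
            (match_score(job_text, cand_text), employee_id, cand_text)
            for employee_id, cand_text in candidates.items()
        ]
        # Best score first; ties broken by Employee ID (an arbitrary assumption).
        scored.sort(key=lambda item: (-item[0], item[1]))
        for rank, (score, employee_id, cand_text) in enumerate(scored[:top_n], start=1):
            key += 1
            rows.append({
                "KEY": key, "RANK": rank, "Employer_ID": employer_id,
                "JOB_DESCRIPTION": job_text, "Employee_ID": employee_id,
                "CANDIDATE_INFORMATION": cand_text, "MATCH_SCORE": score,
            })
    return rows

# Sample values taken from the table above.
jobs = {1: "A B c d e", 2: "TR uK usA ITA"}
candidates = {
    10: "A B C D E F", 20: "A B C D E", 30: "A B C", 70: "C UK", 40: "A",
    90: "TR UK USA ITA", 50: "TR UK", 60: "TR USA", 80: "TR",
}
ranked = top_candidates(jobs, candidates, top_n=30)
```

Because ties are broken by Employee ID here, candidates with equal scores may appear in a different order than in the table. Note also that extra candidate terms (such as F for Employee ID 10) do not lower the score under this rule.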
It would be nice if we could get guidance on this case from a Text Miner perspective.