Knowledge Discovery Process

As part of the curriculum for my MS in Analytics program, I had an assignment to become familiar with the purpose of the Knowledge Discovery Process (KDP). I wanted to share this brief outline of KDP with the SAS community.




KDP transforms large amounts of raw data into actionable knowledge (which translates to $) that was unknown before the knowledge discovery effort began. Data mining is one sub-process within this transformation of data into useful knowledge.


Example: Per Gunelius (2014), every minute in 2014 Facebook users shared 2.5 million pieces of content, Twitter users tweeted nearly 300,000 times, email users sent over 200 million messages, and Amazon generated over $80,000 in online sales, while Google processed 20 petabytes of information per day.
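To get a feel for the scale, here is a small Python sketch that converts the quoted figures to daily totals. It assumes the social media, email, and sales numbers are per-minute rates (as the title of the cited infographic suggests) and treats the rounded values ("nearly", "over") as exact, so the results are rough orders of magnitude only. The dictionary keys are my own labels.

```python
MINUTES_PER_DAY = 24 * 60

# Figures quoted above, assumed to be per-minute rates (rounded values).
per_minute = {
    "Facebook items shared": 2_500_000,
    "tweets": 300_000,
    "emails sent": 200_000_000,
    "Amazon sales (USD)": 80_000,
}

# Scale each per-minute rate up to a full day.
per_day = {name: rate * MINUTES_PER_DAY for name, rate in per_minute.items()}

for name, total in per_day.items():
    print(f"{name}: {total:,} per day")
```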


When volume, velocity, and variety change what is known by the minute, organizations need to transform the data they collect into useful information that supports actionable decisions.


Per Silwattananusarn and Tuamsuk (2012), the steps that transform data into knowledge in the Knowledge Discovery Process are:

  1. Data Selection: Source data relevant to the business problem of interest is selected.
  2. Data Cleaning/Pre-processing (Data Quality Improvement): The existing raw data is reviewed for inconsistencies, missing values, and duplicate observations, and data from various sources is combined.
  3. Data Transformation: The data is converted to a usable form. For example, when working with a specific software package such as SAS, source files of different types (csv, txt, and others) would be converted to SAS datasets for ease of data organization.
  4. Data Mining: Algorithms identify patterns in the data. A model is built from training data and then tested on separate target data to validate it before user testing.
  5. Interpretation/Evaluation: Data analysts and data scientists interpret the models, and how well the data fit them, to draw out patterns or insights, and present these to stakeholders in a format useful for organizational decision making.
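
The five steps above can be sketched end to end in a few lines. This is only a toy illustration in Python (not SAS), under my own assumptions: the customer churn scenario, the column names, and every data value are made up, and the "mining" step is a simple threshold rule standing in for a real data mining algorithm.

```python
import csv
import io

# Hypothetical raw data from two sources; every value here is made up.
SOURCE_A = """customer_id,name,monthly_spend,support_calls,churned
1,Ann,120.0,1,no
2,Bob,35.5,6,yes
2,Bob,35.5,6,yes
3,Cy,88.0,,no
"""
SOURCE_B = """customer_id,name,monthly_spend,support_calls,churned
4,Di,40.0,7,yes
5,Ed,95.0,0,no
6,Flo,30.0,5,yes
7,Gus,110.0,4,no
8,Hal,25.0,8,yes
"""

RELEVANT = ("customer_id", "support_calls", "churned")


def select(raw_text):
    """Step 1: keep only the columns relevant to the business question."""
    reader = csv.DictReader(io.StringIO(raw_text))
    return [{col: row[col] for col in RELEVANT} for row in reader]


def clean(*sources):
    """Step 2: combine sources, drop duplicates and rows with missing values."""
    seen, cleaned = set(), []
    for rows in sources:
        for row in rows:
            key = row["customer_id"]
            if key in seen or any(value == "" for value in row.values()):
                continue
            seen.add(key)
            cleaned.append(row)
    return cleaned


def transform(rows):
    """Step 3: convert text fields into typed values ready for analysis."""
    return [
        {"customer_id": int(r["customer_id"]),
         "support_calls": int(r["support_calls"]),
         "churned": r["churned"] == "yes"}
        for r in rows
    ]


def accuracy(rows, threshold):
    """Share of rows where 'support_calls >= threshold' matches the churn flag."""
    hits = sum((r["support_calls"] >= threshold) == r["churned"] for r in rows)
    return hits / len(rows)


def mine(train):
    """Step 4: pick the support-call threshold that best fits the training rows."""
    candidates = sorted({r["support_calls"] for r in train})
    return max(candidates, key=lambda t: accuracy(train, t))


records = transform(clean(select(SOURCE_A), select(SOURCE_B)))
train, holdout = records[:4], records[4:]   # simple split for illustration
threshold = mine(train)

# Step 5: evaluate on held-out rows before presenting the pattern to stakeholders.
print(f"rule: churn if support_calls >= {threshold}")
print(f"holdout accuracy: {accuracy(holdout, threshold):.2f}")
```

Checking the rule against held-out rows is the point of step 4's validation: a rule that fits the training rows perfectly can still miss cases it has not seen.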


Key benefits of adopting KDP in data mining:

  • Per Han, Kamber, and Pei (2012), the large data repositories currently available to organizations are not used for critical decision making. Instead, most organizations rely on leaders’ intuition and ‘gut feel’, which can lead to unintended and undesirable consequences and to both known and unknown losses. If the same organizations used a structured knowledge discovery process and mined their existing data for “golden nuggets”, their decision making would be effective and fact based, and would serve as a foundation for current and future tactical and strategic decisions.


  • The volume, variety, and velocity of data are increasing by the second and exceed human capacity to mine the data manually. A knowledge discovery process, together with technologies that process vast amounts of data, provides insights that help organizations avoid the ‘data rich, information poor’ predicament.



Silwattananusarn, T., & Tuamsuk, K. (2012). Data mining and its applications for knowledge management: A literature review from 2007 to 2012. International Journal of Data Mining & Knowledge Management Process, 2(5), 13–24.

Gunelius, S. (2014). The data explosion in 2014 minute by minute – Infographic.

Han, J., Kamber, M., & Pei, J. (2012). Data mining: Concepts and techniques (3rd ed.). Waltham, MA: Morgan Kaufmann.

Re: Knowledge Discovery Process

Chris Hemedinger, thank you for your review and encouragement. I appreciate it.


Murali Sastry
