CRISP-DM methodology

CRISP-DM (Cross-Industry Standard Process for Data Mining) is a model for standardizing the data mining process. Its goal is to give the team common names and steps for organizing its workload.

phases

business understanding

data understanding

In this step we want to understand what data are available for the project, describe them, and verify their quality.
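
Describing the data and checking their quality can be sketched as a small profiling pass. This is only an illustration, assuming the data arrive as a list of dictionaries; the `profile` function and the sample records are hypothetical.

```python
def profile(rows):
    """Describe each column: row count, missing values, distinct values."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and the empty string as missing (an assumption).
        present = [v for v in values if v is not None and v != ""]
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return report


# Hypothetical sample records, not taken from any real project.
rows = [
    {"age": 34, "city": "Rome"},
    {"age": None, "city": "Milan"},
    {"age": 34, "city": ""},
]
report = profile(rows)
```

A report like this gives a first view of which columns need attention before any preparation work starts.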

data preparation

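This is the most important phase and also the most time-consuming one. In this phase we want to avoid data leakage, where information that will not be available at prediction time (for example, rows from the test set) reaches the model and makes its results unreliable.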
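
One common source of leakage is computing preprocessing statistics on the whole dataset before splitting it. The sketch below shows one way to avoid that, assuming a simple standardization step; the function names and the data are hypothetical.

```python
import random
import statistics


def train_test_split(values, test_fraction=0.25, seed=0):
    """Shuffle with a fixed seed, then split into train and test parts."""
    shuffled = values[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(train):
    """Learn mean and standard deviation from the training part only."""
    return statistics.mean(train), statistics.pstdev(train)


def scale(values, mean, std):
    """Apply statistics learned elsewhere; never refit on the test part."""
    return [(v - mean) / std for v in values]


# Hypothetical measurements.
data = [float(v) for v in range(1, 21)]
train, test = train_test_split(data)
mean, std = fit_scaler(train)  # the test rows play no role here
train_scaled = scale(train, mean, std)
test_scaled = scale(test, mean, std)
```

The split happens first, so nothing about the test rows can influence the scaling parameters.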

In this phase the data are manipulated to improve their quality. Every step of the phase should be designed to be reproducible.
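
One way to keep the preparation reproducible is to express it as an ordered list of functions that never modify the raw data. This is a minimal sketch under that assumption; the two steps and the `age` column are hypothetical examples, not steps prescribed by the methodology.

```python
import statistics


def drop_duplicates(rows):
    """Keep the first occurrence of each record, preserving order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_age(rows):
    """Replace missing ages with the median of the known ages."""
    known = [row["age"] for row in rows if row["age"] is not None]
    median = statistics.median(known)
    return [{**row, "age": median if row["age"] is None else row["age"]}
            for row in rows]


# The ordered list of steps is the single description of the preparation.
STEPS = [drop_duplicates, impute_age]


def prepare(rows):
    """Run every step in order; steps return new data, the input stays as is."""
    for step in STEPS:
        rows = step(rows)
    return rows


# Hypothetical raw records.
raw = [
    {"id": 1, "age": 30},
    {"id": 1, "age": 30},
    {"id": 2, "age": None},
    {"id": 3, "age": 40},
]
clean = prepare(raw)
```

Running `prepare` again on the same raw data gives the same result, which is the property this phase asks for.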

modeling

In this phase we apply machine learning techniques to bring to light the patterns hidden in the data.
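
As an illustration of extracting a pattern from data, the sketch below fits a straight line by ordinary least squares. The notes do not name a specific technique, so the choice of linear regression, the function names, and the data are all assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data: advertising spend vs. units sold.
spend = [1.0, 2.0, 3.0, 4.0]
units = [3.1, 4.9, 7.2, 8.8]
model = fit_line(spend, units)
```

The fitted slope and intercept are the "pattern" here: a compact summary of how one quantity tends to change with the other.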

evaluation

In this step we assess the results of the data mining process: we want to evaluate the accuracy of the derived model and its impact on the decision-making process.
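
Accuracy can be measured by comparing predictions with known labels on held-out data. The sketch below assumes a classification task; the labels and predictions are hypothetical.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of held-out examples the model labelled correctly."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def confusion(actual, predicted):
    """Count (actual, predicted) pairs to see which errors occur."""
    return Counter(zip(actual, predicted))


# Hypothetical held-out labels and model predictions.
actual = ["churn", "stay", "stay", "churn", "stay"]
predicted = ["churn", "stay", "churn", "stay", "stay"]
acc = accuracy(actual, predicted)
matrix = confusion(actual, predicted)
```

The confusion counts matter for the decision-making side: two models with the same accuracy can make different kinds of mistakes, and those mistakes rarely cost the business the same.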

deployment

In this phase the results of the process are put into production to gain a return on the investment.
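
Putting a model into production usually means at least persisting it so that another process can load it and serve predictions. A minimal sketch, assuming a linear model stored as JSON; the file name, the parameter names, and the values are hypothetical.

```python
import json
import os
import tempfile


def save_model(model, path):
    """Persist model parameters so another process can load them."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(model, fh)


def load_model(path):
    """Read the parameters back from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def predict(model, x):
    """Score one input with the stored linear model."""
    return model["slope"] * x + model["intercept"]


# Hypothetical parameters from a trained model.
trained = {"slope": 2.0, "intercept": 1.0, "version": 1}
path = os.path.join(tempfile.mkdtemp(), "model.json")
save_model(trained, path)
served = load_model(path)
result = predict(served, 3.0)
```

Storing a version number with the parameters is a small design choice that makes it easier to tell which model produced a given prediction.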
