Practical Predictive Analytics

Process understanding

It is important to understand how data is generated in your domain, both now and in the past. This means understanding how data is transformed from its original raw state into the form consumed by its end users. Understanding the imperfections of the current process will give you a sense of which data can be consistently relied upon. Additionally, try to gather a history of the data changes and methodologies that have been attempted in the past. This is important because the data may have been used in quite different ways, and toward different goals, than it is today. Knowing what has succeeded and what has failed will keep you from reinventing the wheel, help you avoid repeating the same mistakes, and let you build upon what has been attempted before.