Integrating and updating domain knowledge in data mining
In detecting fraudulent telephone calls, data mining helps to find the destination of the call, the duration of the call, the time of day or week, and so on.
It also analyzes the patterns that deviate from expected norms.
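As a minimal sketch of flagging calls that deviate from expected norms, the snippet below marks call durations that lie far from the median, using the median absolute deviation. The function name `flag_unusual_calls`, the threshold of 3.5, and the sample durations are illustrative assumptions, not taken from any particular fraud-detection system.

```python
from statistics import median

def flag_unusual_calls(durations, threshold=3.5):
    """Return indices of durations whose modified z-score exceeds threshold.

    Uses the median absolute deviation (MAD), which is less distorted by
    the outliers being searched for than a mean and standard deviation.
    The threshold of 3.5 is an assumed default.
    """
    med = median(durations)
    mad = median([abs(d - med) for d in durations])
    if mad == 0:
        return []  # no spread in the data, so nothing can be flagged
    flagged = []
    for i, d in enumerate(durations):
        score = 0.6745 * abs(d - med) / mad  # modified z-score
        if score > threshold:
            flagged.append(i)
    return flagged

# Hypothetical call durations in seconds; the last call is unusually long.
calls = [62, 58, 65, 60, 59, 61, 63, 1800]
print(flag_unusual_calls(calls))
```

The same idea could be applied to other call attributes such as destination frequency or time of day, each with its own notion of an expected norm.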
Genome-wide association studies can help identify multi-gene contributions to disease.
As the number of high-density genomic markers tested increases, however, so does the number of loci associated with disease by chance.
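To make the chance-association problem concrete, the sketch below computes how many markers would be expected to pass a fixed significance level purely by chance, and the per-test threshold a Bonferroni correction would require instead. It assumes that none of the markers is truly associated with disease; the function names and the significance level of 0.05 are illustrative choices.

```python
def expected_false_positives(num_markers, alpha=0.05):
    """Expected number of markers passing level alpha when none is truly associated."""
    return num_markers * alpha

def bonferroni_threshold(num_markers, alpha=0.05):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / num_markers

# Assumed marker counts, chosen only to show the trend.
for m in (1_000, 100_000, 1_000_000):
    print(m, expected_false_positives(m), bonferroni_threshold(m))
```

Bonferroni is only one of several possible corrections and is known to be conservative; it is used here because it is the simplest to state.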
Separating interesting mining results from uninteresting ones is still a laborious task, performed mainly by human users.
We propose to employ formalized domain knowledge for assessing the interestingness of mining results.
In this paper we explore the use of biological domain knowledge to supplement statistical analysis and data mining methods to identify genes and pathways associated with disease.
We describe Pathway/SNP, a software application designed to help evaluate the association between pathways and disease.
These methods can, however, be used in creating new hypotheses to test against the larger data populations.
For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system.
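The following sketch illustrates, under simplifying assumptions, how groups found in a mining step might improve a later prediction. It splits a single feature into two groups with a basic one-dimensional k-means (k = 2) and compares a single overall average prediction against a per-group average. The records, the function names, and the choice of k-means are all hypothetical and stand in for whatever grouping and prediction methods a real system would use.

```python
from statistics import mean

def two_means_1d(values, iterations=20):
    """Return a boundary splitting values into two groups (1-D k-means, k=2)."""
    lo, hi = min(values), max(values)  # initial centers at the extremes
    for _ in range(iterations):
        boundary = (lo + hi) / 2
        low = [v for v in values if v <= boundary]
        high = [v for v in values if v > boundary]
        if not low or not high:
            break  # all values fell on one side; keep the current centers
        lo, hi = mean(low), mean(high)
    return (lo + hi) / 2

def mean_abs_error(pairs, predict):
    """Average absolute difference between outcomes and predictions."""
    return mean([abs(y - predict(x)) for x, y in pairs])

# Hypothetical (feature, outcome) records that form two groups.
records = [(1.0, 10), (1.2, 11), (0.9, 9), (5.0, 30), (5.3, 31), (4.8, 29)]
boundary = two_means_1d([x for x, _ in records])

overall = mean([y for _, y in records])
low_mean = mean([y for x, y in records if x <= boundary])
high_mean = mean([y for x, y in records if x > boundary])

global_error = mean_abs_error(records, lambda x: overall)
grouped_error = mean_abs_error(
    records, lambda x: low_mean if x <= boundary else high_mean
)
print(global_error, grouped_error)
```

On this toy data the per-group prediction has a much smaller error than the single overall average, which is the effect the passage describes.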
Data collection, data preparation, and result interpretation and reporting are not part of the data mining step, but they do belong to the overall KDD process as additional steps.
A brute-force test for interactions among four or more high-density genomic loci is infeasible given current computational limitations.
Heuristics must be employed to limit the number of statistical tests performed.
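A rough count shows why a brute-force search is out of reach and how much a heuristic can help. The sketch below compares the number of four-locus combinations over all markers with the number obtained when loci are combined only within the same pathway. Restricting by pathway is one possible heuristic, chosen here because it uses biological domain knowledge; the marker count, the pathway sizes, and the function names are assumptions for illustration.

```python
from math import comb

def brute_force_tests(num_loci, order):
    """Number of distinct locus combinations of the given order."""
    return comb(num_loci, order)

def pathway_restricted_tests(pathway_sizes, order):
    """Combinations when loci are only combined within the same pathway.

    Assumes each locus belongs to exactly one pathway, so that no
    combination is counted twice. Real pathways overlap, so this is a
    simplification.
    """
    return sum(comb(size, order) for size in pathway_sizes)

num_loci = 500_000        # assumed number of genomic markers
pathways = [200] * 2_500  # hypothetical: 2,500 pathways of 200 loci each

print(brute_force_tests(num_loci, 4))
print(pathway_restricted_tests(pathways, 4))
```

With these assumed figures the restricted search is smaller by roughly ten orders of magnitude, at the cost of missing any interaction that spans pathways.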
This usually involves using database techniques such as spatial indices.