Continual growth in computing power has resulted in the collection of
increasing volumes of data. Data Mining is the quantitative science
that seeks to extract knowledge from such data. A notable feature of
the evolution of data mining research is the increasing complexity of
the problems being considered and of the data structures that arise.
The initial challenge of scaling existing algorithms and models to
larger volumes of data has given way to a new generation of problems
whose complex data structures require models to be built from the
ground up.
Drawing inspiration from a number of challenging real-world problems,
members of this team are developing new "statistical learning" tools.
These problems often involve complex data structures. Examples include
rare target problems, in which a small, valuable part of a population
must be identified efficiently; network mining problems, in which the
data consist of transactions occurring on a network over time; and the
monitoring of complex processes, which arises when vast quantities of
data are collected from a production process such as automotive
manufacturing. In many of these areas, new data structures emerge,
such as very unbalanced classes in rare target problems, functional
data in process monitoring, or transactional data in network mining.
Other problems considered by the team have a more methodological
focus and do not arise from a specific scientific problem. These
include decision tree modelling and unsupervised learning of
non-standard data.
Data Mining of Complex Data Structures
Part of the National Institute on Complex Data Structures
"We are drowning in information and starving for knowledge." - Rutherford D. Rogers