My name is Aijun An. I do research
in data mining. In data mining, we study how to find
interesting
and useful knowledge in large amounts of data.
That knowledge usually takes the form of
hidden patterns, hidden behind the data,
and it can be very useful
for decision makers
to make informed decisions
to improve their business. At CIVDDD, we have been working with
a couple of industry partners:
one is the Globe and Mail,
and the other is IBM. With the Globe and Mail,
we are working on data analytics and visualization
for decision making
in news media. One project is on user profiling.
We have analyzed their very large
log data, which consists of the clickstreams
of their online customers. This data contains
a lot of information about the online
readers of the Globe and Mail,
such as the location of the
reader, what section of the
newspaper they read, what
kind of articles they read,
and at what time they read them. It
contains a lot of information about
the online reader. And by
analyzing this file, we first tried to
group their customers, based
on
the information that we have, into different groups,
and then tried to identify the unique
characteristics of each group.
In data mining, we have techniques to  
anonymize the data before
the data is released to the data miners,
so usually the data we get from
companies is already anonymized.
They have data scientists who
hide all the
private information in the data.
Another piece of work
we have done recently is
topic detection from text
documents. We are given
their corpus, and we run our programs
to automatically discover the topics
in those articles and find the trends 
of each topic over time. And based on that result, 
we've further developed
a personalized, content-based recommendation system.
Based on the topics we discovered
from the collection of documents and on
the history of the online reader, that is, what kind of
topics they have been reading, the system
makes recommendations of news articles
to the user.
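A minimal sketch of this kind of content-based recommendation, assuming each article already has a topic vector from the topic detection step. The article IDs, topic weights, and function names are hypothetical, and cosine similarity against an averaged reader profile is an assumed choice for illustration, not necessarily the method used in the project.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_profile(read_ids, article_topics):
    """Average the topic vectors of the articles a reader has read."""
    vectors = [article_topics[a] for a in read_ids]
    n_topics = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(n_topics)]


def recommend(read_ids, article_topics, k=2):
    """Return the k unread articles whose topics best match the reader."""
    if not read_ids:
        return []  # no reading history, so no personalized ranking
    profile = build_profile(read_ids, article_topics)
    candidates = [a for a in article_topics if a not in read_ids]
    ranked = sorted(
        candidates,
        key=lambda a: cosine(profile, article_topics[a]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical topic weights per article: [politics, business, sports]
ARTICLE_TOPICS = {
    "a1": [0.9, 0.1, 0.0],
    "a2": [0.8, 0.2, 0.0],
    "a3": [0.0, 0.1, 0.9],
    "a4": [0.7, 0.0, 0.3],
    "a5": [0.1, 0.8, 0.1],
}

# A reader who has read two politics-heavy articles
print(recommend(["a1", "a2"], ARTICLE_TOPICS, k=2))
```

In this toy data, the reader's history is dominated by the first topic, so unread articles weighted toward that topic rank first.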
We can also use data mining
visualization tools to visualize
the data mining results, so that the
results can be better presented to the user.
As researchers, we have to
produce research results through
publications, so we often work on
research papers together.
We also need programs that can help us
analyze data,
so we developed those programs. In order to develop
those programs, we need some other tools
to support our development
activities, for example, a
programming
environment such as a Java programming environment, an SDK,
and MATLAB, which is an
interactive programming environment for statisticians, computer scientists,
and engineers.
With IBM, we are basically working on
applying
their solutions, their Big Data
solutions, to data mining problems.
They bring a lot of value to our research.
By working with them, we have the opportunity
to see what's happening in the real world,
what they are working on in companies,
what needs...
what they need to improve their
products, and
how we can help to improve
their business, and then we can apply the techniques
that we have developed to real-world problems.
