Hi everyone,
I'm Darren, I'm currently taking my master at NUS,
I used to work for SMRT and Agilent Technologies
as a data analyst
I have some experience using data mining tools
and I'm an expert in Qlik Sense as a visualization tool
So today I'm going to share this project on
Market Basket Analysis on this card game called Magic the Gathering
So the purpose of this project is to find out the
association between cards so to help the company to identify which card to be banned
Magic cards have a long history
since 1993 and they have been giving
lots of prizes for the tournament, can be up to $50,000
Each card can be quite expensive
as we can see here from ebay, this card cost $30
The data mining objective
is to identify card to be banned
by using association analysis
So the game works by putting up 60 cards
putting together 60 cards
which consist of land cards as resources
creature cards to fight for you
and spell cards to disrupt your opponents
and also other permanent cards that help you win the game
So the first data I pulled from
is from Gatherer, an online search engine
for the information of all the different cards and their abilities
The second data source I have here is the cards that win the tournament for the players
After gathering all these data, I put the data into Qlik Sense
for data cleaning and also visualization
In this Heat Map, I'm trying to see for each player what are the different cards they played
Each column is a player, each row is a card
You can see these 2 different players actually played the same cards in a deck of 60 cards
After gathering the data, I exported the data into
Rapid Miner for data modelling
I'm creating the association rules to find out
which cards are highly associated with each other
So the result is these 4 cards are
highly associated with each other
with high lift and support
So I did this analysis in May
and this particular card was actually banned in June
So how I did this is
Firstly explore the data using Qlik Sense
So first I look at who are the players who won the tournament
and on which day did they win the tournament
So for this particular day and this particular player
This is the deck that he used and the color of the cards
and this is the total cost of the deck
of 60 cards, or in fact 75 cards
So this is the list of different cards that he used
and the quantity and what are the prices of these cards
at its lowest and at its highest
So there is huge fluctuation for certain cards prices
Here we can look at how the card looks like
and next, I do a scatterplot
to analyze the price of each card, if let's say you buy at the lowest price
and sell it at its highest price, these are the cards that help you
to gain a profit if you have done that
Most of the cards are mythic rare
this is the rarity of the cards
Here we can see most cards
are creature cards that help you earn a profit
and planeswalker, usually will make a loss
This heat map helped me to quickly visualize the association between the cards
Lastly, I formatted the data this way
to be exported into rapid miner
The data sources are as such there are 3 data sources
from Gatherer and from the winning deck lists
First, Gather looks like this and it's an online database
And the winning cards are all here
So after I gathered the data,
the data looks like this which everything is jumbled up
and I have to do some sort of data cleaning to separate up the cards
by identifying all these pattern here
I do a script
to do a subfield to cut up the words
and looped it for 25 times
and finally I can get the data I need for association
In conclusion, I used Qlik Sense as a data cleaning tool and exploration tool
and export the data into rapid miner
for association analysis
The cards with high lift are filtered out
to help decision maker decide on the banned list
and 1 of the cards are banned subsequently
thank you very much
