so who are the most popular characters
in Re:Zero now the most straightforward
way to answer this question
would just be to conduct a poll but i
wanted to take a different look at a
possible way to
rank character popularity and the way i
decided to do this was by
web scraping web scraping is when you
have a bot
go out on the internet find text pages
and bring that text back to you
so here i could use web scraping to
determine the number of times a
character has been mentioned on the
Re:Zero subreddit to do this i could use
the PRAW python package which makes
scraping through the reddit api
really easy unfortunately i did realize
once i started using it
that it has a limit of around a thousand
posts so i wasn't able to collect as
much data as i wanted to initially
i still suspect this is probably in the
one to two months range for the number
of posts i've collected
but i was only able to get about 13,000
comments
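
As a rough illustration of the collection step, here is a minimal sketch of pulling comment text through PRAW. It assumes an already authenticated `praw.Reddit` instance is passed in; the function name `collect_comments`, the subreddit name `Re_Zero`, the choice of the `new` listing and the credential placeholders are all assumptions and not taken from the video.

```python
def collect_comments(reddit, subreddit_name, post_limit=1000):
    """Return the text of every comment on the newest posts of a subreddit.

    `reddit` is expected to be an authenticated praw.Reddit instance.
    Reddit listings stop at roughly 1000 posts, so a larger post_limit
    will not return more.
    """
    bodies = []
    for submission in reddit.subreddit(subreddit_name).new(limit=post_limit):
        # Drop the "load more comments" placeholders instead of expanding them
        submission.comments.replace_more(limit=0)
        for comment in submission.comments.list():
            bodies.append(comment.body)
    return bodies


# Hypothetical usage (needs `pip install praw` and your own API credentials):
#
#   import praw
#   reddit = praw.Reddit(client_id="...", client_secret="...",
#                        user_agent="mention-counter by u/your_name")
#   comments = collect_comments(reddit, "Re_Zero")
```

Passing `limit=0` to `replace_more` keeps the run fast at the cost of skipping deeply nested replies, so the comment total would be somewhat lower than what is actually on the subreddit.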
we can now look at the comments to see
how many times each character is
mentioned
this is pretty trivial to do but there's
a couple pitfalls that we have to watch
out for
the first is that if we're searching for
the character string Rem with a capital R
we won't pick up people typing
rem in lowercase so we have to force all the comments
to be
lowercase the second is that we want to
avoid false positives
for instance words that contain
character names this would be stuff like
Rem and remember Ram and rambunctious
and Otto
and ottoman empire though i vaguely
doubt anyone is saying ottoman empire
pretty frequently on the Re:Zero
subreddit
anyways we can avoid this just by adding
a space at the beginning and
end of each character's name i also
tried to include any alternate names
characters have like
beatrice betty and beako as they all
refer to the same character
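
The counting rules described above (lowercase everything, pad each name with spaces, fold alternate names together) could be sketched like this. The alias table is a small hypothetical sample and `count_mentions` is an illustrative name, not the code used for the video.

```python
# Hypothetical sample of the alias table; every alias is written in lowercase
CHARACTER_ALIASES = {
    "Rem": ["rem"],
    "Ram": ["ram"],
    "Otto": ["otto"],
    "Beatrice": ["beatrice", "betty", "beako"],
}


def count_mentions(comments, aliases=CHARACTER_ALIASES):
    """Count space-delimited, case-insensitive mentions of each character."""
    counts = {name: 0 for name in aliases}
    for comment in comments:
        # Lowercase the comment and pad it so names at the very start or
        # end still have a space on both sides
        text = " " + comment.lower() + " "
        for name, names in aliases.items():
            for alias in names:
                counts[name] += text.count(" " + alias + " ")
    return counts


sample = ["I love Rem so much", "remember the ottoman empire", "Beako and Betty"]
print(count_mentions(sample))
```

One trade-off of the space trick is that a name followed directly by punctuation, such as `rem,`, is not counted, and a name repeated back to back shares its separating space so it is counted once.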
but before we get to the results let's
play a guessing game so for me
i guess that Rem would be the most
frequently mentioned character
and that Volcanica would be the least
frequently mentioned character
let's see how close i was starting from
the characters who were mentioned the
least frequently we can see that i was
kind of right volcanica is mentioned
seven times
slightly edging out Mimi with four
mentions and tying with Ricardo 
who got seven mentions i was actually a
bit surprised that mimi had the lowest
as she gets significantly more screen
time than either volcanica or ricardo
but i still understand people not really
talking about her very often
Anastasia ended up being the least
popular royal selection candidate with
only 22 mentions and was barely ahead of
best girl Patrasche who got only 20
mentions
i was expecting priscilla to be down
here but she's not that far up the list
surpassing otto and elsa with 38
mentions to her name
Ley is surprisingly low considering that
he's a new character
i figured people would be discussing him
more i did also check for rai
which is his romaji name and lye which
is the name the subtitles went with
but i didn't check for people discussing
the sin archbishop of gluttony or just
gluttony because that could refer to
other characters really surprising
actually was
al getting 52 mentions i thought he
would be towards the very bottom of the
list
and again surprisingly wilhelm was also
a lot further down than i thought he
would be with only 60 mentions
petra generated quite a bit of
discussion along with petelgeuse
Reinhard beatrice
julius and surprisingly regulus i guess
this means that regulus is a more
popular character than ley
i was also really shocked to see
beatrice down this far as i thought she
was a fan favorite
Felt and Crusch are nearly neck and neck
with 87 and 99 mentions respectively
i was definitely surprised to see felt
this far up as she really isn't in the
anime past the first few episodes
then we get puck Roswaal and our first
big hitter Ram
she earned herself 150 mentions and got
fifth place overall
i actually thought ram would be in
fourth place but instead that goes to
Satella with 199 mentions
well can you guys guess the order of the
last three we have Rem
with 620 mentions Emilia with 689
mentions and subaru absolutely
demolishing the competition with an
astounding
1,357 mentions holy smokes
as it turns out my guess of Rem coming
in first place was incredibly off let me
know how well you guys did in the
comments down below
now let's examine how this list lines up
with some of the more recent polls
this one lines up pretty well with
subaru and Emilia coming in first and
second respectively
and Rem and Patrasche tying for fourth
hold up did i just say Patrasche got
fourth
yeah so my list had Patrasche near the
literal bottom with only 20 mentions
this just goes to show that even if
people like a character they might not
talk about them very much
then again there really isn't that much
to talk about in respect to patrasche
the problem with this poll was that it
was done head to head where people would
vote on one character or another
a better way of doing this is to have
random matchups for each voter
and then come up with some sort of
aggregate victory for every single
character
it would also mean that characters
wouldn't be tying with each other at all
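
One way the random-matchup idea could be aggregated is a simple win rate per character, sketched below. Both helper names are hypothetical, and win rate is just one possible choice of "aggregate victory"; the video does not specify a scoring rule.

```python
import random
from collections import defaultdict


def random_matchup(characters, rng=random):
    """Pick two distinct characters to show to a voter."""
    return tuple(rng.sample(characters, 2))


def win_rates(votes):
    """Turn (winner, loser) votes into each character's share of matchups won."""
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in votes:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    return {name: wins[name] / appearances[name] for name in appearances}


votes = [("Subaru", "Emilia"), ("Subaru", "Rem"), ("Emilia", "Rem")]
ranking = sorted(win_rates(votes).items(), key=lambda item: item[1], reverse=True)
print(ranking)
```

Because every voter sees a different random pair, each character is compared against many opponents instead of only the one it happened to draw in a bracket.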
now i'm not really sure which list is
more accurate this poll or my own
but they do seem to agree in a lot of
places at least but we're not done yet
because i also wanted to look at
sentiment analysis of the characters
using natural language processing
sentiment is a measure of the positivity
and negativity in a sentence
for instance the sentence i love Emilia is
pretty positive while the sentence
i hate Emilia is pretty negative the
idea is that i can create an
average sentiment around each character
by looking for sentences where their
names are mentioned
in the comments to do this i was going
to use the pre-trained sentiment
classifier called
VADER that's part of the natural
language toolkit package in python
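
The averaging idea could look roughly like the sketch below. The scorer is passed in as a function so the example runs without NLTK installed; `average_sentiment`, the crude sentence splitter and the word-level name matching are assumptions made for illustration, not the code from the video.

```python
import re
from collections import defaultdict


def average_sentiment(comments, aliases, score):
    """Average a sentence-level score over the sentences naming each character.

    `aliases` maps a character to a list of lowercase names.
    `score` is any function that maps a sentence to a number.
    """
    totals = defaultdict(float)
    hits = defaultdict(int)
    for comment in comments:
        # Crude sentence split: break after ., ! or ? followed by whitespace
        for sentence in re.split(r"(?<=[.!?])\s+", comment.strip()):
            words = set(re.findall(r"[a-z']+", sentence.lower()))
            for name, names in aliases.items():
                if words & set(names):
                    # Every character named in the sentence gets its score
                    totals[name] += score(sentence)
                    hits[name] += 1
    return {name: totals[name] / hits[name] for name in hits}


# Hypothetical usage with VADER (needs `pip install nltk`):
#
#   import nltk
#   from nltk.sentiment.vader import SentimentIntensityAnalyzer
#   nltk.download("vader_lexicon")
#   analyzer = SentimentIntensityAnalyzer()
#   scores = average_sentiment(
#       comments, aliases,
#       lambda s: analyzer.polarity_scores(s)["compound"])
```

VADER's `compound` value runs from -1 (most negative) to +1 (most positive), so the averages land on the same scale.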
unfortunately blindly applying this to
people's comments about characters is
kind of like using a blunt axe for brain
surgery and the results were an absolute
disaster
to me it looks like the characters were
just randomly organized with no rhyme or
reason
and if we look at sentences that were
identified as strongly positive or
negative we can quickly see the problem
take this sentence for instance by now
we've seen him get emotional and lose
his cool about subaru disrespecting
Crusch a couple of times
but this time he's really broken by this
and even wilhelm and Crusch have to step
in to stop him from taking things too
far
because we are genius humans we can
contextually infer that this sentence is
actually referring to
felix with the pronouns he and him but
because of the stupid way i was doing
this it was assigning the sentiment of
the sentence to subaru Crusch and wilhelm
so it's not really the classifier's
fault as it was correctly identifying
the sentence as negative
it was my fault because i was falsely
assigning this negative sentiment to the
incorrect
characters honestly you could probably
come up with a better way of making this
work but it would take a lot of time and
effort and probably someone who is
smarter and better at this than i am
full disclosure i'm not very good at
programming and i'm sure if someone
who's good at natural language
processing
looked at what i did here they
would probably shriek in terror
anyways let me know what you guys
thought about the results and let me
know if you know a better way of doing
this sentiment analysis than i
did this video did take a while to make
so i'd appreciate if you would like and
subscribe
and thanks for watching
