We’ve already used a bit of probability theory informally when computing the classical time complexity of the Deutsch-Jozsa problem.  That wasn’t too hard to swallow, but for the
two remaining algorithms -- and those beyond -- we’d better officially check off the box titled “probability theory.”   Now, that said, this chapter gets pretty mathy in places, so I’m
unofficially declaring it entirely optional, especially for those who are struggling with the earlier topics and need this week for review.  You can get through this course without the
mathematical rigor of probability theory. For everyone but the advanced students, I’ve set the reading paths to contain mostly optional sections.
But if you’re going to do advanced work after this class, you should come back here at some point, maybe in a few months or a year, and study it carefully.  Quantum algorithms
are all about probabilities, so knowing the fundamentals is important in the long run.  Probability is about numbers, but before we can begin to talk about those, we have to establish
a system for describing something more set-theoretic: events. Events can be described in natural language:  “the event that Ms. Park will shoot one under par at Pebble
Beach tomorrow.” Probabilities help us understand how likely the events are.
“The probability that Ms. Park will shoot one under par tomorrow is 0.3, or 30%.” We’ll study events and probabilities using a quantum coin flip, which is just like an ordinary coin flip,
except ... It’s fun, because it’s quantum. So we develop the theory of probability by defining some non-numeric terms: sample space, outcomes, events – and some operations:
unions, intersections, complements. Then I’ll present a technical experiment that we’ll be using next week.  Ten quantum coin-flips, represented as a 10 dimensional vector
with coordinates 0 or 1.   Once we do a few examples and exercises in this 10-qubit coin flip setting, it’ll be time to introduce formal probability language. We’ll first go over the
axioms, then give a computable definition of the probability that some event occurs.  After that, we start computing probabilities, rigorously.
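To make the upcoming vocabulary concrete, here is a minimal sketch in Python of the 10-coin-flip setting: the sample space, events as subsets, the set operations, and the "count and divide" definition of probability. It assumes fair (uniform) coin flips, and the names `omega`, `prob`, and the example events are mine, chosen just for illustration.

```python
from itertools import product

# Sample space: all 2**10 = 1024 outcomes of ten coin flips,
# each outcome a 10-dimensional tuple with coordinates 0 or 1.
omega = list(product([0, 1], repeat=10))

# An event is simply a subset of the sample space.
first_is_one = {w for w in omega if w[0] == 1}   # "the first flip is 1"
three_ones   = {w for w in omega if sum(w) == 3}  # "exactly three flips are 1"

# The set operations on events: union, intersection, complement.
union        = first_is_one | three_ones
intersection = first_is_one & three_ones
complement   = set(omega) - first_is_one

def prob(event):
    # Assuming a fair coin, every outcome is equally likely,
    # so P(E) = |E| / |omega|.
    return len(event) / len(omega)

print(prob(first_is_one))  # 0.5
print(prob(three_ones))    # C(10,3)/1024 = 120/1024 = 0.1171875
```

Counting outcomes this way only works because the flips are assumed fair; the formal axioms in this chapter cover the general, non-uniform case as well.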
But wait, there’s more.  Usually, one doesn’t apply the definition of an event’s probability directly, but rather uses a few theorems that make the computation easier. Those
theorems bring us to the concept of statistical independence and a powerful tool called “Bayes’ law.” We’ll see some alternate notation for unions and intersections, and then bring
it all to a boil by applying everything to one of our past algorithms – Deutsch-Jozsa. We’ll also establish a result we’ll need for our future algorithms. I’ve stated that result in the
form of a theorem about algorithms that use loops.  I’d like everyone to try to read and understand that theorem, even if you skip its proof.
It says that if the probability of success of a single loop pass is always greater than some positive minimum -- even one as small as one in a billion -- then the algorithm has
constant expected time complexity.  Its running time does not grow as the size of the input gets large. When you’re reading this chapter, one idea may help put things into perspective.  Applying probability to
our previous algorithms, like Deutsch-Jozsa, is done only so we can estimate the classical time complexity.  The quantum algorithm always completes in a single pass of the
circuit. However, when we use probability in our upcoming algorithms, we’ll be applying it directly to the quantum circuits.  That’s because those circuits do not guarantee
immediate results.  We’ll have to rerun the circuit in a loop until we get enough results to declare victory.  So, looking back, probabilities help us more with the classical estimates.
Looking forward, they’ll be required in our quantum estimates.
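The loop theorem can be previewed with a short classical simulation. This is only a sketch of the underlying probability fact -- if each pass succeeds with probability at least p, the expected number of passes is at most 1/p, a constant that has nothing to do with the input size. The function name `loop_until_success` is mine, not from the text.

```python
import random

def loop_until_success(p_success, rng):
    """Repeat a trial until it succeeds; return the number of passes.

    p_success is the per-pass success probability, assumed bounded
    below by some fixed positive minimum regardless of input size.
    """
    passes = 1
    while rng.random() >= p_success:  # this pass failed; loop again
        passes += 1
    return passes

rng = random.Random(42)
# The average number of passes comes out near 1/p, no matter how
# large the problem input is -- hence constant expected time.
for p in [0.5, 0.1, 0.01]:
    trials = [loop_until_success(p, rng) for _ in range(100_000)]
    print(p, sum(trials) / len(trials))  # average ≈ 1/p
```

Even a success probability of one in a billion gives an expected billion passes -- a huge constant, but still a constant, which is what the theorem's complexity claim is about.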
