Category Archives: Mathematics
The question is was posted in early Septermber, 2017, here and is a undergraduate level probabilty problem:
You’re hanging out with some friends, shooting the breeze and talking sports. One of them brags to the group that he once made 17 free throws in a row after years of not having touched a basketball. You think the claim sounds unlikely, but plausible. Another friend scoffs, thinking it completely impossible. Let’s give your bragging friend the benefit of the doubt and say he’s a 70-percent free-throw shooter.
So, who’s right? What is the number of free throws that a 70-percent shooter would be expected to take before having a streak of 17 makes in a row? And what if his accuracy was a bit worse?
This lends itself to a binomial probability distribution. The shooter has a 70% chance of making and a 30% chance of missing any given free throw. It should be immediate that for the shooter to make 17/17 the probability would be This is roughly a 0.2% chance that the shooter will make 17 in a row while taking 17 shots.
But that's not the question...
We want the expected number of attempts until the shooter hits a streak of 17 in a row.
Let's look at a similar problem that is a bit easier to wrap our heads around. I could ask how many times would I need to flip a coin before I got a streak of two heads. Clearly the probability of getting two heads in a row is (0.5)*(0.5) = 0.25, or 25%.
The event we are concerned with is the event that we get two heads in a row. This is unknown so let's represent it with a variable, X. Let's say the first flip is a tails, which happens with a probability of 1/2, and is one wasted flip, so now we need X + 1 flips to get a streak of two heads.
Suppose that the first flip was heads but the second flip was tails. This happens with proabbility 1/4 and we have two wasted flips so we now need X + 2 further flips.
The only other possibility is that the two flips were both heads, which happens with probability 1/4 and took 2 flips. Summing these up we have
X = (1/2)(X+1) + (1/4)(X + 2) + (1/4)2
Or X = 6. That is, on average, you should expect to get a streak of 2 heads every 6 flips.
Generalizing this, for the k-th flip to be tails the number of flips would be , and one can deduce the expected number of flips to get a streak of heads would be
What if it's not a fair coin?¶
Let's start with two flips again and let's assume the head comes up with a 70% chance and tails only 30%. Let X be the number of flips needed to get a streak of two heads.
If the first flip is a tail, that happens with a probability of 0.3 and we have wasted a flip so now we need X + 1 flips. If the first flip is a head, but the second flip is a tail, that happens with a probability of (0.7)(0.3) and we need X + 2 flips. If two consecutive flips are heads, which happen with probabilty (0.7)(0.7) we acheived success in two flips. Now our equation looks like:
X = (0.3)(X+1) + (0.7)(0.3)(X + 2) + (0.7)(0.7)2
Solving for X we get
If let X now be 3 heads in a row, we go through the same analysis and end up with an equation that looks like
X = (0.3)(X+1) + (0.7)(0.3)(X + 2) + (0.7)(0.7)(0.3)(X + 3) + (0.7)(0.7)(0.7)3
and solving for X we get that it takes approximately 6.38 flips.
In general, if we let a = probability of heads, then (1 - a) = probability of tails and if we look for a streak of X = 3 heads the equation above becomes,
More generally, the expected number of flips to get heads is
and solving for X we have a nice tidy formula,
We want to explore a 70% free throw shooter. The answer lies below and we use Python to develop the general case. Let's keep in mind there are free throw shooter's in the NBA much worse that 70%.In fact, Andre Drummnd of the Detroit Pistons once was shooting around 35%. How long would it take him to hit 17 in a row? The result is astoundingly funny.
Solving Using Python¶In :
import numpy as np import pandas as pd import matplotlib.pyplot as plt %matplotlib inline def get_expected_value(a,N): # this method returns the expected number # of flips, free throws or whatever num = (1 - a)*np.sum([k*a**(k-1) for k in range(1,N+1)]) + a**N*N den = 1 - (1-a)*np.sum([a**(k-1) for k in range(1,N+1)]) return num/den def hours_and_minutes_to_acieve(attempts): s = 11.0 # assume 11 seconds per free throw return s*attempts/60.0/60.0
There are lots of fun ways to play with your phone: Angry Birds, Snapchat, Facebook, Face-swapping. Maybe you like to play with sensors? Anyone? Anyone? Well, in this post I will share some Python code and a video of how you can stream your Android's accelerometer data to your laptop and then visualize it. In fact, you can do this with any of the sensors in the unit, the accelerometer, gyroscope or the magnetometer.
Derek Zoolander famously was not an ambiturner. He had to go through a hero's journey to be able to turn both left and right. In this post we will take a hero's journey to identify right and left turns from running data using only GPS measurements.
In my last post I described an algorithm to reduce the number of points on a 2D curve that preserves the basic network structure of the curve. This algorithm, the Ramer-Douglas-Peucker (RDP) algorithm helps us to identify abrupt changes in a curve, where abrupt is subjective and up to the user to decide upon.
In this post I present a novel and simple approach to answering the following question: Did I take a left or right turn and how did I take that turn? Smoothly? Roughly? Constantly? The method presented here is new and remarkably simple compared to other methods in the literature. Most methods I saw relied on accelerometer data which needed heavy machinery such as Kalman filters to make similar determinations.
Calculus III to the rescue
The basic vector description of the plane and 3D space always seems like a bore when teaching. It's hard to get students to be fired up about vector addition and the algebraic structure of euclidean space. So finally, here is an interesting application of the dot product and cross product that goes beyond the ubiquitous force equations from physics.
Consider the map below. Let's assume we have identified the turning points from the RDP algorithm as described in my previous post. Identifying whether it is a right or left turn is extraordinarily simple. Given any three points on a plane we can find the angle that subtends the middle point with our friend the dot product,
On the map above we will have something like this,
If the angle is between 0 and 180 it is a right turn and if it is between -180 and 0 it is a left turn. But to solve for the angle we have to use the function which only has values in a range for a range of 180. This is where the vector cross product saves the day. The 2D cross product is simply the determinant. If the determinant is negative, the angle is between 0 and 180, if the determinant is positive it is between -180 and 0. Thus, left turns and right turns.
Finally, our algorithm is the following:
- Apply the RDP to the dataset of lat/lon's. This leaves us with a network G of vertices and n-1 edges.
- For i in range(1,n-1) form vectors for the i-1, i, and i+1 vector (as shown above) and compute .
- Calculate the determinant and determine left turn or right turn.
That's it!!!!! Glorious, glorious Calc III. Here is a map of run I took in Gedera and the results of the algorithm applied. The run starts with the left-most point. This algorithm resulted in 100% accuracy.
Type of turn, angle
Right turn, 36.9872677438
Left turn, -27.8188864289
Right turn, 55.4872049488
Left turn, -114.980809001
Right turn, 148.508392832
Right turn, 113.14225291
Right turn, 163.80922161
Left turn, -150.035986035
Right turn, 175.043761198
Right turn, 161.791261833
Right turn, 83.6023186647
Right turn, 109.851066119
Left turn, -126.533104243
Right turn, 116.820422235
Left turn, -97.4253631752
Right turn, 160.584337619
Right turn, 93.0049662148
Left turn, -164.456048578
Right turn, 139.971999906
Left turn, -142.289251078
Left turn, -165.463072689
Left turn, -143.612951911
Right turn, 119.067520889
Left turn, -155.864187354
Sometimes it may be necessary to simplify a set of points into a network. It may be for qualitative reasons such as determining intersection points and conduits of a network. It may also be for quantitative reasons like shrinking a dataset and speeding up computations. In this post I will go over a case study of GPS data and how I to turn this data into a smaller and more usable dataset.
This past month I taught a course, "Paradoxes & Infinities," in Hong Kong through the Center for Talented Youth, a gifted & talented program through Johns Hopkins University.
My lasting impression, a few days after completion, is that there still are some great ways to teach advanced math to advanced students. The organization and the logistics of the program itself had some some issues; but these were not negative enough to mollify my experience.
The course is an intensive three week program. The curriculum is math and logic based with paradoxes as the connective tissue. In order to discuss many paradoxes one needs to have a basic understanding of set theory (Russell's paradox), probability (Monty Hall paradox) and sequences (Achilles & the Tortoise) with some other topics sprinkled in. We also studied cardinality, symbolic logic, game theory, combinatorial games and the structure of the real numbers. Each day I, or my TA Raymond, would present brain teasers and puzzles. The combination of all these topics and activities lead to lots of fun but was, at times, a bit overwhelming.
What is the safest way to get from point A to point B? A simple question with a not so simple answer. What does it mean to be safe? Is it the feeling of safety or actually being safe? How are of either of these measured? Most importantly, If I've had too much too drink, what is my safest way home?
We begin to answer these abstract questions in a very tangible environment. Classic 1980's arcade games. I love the 80's.
Ms. Pac Man
In 1981 Ms. Pac Man was released forever giving people like me a reason to get out of bed in the morning. In the figure to the left, the small yellow boxes are pills Ms. Pac Man eats for points, and the 4 super pills allow Ms. Pac Man to eat the ghosts. The 4 tunnels on the left and right edges, "wrap" the board. For example, if Ms. Pac Man enters the tunnel in the top left she will come out on the top right.
It's funny what high school students notice. I paid a visit to a high school calculus class and gave two lectures about applied math. Both lectures were variants of a typical themes; what is mathematics and how is it used in today's society. The talk was very visual and used two mathematical ideas, network theory and optimization, to motivate the study and beauty of mathematics.
I began the talk by asking them to consider the following picture from a children's book. I borrowed this from a talk given by Timothy Gowers on "The Importance of Mathematics." The students very quickly realize the water spouting from the lazy elephant is not obeying Newton's law of gravity. From here I go into a short bit on mathematical modeling. I ask them what they know about bacteria and I receive a typical answer; bacteria grows exponentially. We look at the graph and they realize, through a verbal discussion, that . From here the students understand the exponential growth model needs to be refined. We end with the graph of the logistic equation.
Nothing says romance like analyzing WhatsApp text messages using Python. In an effort to understand my relationship with my girlfriend I present, with lovely graphs, our texting habits.
Using Python and some of its standard packages like Numpy, Scipy, Matplotlib, and Pandas, I parsed data from our WhatsApp conversations. The data began on August 12, 2014, at 10:47pm local time in Amsterdam. The last entry used is from May 18, 22:45 Mountain time. There are a total of 27192 texts. The first table gives some statistics regarding the data.