This is an introduction to the mathematical foundations of probability theory. It is intended as a supplement or follow-up to a graduate course in real analysis. The first two sections assume knowledge of measure spaces, measurable functions, the Lebesgue integral, and notions of convergence of functions; the third assumes Fubini’s Theorem; the fifth assumes knowledge of the Fourier transform of nice (Schwartz) functions on R; and Section 6 uses the Radon-Nikodym Theorem. The mathematical foundations of probability theory are exactly the same as those of Lebesgue integration. However, probability adds much intuition and leads to different developments of the area. These notes are intended only as a brief introduction; they might be considered what every graduate student should know about the theory of probability. Probability uses terminology somewhat different from that of Lebesgue integration in R. These notes will introduce the terminology and will also relate these ideas to those that would be encountered in an elementary (by which we mean pre-measure-theory) course in probability or statistics. Graduate students encountering probability for the first time might also want to read an undergraduate book in probability.