class: center, middle, inverse, title-slide

# Introduction to Stats 115 and Bayesian Statistics
### Dr. Dogucu
### 2020-01-06

---
layout: true

<div class="my-header"></div>

<div class="my-footer"> Copyright © <a href="https://mdogucu.ics.uci.edu">Dr. Mine Dogucu</a>. All Rights Reserved.</div>

---

## Merhaba

- Hello
- `Private Sub Form_Load()` `MsgBox "Hello, World!"` `End Sub`
- Hallo
- مرحبا
- `print('Hello world')`
- नमस्ते & السلام عليكم
- `print("Hello world")`
- `<html> Hello world</html>`
- ¡Hola!
- سلام

---

## About Me

Mine Dogucu

📧 <a href="mailto:mdogucu@uci.edu">mdogucu@uci.edu</a>

📆 M 11 am - 12 pm & W 3 - 4 pm

📍 DBH 2204

---

## Meet and Greet Each Other

In groups of three or four, meet and greet each other. Include:

- Your name
- Your year (most likely senior)
- I live ...
- My favorite thing about UCI is ...
- I am awesome because ...

Find something in common between all of you by expanding the conversation. Find a difference.

---

# How to be successful in this course

- Be punctual
- Be organized
- Do the work

# How to make your professor happy

- Be kind
- Be honest

---

# The most important thing about this course

<br>
<br>
<a href = "http://mdogucu.ics.uci.edu/teaching/stats115-wi20/schedule.html" style ="font-size: 35px; display:block; text-align:center;">The course website </a>

---

# What is Bayesian Statistics?

Think 🤔 - Pair 👫 - Share 🗨️

Have you heard of Bayesian Statistics before? If so, what was the context? What do you know about it already?

---

# Key Point in Bayesian Statistics

It is ok to have prior ideas and expert knowledge, and to weigh them against the observed data.

---

# Bayesian vs. Frequentist

Consider two claims.

(1) Zuofu claims that he can predict the outcome of a coin flip. To test his claim, you flip a fair coin 10 times and he correctly predicts all 10!

(2) Kavya claims that she can distinguish Dunkin' coffee from Starbucks coffee.
To test her claim, you give her 10 coffee samples and she correctly identifies the origin of each!

In light of these experiments, what do you conclude?

a. You're more confident in Kavya's claim than Zuofu's claim.

b. The evidence supporting Zuofu's claim is just as strong as the evidence supporting Kavya's claim.

---

# Bayesian vs. Frequentist

Your doctor calls to tell you that you have tested positive for a rare disease. The doctor gives you the opportunity to ask one question. Which of the following would you ask?

a. What is the probability that I actually have the disease given that I have tested positive?

b. If I do not have the disease, what is the probability that I would have gotten this positive test result?

---

# Notes on Bayesian History

- Frequentist statistics is more widely used, but Bayesian statistics is gaining popularity.
- Named after Thomas Bayes (1701-1761).
- Computing, computing, computing.
- It is hard to adopt newer methods, so change is happening slowly.
- We can embrace subjectivity.

---

# Bayesian vs. Frequentist

.pull-left[
__Frequentist__

Probability is based on long-term frequencies.

When answering a question, we assume that the null hypothesis is true and then calculate the probability of observing the data if the null were true.
]

.pull-right[
__Bayesian__

Probability is a balance between the relative plausibility of an event and the degree of prior belief.

When answering a question, we calculate the probability that the null hypothesis is true based on the given data.
]

---

# Optional

Watch [How Data Nerds Found A 131-Year-Old Sunken Treasure](https://fivethirtyeight.com/features/how-data-nerds-found-a-131-year-old-sunken-treasure/)

Spoiler alert: They used Bayesian Statistics!
---

# Probability Review

<style type="text/css">
.tg {border-collapse:collapse;border-spacing:0;}
.tg td{font-family:Arial, sans-serif;font-size:14px;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
.tg th{font-family:Arial, sans-serif;font-size:14px;font-weight:normal;padding:10px 5px;border-style:solid;border-width:1px;overflow:hidden;word-break:normal;border-color:black;}
.tg .tg-x5q1{font-size:16px;text-align:left;vertical-align:top}
.tg .tg-vox4{font-weight:bold;font-size:16px;text-align:left;vertical-align:top}
.tg .tg-cqfb{font-size:16px;text-align:left;vertical-align:middle}
</style>
<table class="tg" align="center">
<tr>
  <th class="tg-x5q1"></th>
  <th class="tg-x5q1" colspan="2">Belief in afterlife</th>
  <th class="tg-x5q1"></th>
</tr>
<tr>
  <td class="tg-cqfb">Taken a college science class</td>
  <td class="tg-cqfb">Yes</td>
  <td class="tg-cqfb">No</td>
  <td class="tg-vox4">Total</td>
</tr>
<tr>
  <td class="tg-cqfb">Yes</td>
  <td class="tg-cqfb">2702</td>
  <td class="tg-cqfb">634</td>
  <td class="tg-vox4"><span style="font-weight:700">3336</span></td>
</tr>
<tr>
  <td class="tg-cqfb">No</td>
  <td class="tg-cqfb">3722</td>
  <td class="tg-cqfb">837</td>
  <td class="tg-vox4"><span style="font-weight:700">4559</span></td>
</tr>
<tr>
  <td class="tg-vox4">Total</td>
  <td class="tg-vox4">6424</td>
  <td class="tg-vox4"><span style="font-weight:bold">1471</span></td>
  <td class="tg-vox4">7895</td>
</tr>
</table>
<p style="font-size: small"> Data from <a href ="https://gssdataexplorer.norc.org"> General Social Survey</a> </p>

`\(P(\text{belief in afterlife})\)` = ?

`\(P(\text{belief in afterlife and taken a college science class})\)` = ?

`\(P(\text{belief in afterlife given taken a college science class})\)` = ?

Calculate these probabilities and write them using correct notation. Use `\(A\)` for belief in afterlife and `\(S\)` for college science class.
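---

# Probability Review: Checking the Arithmetic

One way to check the three probabilities from the contingency table is a short script. This is only a sketch: the dictionary layout and the variable names (`counts`, `p_A`, `p_A_and_S`, `p_A_given_S`) are assumptions made for illustration, not part of the course materials. Only the four cell counts come from the table.

```python
# Cell counts from the contingency table (General Social Survey data).
# Keys are (took_science_class, believes_in_afterlife); this layout is
# an illustrative assumption, not something defined in the course.
counts = {
    (True, True): 2702,
    (True, False): 634,
    (False, True): 3722,
    (False, False): 837,
}

total = sum(counts.values())  # all respondents: 7895

# Marginal probability P(A): believers in either row, over everyone.
n_A = sum(n for (s, a), n in counts.items() if a)
p_A = n_A / total

# Joint probability P(A and S): one cell, over everyone.
p_A_and_S = counts[(True, True)] / total

# Conditional probability P(A | S): same cell, but the sample space
# is reduced to the respondents who took a science class.
n_S = sum(n for (s, a), n in counts.items() if s)
p_A_given_S = counts[(True, True)] / n_S

print(f"P(A)       = {p_A:.3f}")
print(f"P(A and S) = {p_A_and_S:.3f}")
print(f"P(A | S)   = {p_A_given_S:.3f}")
```

The three values come out to roughly 0.814, 0.342, and 0.810. Note that the joint and conditional probabilities share a numerator (2702) and differ only in the denominator (7895 versus 3336).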
---

## Vocabulary Review

`\(A^c\)` is called the __complement__ of event `\(A\)`. `\(P(A^c)\)` represents the probability of selecting someone who does not believe in an afterlife.

---

## Vocabulary Review

`\(P(A)\)` represents a __marginal probability__. So do `\(P(S)\)`, `\(P(A^c)\)` and `\(P(S^c)\)`. To calculate these probabilities we need only the values in the margins of the contingency table, hence the name.

---

## Vocabulary Review

`\(P(A \cap S)\)` represents a __joint probability__. So do `\(P(A^c \cap S)\)`, `\(P(A\cap S^c)\)` and `\(P(A^c\cap S^c)\)`.

Note that `\(P(A\cap S) = P(S\cap A)\)`. Order does _not_ matter.

---

## Vocabulary Review

`\(P(A|S)\)` represents a __conditional probability__. So do `\(P(A^c|S)\)`, `\(P(A | S^c)\)` and `\(P(A^c|S^c)\)`. To calculate these probabilities we focus on the row or the column of the given information. In a way, we are _reducing_ our sample space to this given information only.

---

## Note on conditional probability

`\(P(\text{attending every class | getting an A}) \neq\)` `\(P(\text{getting an A | attending every class})\)`

The order matters!

---

# Bayes' Rule for Events

You may recall from your previous courses that we define a conditional probability as:

`\(P(A|B) = \frac{P(A \cap B)}{P(B)}\)`

Note that the denominator is `\(P(B)\)`. Because we are given that event `\(B\)` has happened, we can reduce our sample space to just event `\(B\)`.

---

# From Conditional Probability to Bayes' Rule for Events

Before we build Bayes' rule for events using conditional probability, you may be asking yourself how we can have a WHOLE quarter dedicated to one single formula in statistics. Well, we are going to expand that formula. We will also apply it to different probability models (discrete and continuous). For now, let's focus only on events.

---

## Example

An algorithm is written to detect cat images.
When given a cat image, the algorithm identifies the image as a cat image 80% of the time. However, when given an image without a cat, the algorithm falsely identifies it as a cat image 50% of the time. The algorithm is to be tested on a set of images, 7% of which are cat images. What is the probability that an image is a cat image if the algorithm identifies it as a cat image?

---

# Write out the given information

Let event `\(A\)` represent the algorithm identifying an image as a cat image.

Let event `\(B\)` represent an image being a cat image.

---

# Bayes' Rule for Events

---

# Law of Total Probability

---

# Solving the Cat Problem

---

## Bayes' Rule for Events

`\(P(B |A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A|B)P(B)}{P(A)}\)`

where, by the Law of Total Probability,

`\(P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)\)`

---

# Due

* Due Tonight (11:59 pm)
  + All About You Survey on Canvas
* Due Wednesday (09:30 am)
  + Syllabus Quiz
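
---

# Solving the Cat Problem: A Numerical Sketch

Returning to the cat-image example, the script below plugs the three given quantities into Bayes' rule for events, with `\(P(A)\)` computed by the Law of Total Probability. It is a minimal sketch rather than course code: the function name `posterior` and its parameter names are hypothetical, chosen only for illustration. The inputs 0.07, 0.80, and 0.50 are the values stated in the example.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(B | A) using Bayes' rule for events.

    prior               -- P(B): the image is a cat image
    likelihood          -- P(A | B): flagged as cat, given it is a cat
    false_positive_rate -- P(A | B^c): flagged as cat, given it is not
    (Function and parameter names are hypothetical, for illustration.)
    """
    # Law of Total Probability: P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    # Bayes' rule: P(B|A) = P(A|B)P(B) / P(A)
    return likelihood * prior / evidence


# Values given in the cat-image example.
p_cat_given_flag = posterior(prior=0.07, likelihood=0.80,
                             false_positive_rate=0.50)
print(f"P(B | A) = {p_cat_given_flag:.3f}")
```

With these inputs, `\(P(A) = 0.056 + 0.465 = 0.521\)` and `\(P(B|A) = 0.056 / 0.521 \approx 0.107\)`. Even after the algorithm flags an image, it is a cat image only about 11% of the time, because cat images are rare in the test set and false positives are common.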