The Law of Large Populations: The return of the long-ignored N and how it can affect our 2020 vision
If you have a question about this talk, please contact Emily Brown. If you can attend this seminar, please respond to Emily Brown: e.brown@jbs.cam.ac.uk

For over a century now, we statisticians have successfully convinced ourselves, and almost everyone else, that in statistical inference the size of the population, N, can be ignored, especially when it is large. Instead, we focused on the size of the sample, n, the key driving force for both the Law of Large Numbers and the Central Limit Theorem. We were thus taught that the statistical error (standard error) goes down with n, typically at the rate of 1/√n. However, all of this relies on the presumption that our data have perfect quality, in the sense of being equivalent to a probabilistic sample. A largely overlooked statistical identity, a potential counterpart to the Euler identity in mathematics, reveals a Law of Large Populations (LLP), a law that we should all be afraid of. That is, once we lose control over data quality, the systematic error (bias) in the usual estimators, relative to the benchmarking standard error from simple random sampling, goes up with N at the rate of √N. The coefficient in front of √N can be viewed as a data defect index, which is the simple Pearson correlation between the reporting/recording indicator and the value reported/recorded. Because of the multiplier √N, a seemingly tiny correlation, say 0.005, can have a detrimental effect on the quality of inference. Without an understanding of this LLP, "big data" can do more harm than good because of the drastically inflated precision assessments and hence gross overconfidence, setting us up to be caught by surprise when the reality unfolds, as we all experienced during the 2016 US presidential election. Data from the Cooperative Congressional Election Study (CCES; conducted by Stephen Ansolabehere, Douglas Rivers, and others, and analyzed by Shiro Kuriwaki) are used to estimate the data defect index for the 2016 US election, with the aim of gaining a clearer vision for the 2020 US election and beyond.

This talk is part of the Economics & Policy Seminars, CJBS series.
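The √N argument in the abstract can be made concrete with a small numeric sketch. The Python snippet below (not part of the talk materials) computes the two quantities the abstract describes: the ratio of the systematic error to the simple-random-sampling standard error, which the LLP says grows like ρ√N, and the size of a simple random sample with equivalent mean squared error. The effective-sample-size formula, and the specific population size, sampling fraction, and ρ = 0.005, are illustrative assumptions chosen only to show the order of magnitude.

```python
import math

def bias_to_srs_se_ratio(rho, N):
    """Systematic error of a defective-sample mean, measured in units of the
    standard error of a same-size simple random sample: approximately
    rho * sqrt(N), the Law of Large Populations."""
    return rho * math.sqrt(N)

def effective_sample_size(rho, n, N):
    """Size of a simple random sample whose mean has roughly the same mean
    squared error as the biased big-data mean: (f / (1 - f)) / rho^2,
    where f = n / N is the sampling fraction (an assumed, illustrative form)."""
    f = n / N
    return (f / (1.0 - f)) / rho ** 2

# Illustrative (assumed) numbers: a "big data" sample covering 1% of a
# population of about 230 million, with a tiny data defect correlation.
N = 230_000_000
n = int(0.01 * N)   # 2.3 million respondents
rho = 0.005

print(f"bias / SRS standard error ~ {bias_to_srs_se_ratio(rho, N):,.0f}")
print(f"effective sample size     ~ {effective_sample_size(rho, n, N):,.0f}")
```

With these assumed numbers the systematic error is on the order of 75 simple-random-sampling standard errors, and the 2.3-million-person sample carries roughly the information of a 400-person simple random sample, illustrating the collapse in effective precision that the abstract warns about.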