In class on the first day of snow in November we analyzed a piece written by Professor Matthew Wilkens. First, we split into small groups and unpacked the concept of “regressions in the data.” One article we read was “Literary Attention Lag” (https://mattwilkens.com/2015/01/13/literary-attention-lag/). In this article Matthew Wilkens lays out the two major questions behind his investigation. The first asks, ‘How is geographic attention in literary fiction related to the distribution of population at the time the fiction is published?’ The second asks, ‘What do the details of the relation between them tell us about literary memory?’ To answer these two questions he collected census data on 23 cities that met his criteria: each was well represented in literature and had a significant population.

To follow his answers, we needed a bit of background on regression models. A large portion of the class had used regression models before, mostly in statistics and other STEM classes. We defined a linear regression as a mathematical equation that models the relationship between given variables within a data set. We have seen this idea in many other readings. For example, another student wrote in a blog post, “In week 10’s readings from “Everything on Paper Will Be Used Against Me:” by Micki Kaufman, the author goes to great lengths to draw correlations and connections between Henry Kissinger’s telephone conversations and memos”. Outside of DCS, Biochem students use a linear regression with a line of best fit, which is often used to make predictions. Economics students are another example: they use a random linear regression. One Econ student in the DCS class explained how he recently used a random linear regression to predict tomorrow’s stock return.
In addition, you can make random linear regressions with an economic policy and uncertainty index to see how confident people are in the economy. One thing all of these graphs and regression models had in common is that a linear model almost never fits the data perfectly, which means it is an approximation.

Relating back to the article, the author started from his previous knowledge that there is a decent amount of correlation between the population of a geographic location and the amount of literary attention paid to it. The locations included cities such as New York City, San Francisco, and Detroit. Next, he took around a thousand volumes of U.S. fiction and found where these cities were mentioned. He chose books published between 1850 and 1875 so that he could analyze a specific timeframe. He then made four hypotheses about the relationship. The first informal hypothesis, called “national or deep,” anticipated that the literature of the nineteenth century is an accurate representation of the nation in the eighteenth. The second informal hypothesis, called “Formative-psychological,” states that each author would most likely describe the world as it was during the author’s formative years. I found this particularly interesting because I am in my formative years. The last two hypotheses were called “Presentist” and “Predictive”.

After he collected population data from his census data prior to 1900, he plotted total literary mentions against “decennial census counts”. Next he ran a linear regression, which is why a major part of the class was devoted to understanding what a linear regression is. The regression produced an r^2 value of 0.46. Afterward, he plotted the results with respect to time. I appreciated the two graphs, but I found the next section of his article far more intriguing: a “Future Exploration” section, which discussed what he would look into if he had more time.
I lived abroad in London for six years and was born in China, so international comparisons are second nature to me. Professor Matthew Wilkens shares my interest in international comparisons and hopes to answer the question “How does lag change, if at all, in other national contexts?” The end of his article has three data notes, which in my opinion make him a more credible author because he is open and honest about the way the data was collected throughout this experiment. This relates back to our class, where we also discussed how each group needs to understand how the findings in their projects will be published. Credit and compensation are vitally important.
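
To make the regression idea from class a little more concrete, here is a minimal Python sketch of fitting a line of best fit and computing an r^2 value. The numbers are made up purely for illustration and are not Wilkens’s data; the function names (fit_line, r_squared) and the variable names are my own, and this is ordinary least squares written out by hand, which may differ from how the article’s regression was actually run.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys that the fitted line accounts for."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: a population figure and a count of literary mentions
# for five imaginary cities (NOT taken from the article).
population = [10, 20, 30, 40, 50]
mentions = [3, 4, 8, 7, 12]

slope, intercept = fit_line(population, mentions)
r2 = r_squared(population, mentions, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r^2={r2:.2f}")
```

An r^2 of 1 would mean every point sits exactly on the line, while a value like the 0.46 reported in the article means the line captures a bit less than half of the variation, which fits what we said in class about linear models being approximations.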