Video accessible from your Account page after purchase.
11+ Hours of Video Instruction
The perfect way to up your data analytics game: tools and parallel case studies!
There are many ways to learn data analysis, but one that really helps it sink in is seeing analyses of actual data. Add four analytic tools applied to each of five data sets, and you have an excellent way to learn. Data Analytics Toolkit: From Excel to Python, R, and Tableau does exactly that: data from five case studies are each analyzed using Excel, Tableau, R, and Python. This approach provides a unique, thorough, and in-depth way to learn data analytics.
Learn How To
Who Should Take This Course:
Anyone who:
Course Requirements:
No specific requirements
Lesson Descriptions:
Lesson 1: Bank Data
In this first lesson we look at a basic dataset involving bank information that introduces the concepts of both quantitative and qualitative variables. The dataset is small enough and tidy enough to be easily manipulated in Excel and lends itself nicely to the fundamental concepts of transforming and visualizing your data. Creating identical column charts in Excel using pivot tables and in Tableau illustrates the parallel data analysis structures of these two software packages. Tableau further introduces the idea of importing your data, something basic to using R and Python but unfamiliar to Excel users who always see their data in front of them.
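As a rough illustration of the import-then-summarize workflow described above, here is a minimal Python sketch using pandas. The column names (`gender`, `job_grade`, `salary`) and all values are hypothetical stand-ins, not the course's actual bank data.

```python
import io

import pandas as pd

# Hypothetical stand-in for the bank dataset; columns and values are assumptions.
CSV_TEXT = """gender,job_grade,salary
Female,1,32000
Male,1,34000
Female,2,41000
Male,2,43000
Female,1,30000
"""

# Unlike Excel, the data must be imported explicitly before it can be used.
bank = pd.read_csv(io.StringIO(CSV_TEXT))

# Rough equivalent of an Excel pivot table: mean salary by job grade and gender.
summary = bank.pivot_table(
    index="job_grade", columns="gender", values="salary", aggfunc="mean"
)
print(summary)
```

With matplotlib installed, `summary.plot(kind="bar")` would draw the corresponding column chart from the same summary table.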
Lesson 2: Countries
This simple case study in Lesson 2 highlights some of the key functionality benefits of Tableau. Excel is relatively simple to use, and without a clear rationale for learning a new software package, faculty (like students) will resist the learning curve required for orientation to a new package. This case study uses data easily downloaded from the internet consisting of basic statistics for countries: GDP, life expectancy, infant mortality, and population.
The bank data example started with tidy data (so no cleaning or wrangling involved) and introduced the basic column chart or bar chart. This Countries example highlights the cleaning process of data analysis. This is typically the most arduous part of data analysis and a major weakness in Excel, which lacks a straightforward way to merge data sets and remove observations with missing values. Typically, we simply sort the data and delete rows by hand in Excel. In the other packages, there are powerful data cleaning functions and operations. We start with Tableau and create four more of the fundamental data visualizations: a scatterplot, histogram, boxplot and also a choropleth or heat map.
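The merge-and-clean step that is awkward in Excel can be sketched in Python with pandas as follows. The country labels, column names, and numbers are hypothetical placeholders, not the downloaded statistics used in the lesson.

```python
import pandas as pd

# Hypothetical values for illustration only; not the course's data.
gdp = pd.DataFrame(
    {"country": ["A", "B", "C"], "gdp_per_capita": [1000.0, None, 3000.0]}
)
life = pd.DataFrame(
    {"country": ["A", "B", "D"], "life_expectancy": [70.0, 75.0, 80.0]}
)

# Merge the two tables on the shared key, keeping countries present in both.
merged = gdp.merge(life, on="country", how="inner")

# Remove observations (rows) that have any missing value.
clean = merged.dropna()

print(clean)
```

Here the inner merge drops countries C and D, which appear in only one table, and `dropna()` then drops country B because its GDP value is missing.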
Lesson 3: Wisconsin Elections
In Lesson 3 we explore how voting patterns changed in Wisconsin between the 2012 and 2016 presidential elections. In 2012, Democrat Barack Obama defeated Republican Mitt Romney by a substantial margin, both nationally and in Wisconsin. On the other hand, Republican Donald Trump won both Wisconsin and the electoral college vote over Democrat Hillary Clinton (although not the overall popular vote) in 2016. Based on the final vote counts, Wisconsin became redder (more Republican) between 2012 and 2016. We create a variety of graphics to explore how this happened.
Lesson 4: COVID-19
Lesson 4 illustrates a phenomenon called Simpson's Paradox, which can occur when applying an aggregate calculation to subsets of a data set. Counterintuitively (though not actually paradoxically), it is possible for the aggregate calculation (say, an average) to yield one kind of result on the majority or even all of the subsets while producing a contradictory result on the data set as a whole. The Wikipedia page on Simpson's Paradox describes an especially compelling example. The baseball player David Justice had a better batting average than Derek Jeter in 1995 and 1996, but combining all of the data for those two years into a single data set yields the opposite result: Derek Jeter's batting average was better for the two-year period as a whole.
In this example, we have data about a group of people who all got COVID-19. Overall, a greater percentage of vaccinated people died than unvaccinated. However, if we divide the people in the data set into two age groups, we find that the death rate for unvaccinated people was greater in each group on its own.
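The reversal described above can be reproduced with a small Python sketch. The counts below are invented purely to show the arithmetic and are not the course's data; the key assumption is that vaccinated people in this made-up sample are mostly in the older, higher-risk group.

```python
# Hypothetical (cases, deaths) counts; invented for illustration only.
data = {
    "under 50": {"vaccinated": (200, 1), "unvaccinated": (1000, 10)},
    "50 and over": {"vaccinated": (1000, 50), "unvaccinated": (100, 10)},
}


def death_rate(cases, deaths):
    """Return deaths as a fraction of cases."""
    return deaths / cases


def overall_rate(status):
    """Pool both age groups, then compute the death rate for one status."""
    cases = sum(group[status][0] for group in data.values())
    deaths = sum(group[status][1] for group in data.values())
    return death_rate(cases, deaths)


# Within each age group, compare the two death rates.
for age, group in data.items():
    vax = death_rate(*group["vaccinated"])
    unvax = death_rate(*group["unvaccinated"])
    print(f"{age}: vaccinated {vax:.2%}, unvaccinated {unvax:.2%}")

# Pooled over both groups, the comparison flips.
print(f"overall: vaccinated {overall_rate('vaccinated'):.2%}, "
      f"unvaccinated {overall_rate('unvaccinated'):.2%}")
```

In this sketch the unvaccinated rate is higher within each age group, yet the pooled vaccinated rate is higher, because most vaccinated cases fall in the group with the higher baseline risk.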
Lesson 5: Nightingale's Rose
In Lesson 5 we reproduce a famous visualization created by the nurse and statistician Florence Nightingale. The graphic shows deaths at a military hospital in Crimea, broken down by month and by cause of death. The visualization is further divided into two timeframes: before improved sanitation measures were implemented and after. The point of the graphic was to illustrate that preventable diseases did the following:
Nightingale used a highly creative and compelling graphical form called a polar bar chart to display this information. In the lesson you see an image of her original chart and then we recreate it using our visualization tools.
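For readers curious about what a polar bar chart looks like in code, here is a minimal matplotlib sketch. The monthly counts are placeholders, not Nightingale's figures, and the square-root scaling reflects the assumption that wedge area (rather than radius) should be proportional to the count. This is not necessarily how the lesson builds its chart.

```python
import math

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is required
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
# Placeholder counts for illustration only; NOT historical data.
deaths = [90, 70, 50, 30, 20, 15, 25, 40, 60, 80, 100, 120]

n = len(months)
width = 2 * math.pi / n                  # each month gets an equal angle
angles = [i * width for i in range(n)]   # starting angle of each wedge

# Assumption: make wedge AREA proportional to the count, so radius ~ sqrt(count).
radii = [math.sqrt(d) for d in deaths]

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
bars = ax.bar(angles, radii, width=width, align="edge", edgecolor="black")

# Label each wedge at its center and hide the radial tick marks.
ax.set_xticks([a + width / 2 for a in angles])
ax.set_xticklabels(months)
ax.set_yticks([])
ax.set_title("Polar bar chart sketch (placeholder data)")

fig.savefig("polar_bar_chart_sketch.png")
```

Splitting each wedge by cause of death, as in the original graphic, would mean drawing one set of bars per cause on the same axes.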
About Pearson Video Training:
Pearson publishes expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. These professional and personal technology videos feature world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, Pearson IT Certification, Prentice Hall, Sams, and Que. Topics include IT Certification, Network Security, Cisco Technology, Programming, Web Development, Mobile Development, and more. Learn more about Pearson Video Training at http://www.informit.com/video.
Video Lessons are available for download for offline viewing within the streaming format. Look for the green arrow in each lesson.