An accessible introduction to applied data science and machine learning, with minimal math and code required to master the foundational and technical aspects of data science.
In Just Enough Data Science and Machine Learning, authors Mark Levene and Martyn Harris present a comprehensive and accessible introduction to data science. The book helps readers develop an intuition for the methods used in both data science and machine learning, the algorithmic component of data science concerned with discovering patterns in input data. It takes an applied perspective, with emphasis on the algorithmic aspects of data science and on the fundamental statistical concepts needed to understand the subject.
The book begins by exploring the nature of data science and its origins in basic statistics. The authors then guide readers through the essential steps of data science, starting with exploratory data analysis using visualisation tools. They explain the process of forming hypotheses, building statistical models, and utilising algorithmic methods to discover patterns in the data. Finally, the authors discuss general issues and preliminary concepts that are needed to understand machine learning, which is central to the discipline of data science.
Practical examples and real-world data sets reinforce the concepts throughout. All examples are supported by Python code that is kept separate from the reading material, so the book itself stays timeless.
List of Figures
Preface
About the Authors
Chapter 1. What Is Data Science?
Chapter 2. Basic Statistics
2.1 Introductory Statistical Notions
2.2 Expectation
2.3 Variance
2.4 Correlation
2.5 Regression
2.6 Chapter Summary
Chapter 3. Types of Data
3.1 Tabular Data
3.2 Textual Data
3.3 Image, Video, and Audio Data
3.4 Time Series Data
3.5 Geographical Data
3.6 Social Network Data
3.7 Transforming Data
3.8 Chapter Summary
Chapter 4. Machine Learning Tools
4.1 What Is Machine Learning?
4.2 Evaluation
4.3 Supervised Methods
4.4 Unsupervised Methods
4.5 Semi-Supervised Methods
4.6 Chapter Summary
Chapter 5. Data Science Topics
5.1 Searching, Ranking, and Rating
5.2 Social Networks
5.3 Three Natural Language Processing Topics
5.4 Chapter Summary
Chapter 6. Selected Additional Topics
6.1 Neuro-Symbolic AI
6.2 Conversational AI
6.3 Generative Neural Networks
6.4 Trustworthy AI
6.5 Large Language Models
6.6 Epilogue
Chapter 7. Further Reading
7.1 Basic Statistics
7.2 Data Science
7.3 Machine Learning
7.4 Deep Learning
7.5 Research Papers
7.6 Python
Bibliography
Index