Introduction
- Data Sources (including relational databases and an introduction to the Farmer’s Market database)
- The SELECT Statement
- The WHERE Clause
- CASE Statements
- SQL JOINs
- Aggregating Results for Analysis
- Window Functions and Subqueries
- Date and Time Functions
- Exploratory Data Analysis with SQL
- Building SQL Datasets for Analytical Reporting
- More Advanced Query Structures
- Creating Machine Learning Datasets Using SQL
- Analytical Dataset Development Examples
- Storing and Modifying Data
Appendix: Answers to Chapter Exercises
A more detailed Table of Contents, index, and an excerpt of the book are available on the publisher’s website under “Read an Excerpt”.
Who This Book is For
SQL for Data Scientists is designed to be a learning resource for anyone who is, or wants to become, a data analyst or data scientist and wants to pull data from databases to build their own datasets, without having to rely on others in the organization to query the source system and transform the results into flat files (or spreadsheets) for them.
There are plenty of SQL books out there, but many are written either as syntax references or for people in other roles who create, query, and maintain databases. This book, by contrast, is written from the perspective of a data scientist and is aimed at those who will primarily be extracting data from existing databases in order to generate datasets for analysis.
I won’t assume that you’ve ever written SQL queries before, and we’ll start with the basics, but I do assume that you have some basic understanding of what databases are and a general idea of how data might be used in reports, analyses, and machine learning algorithms. This book is meant to fill in the steps between finding a database that contains the data you need and starting the analysis. I aim to teach you how to think about structuring datasets for analysis and how to use SQL to extract the data from the database and get it into that form.