Deep learning is making waves. At the time of this writing (March 2016), Google’s AlphaGo program has just beaten 9-dan professional Lee Sedol at Go, a board game that originated in China.
Experts in the field of Artificial Intelligence thought we were 10 years away from achieving a victory against a top professional Go player, but progress seems to have accelerated!
While deep learning is a complex subject, it is no more difficult to learn than any other machine learning technique. I wrote this book to introduce you to the basics of neural networks. You will get along fine with undergraduate-level math and programming skills.
All the materials in this book can be downloaded and installed for free. We will use the Python programming language, along with the numerical computing library NumPy. In the later chapters, I will also show you how to build a deep network using Theano and TensorFlow, which are libraries built specifically for deep learning and can accelerate computation by taking advantage of the GPU.
Unlike other machine learning algorithms, deep learning is particularly powerful because it automatically learns features. That means you don’t need to spend your time trying to come up with and test “kernels” or “interaction effects”, something only statisticians love to do. Instead, we will let the neural network learn these things for us. Each layer of the neural network learns a more abstract representation than the layer before it. For example, in image classification, the first layer might learn different strokes, the next layer might put the strokes together into shapes, the layer after that might put the shapes together into facial features, and the final layer might hold a high-level representation of whole faces.
Do you want a gentle introduction to this “dark art”, with practical code examples that you can try right away and apply to your own data? Then this book is for you.
What do I mean by “fundamentals”?
When students first hear about deep learning, they are often introduced to the field via some hyped-up news article about convolutional neural networks or LSTMs. While these are fine eventual goals, they are not the place to start when you’re first learning about deep learning.
All of deep learning depends on one fundamental algorithm, the “secret sauce”, if you will. That is what you will learn in this book. You will learn how we get there from basic undergraduate math. You will learn how it can be modified for speed improvements. You will learn how to code it in NumPy, Theano, and TensorFlow.
But the most fundamental and important thing is understanding what “it” is and how “it” works.
What happens when you skip over these important fundamentals?
If you’re reading this book, you probably have some experience with software and programming in a team. More often than not, there is someone on the team who:
* Talks about machine learning endlessly, but is barely able to use scikit-learn.
* Can regurgitate that convolutional neural networks “do convolution” so that “they can find features in different places in an image”, but can’t actually make one work, much less write one.
* Can regurgitate that LSTMs can “remember long-term dependencies!” and “circumvent the vanishing gradient problem!” but has no idea what formulas actually govern an LSTM or how to write one other than using the Keras built-ins.
* Can possibly plug-and-play into some pre-written deep learning code, so that it at least runs without errors, but has no idea how to make it work for the problem at hand.
If you are on a software team, and you don’t know who “that guy” is, YOU could be “that guy”! My goal in this book is to make sure you are not “that guy”.
I want you to know how deep learning works on a mathematical and algorithmic level.
A true computer scientist can take an algorithm, transform it into pseudocode, and transform that into real, working code.
At the very highest level, all we are doing is “minimizing cost”. Even business people can understand this very intuitive idea. All businesses try to minimize their costs and maximize their profits.
In this book, I will show you how to start from an intuitive objective like “minimize cost” and how that objective eventually leads to deep learning. It is nothing more than a little bit of math and Python programming.
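To make “minimizing cost” a little more concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions: the toy data, the names `xs`, `ys`, `cost`, `gradient`, and `learning_rate`, and the choice of gradient descent on a mean squared error are all made up for this example, and a real neural network has many more parameters than the single weight `w` used here.

```python
# Toy problem (hypothetical data): find w so that w * x matches y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with w = 2

def cost(w):
    """Mean squared error between the predictions w * x and the targets y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    """Derivative of the cost with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0               # initial guess
learning_rate = 0.05  # step size (an assumed value that works for this data)
costs = []            # cost before each update, to check it goes down

for step in range(100):
    costs.append(cost(w))
    # Move w a small step in the direction that lowers the cost.
    w -= learning_rate * gradient(w)

print("learned w:", w)
print("final cost:", cost(w))
```

The loop repeatedly nudges `w` downhill on the cost, so the learned value ends up very close to 2, the value used to generate the data. Swapping in a different cost function only requires changing `cost` and `gradient`; the update loop stays the same.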