Guiding principles
A model is understood as a sequence or a graph of standalone, fully configurable modules that can be plugged together with as few restrictions as possible. In particular, neural layers, cost functions, optimizers, initialization schemes, activation functions, and regularization schemes are all standalone modules that you can combine to create new models.
Each module should be kept short and simple. Every piece of code should be transparent upon first reading. No black magic: it hurts iteration speed and ability to innovate.
New modules are dead simple to add (as new classes and functions), and existing modules provide ample examples. Being able to easily create new modules allows for total expressiveness, making Keras suitable for advanced research.
No separate model configuration files in a declarative format. Models are described in Python code, which is compact, easier to debug, and easy to extend.
Keras is a high-level neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research. Use Keras if you need a deep learning library that:
Keras uses the following dependencies:
Keras is a high-level library that provides a convenient machine learning API on top of lower-level libraries for tensor processing and manipulation, called backends. At this time, Keras can be used on top of any of the three available backends: TensorFlow, Theano, and CNTK.
Theano is installed automatically if you install Keras using pip. If you want to install Theano manually, please refer to the Theano installation instructions.
TensorFlow is a recommended option, and by default Keras uses the TensorFlow backend if it is available. The easiest way to install TensorFlow is:
$ pip install tensorflow
If you want to install it manually, please refer to the TensorFlow installation instructions.
To install Keras, cd to the Keras folder and run the install command:
$ python setup.py install
You can also install Keras from PyPI:
$ pip install keras
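To check whether the packages can be found after installation, a small standard-library sketch like the one below may help. The helper name `is_installed` is hypothetical and not part of Keras; it only asks Python's import machinery whether a top-level package can be located, without importing it.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located by the import system."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(package_name) is not None


# Report on Keras and the two backends discussed in the installation notes.
for name in ("keras", "tensorflow", "theano"):
    status = "found" if is_installed(name) else "not found"
    print("{}: {}".format(name, status))
```

The check only confirms that a package is discoverable; it does not verify that the backend is configured correctly.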
If you have run Keras at least once, you will find the Keras configuration file at:
~/.keras/keras.json
If it isn't there, you can create it. The default configuration file looks like this:
{
    "image_dim_ordering": "tf",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
By default, Keras will use TensorFlow as its tensor manipulation library. If you want to use another backend, simply change the backend field to either "theano" or "tensorflow", and Keras will use the new configuration the next time you run any Keras code.
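Editing the configuration file by hand works fine, but the same change can also be scripted. The following is a minimal sketch using only the standard library; `set_backend` and `KNOWN_BACKENDS` are hypothetical names introduced for illustration, not part of Keras, and the demo writes to a temporary file rather than to the real configuration.

```python
import json
import os
import tempfile

# Backend values listed in this section; treating this set as complete is an assumption.
KNOWN_BACKENDS = ("tensorflow", "theano")


def set_backend(config_path, backend):
    """Set the "backend" field of a Keras-style JSON config and return the new config."""
    if backend not in KNOWN_BACKENDS:
        raise ValueError("unknown backend: {!r}".format(backend))
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    config["backend"] = backend  # all other fields are left untouched
    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return config


# Demonstrate on a throwaway copy of the default configuration.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "keras.json")
with open(demo_path, "w") as f:
    json.dump({"image_dim_ordering": "tf", "epsilon": 1e-07,
               "floatx": "float32", "backend": "tensorflow"}, f)

updated = set_backend(demo_path, "theano")
print(updated["backend"])  # theano
```

To apply the same change to the real file, the path would be `os.path.expanduser("~/.keras/keras.json")`.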
The core data structure of Keras is a model, a way to organize layers. The main type of model is the Sequential model, a linear stack of layers. For more complex architectures, you should use the Keras functional API.
Here's the Sequential model:
from keras.models import Sequential
model = Sequential()
Stacking layers is as easy as .add():
from keras.layers import Dense, Activation
model.add(Dense(output_dim=64, input_dim=100))
model.add(Activation("relu"))
model.add(Dense(output_dim=10))
model.add(Activation("softmax"))
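Note that only the first layer needs input_dim; each later layer takes its input size from the layer before it. As a rough sanity check on the stack above, the number of trainable parameters can be worked out by hand. The sketch below assumes each Dense layer holds a weight matrix plus one bias per output unit; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(input_dim, output_dim):
    """Weights (input_dim x output_dim) plus one bias per output unit."""
    return input_dim * output_dim + output_dim


# Layer sizes taken from the stack above: 100 -> 64 -> 10.
# Activation layers add no parameters.
layer_dims = [(100, 64), (64, 10)]
counts = [dense_param_count(i, o) for i, o in layer_dims]
total = sum(counts)
print(counts, total)  # [6464, 650] 7114
```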
Once your model looks good, configure its learning process with .compile():
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
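To make the 'categorical_crossentropy' loss and the 'accuracy' metric concrete, here is a plain-Python sketch of what they compute for one-hot targets. It illustrates the standard definitions and is not the library's implementation; the clipping constant reuses the "epsilon" value from the default configuration, on the assumption that this is what the field is for.

```python
import math

EPSILON = 1e-07  # same value as the "epsilon" field in the default config


def categorical_crossentropy(y_true, y_pred, eps=EPSILON):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(p)
    return total


def accuracy(y_true_rows, y_pred_rows):
    """Fraction of rows where the predicted argmax matches the target argmax."""
    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true_rows, y_pred_rows))
    return hits / float(len(y_true_rows))


targets = [[0, 1, 0], [1, 0, 0]]
predictions = [[0.1, 0.8, 0.1], [0.3, 0.6, 0.1]]

loss_first = categorical_crossentropy(targets[0], predictions[0])  # equals -log(0.8)
acc = accuracy(targets, predictions)  # first row correct, second row wrong
print(loss_first, acc)
```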
If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).
from keras.optimizers import SGD
model.compile(loss='categorical_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9, nesterov=True))
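For readers unfamiliar with the lr, momentum, and nesterov arguments, the sketch below shows one common formulation of the SGD update for a single scalar parameter. It is an illustration under that assumption, not a claim about the optimizer's exact internals; `sgd_step` is a hypothetical name.

```python
def sgd_step(param, grad, velocity, lr=0.01, momentum=0.9, nesterov=True):
    """One SGD update for a scalar parameter; returns (new_param, new_velocity)."""
    # The velocity accumulates a decaying sum of past gradient steps.
    new_velocity = momentum * velocity - lr * grad
    if nesterov:
        # Look-ahead form: add the momentum-scaled velocity on top of the plain gradient step.
        new_param = param + momentum * new_velocity - lr * grad
    else:
        new_param = param + new_velocity
    return new_param, new_velocity


# One step from param=1.0 with gradient 0.5 and no accumulated velocity.
p_nesterov, v_nesterov = sgd_step(param=1.0, grad=0.5, velocity=0.0)
p_plain, v_plain = sgd_step(param=1.0, grad=0.5, velocity=0.0, nesterov=False)
print(p_nesterov, p_plain)
```

With zero initial velocity, the Nesterov variant moves slightly further than plain momentum on the first step because it adds the look-ahead term.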
You can now iterate on your training data in batches:
model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)
Alternatively, you can feed batches to your model manually:
model.train_on_batch(X_batch, Y_batch)
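When feeding batches manually, the data has to be split into batches first. The sketch below shows one simple way to do that in order, without shuffling; `batch_slices` is a hypothetical helper, and the commented loop assumes a compiled model and sliceable arrays such as X_train and Y_train.

```python
def batch_slices(n_samples, batch_size):
    """Yield (start, end) index pairs covering n_samples in order."""
    for start in range(0, n_samples, batch_size):
        # The last batch may be smaller than batch_size.
        yield start, min(start + batch_size, n_samples)


slices = list(batch_slices(100, 32))
print(slices)  # [(0, 32), (32, 64), (64, 96), (96, 100)]

# With a compiled model and array-like data, a manual loop could look like:
# for start, end in batch_slices(len(X_train), 32):
#     model.train_on_batch(X_train[start:end], Y_train[start:end])
```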
Evaluate your performance in one line:
loss_and_metrics = model.evaluate(X_test, Y_test, batch_size=32)
Or generate predictions on new data:
classes = model.predict_classes(X_test, batch_size=32)
proba = model.predict_proba(X_test, batch_size=32)
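For a model ending in a softmax layer, the two outputs are closely related: the predicted class is assumed here to be the index of the largest probability in each row. The sketch below illustrates that relationship on made-up probabilities; `proba_to_classes` is a hypothetical helper and the values are for illustration only.

```python
def proba_to_classes(proba_rows):
    """Return the index of the largest probability in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in proba_rows]


# Two samples, three classes; each row sums to 1 like a softmax output.
example_proba = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
example_classes = proba_to_classes(example_proba)
print(example_classes)  # [1, 0]
```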
Building a question answering system, an image classification model, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?
You will find more advanced models in the examples folder: question answering with memory networks, text generation with stacked LSTMs, etc.