Eclipse Deeplearning4j

The goal of Eclipse Deeplearning4j is to provide a core set of components for building applications that incorporate AI. AI products within an enterprise often have a wider scope than just machine learning, so the distribution aims to provide smart defaults for building deep learning applications.

We define a machine learning product lifecycle as:

  • Securely connecting to enterprise environments via Kerberos™ and other authentication protocols, with the purpose of:

    • Connecting to disparate data sources

    • Cleaning data

    • Using that data to build vectors that a neural network is capable of understanding

    • Building and tuning a neural network

    • Deploying to production via REST, Spark, or embedded environments such as Android™ phones or Raspberry Pi devices
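
As a rough illustration of the "cleaning data" and "building vectors" steps above, the following sketch turns one CSV record into a numeric vector with a one-hot encoded label. It uses only the Java standard library and does not use the DataVec API; the class name, the record layout (two numeric features followed by a label), and the label vocabulary are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class VectorizeSketch {
    // Hypothetical label vocabulary, chosen only for this illustration.
    static final List<String> LABELS = List.of("setosa", "versicolor", "virginica");

    /**
     * Converts a record of the assumed form "feature1,feature2,label" into
     * a vector: the two numeric features followed by a one-hot label.
     * Returns null for malformed records so the caller can drop them,
     * which stands in for the data cleaning step.
     */
    static double[] vectorize(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 3) {
            return null; // wrong number of fields
        }
        int labelIndex = LABELS.indexOf(parts[2].trim().toLowerCase(Locale.ROOT));
        if (labelIndex < 0) {
            return null; // label not in the vocabulary
        }
        double[] vector = new double[2 + LABELS.size()];
        try {
            vector[0] = Double.parseDouble(parts[0].trim());
            vector[1] = Double.parseDouble(parts[1].trim());
        } catch (NumberFormatException e) {
            return null; // a feature is missing or not numeric
        }
        vector[2 + labelIndex] = 1.0; // one-hot encode the label
        return vector;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(vectorize("5.1, 3.5, Setosa")));
    }
}
```

A real pipeline would typically read records from a data source and batch the resulting vectors into n-dimensional arrays before training; this sketch covers only the per-record transformation.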

 

Deeplearning4j makes it possible to build an application without relying on third-party providers for ETL libraries, tensor libraries, and similar components. Convention over configuration is key for scaling large software projects that will be maintained for long periods.

Most current deep learning projects do not prioritize backward compatibility with large enterprise applications, nor do they make it easier to build applications. Instead, they optimize for flexibility and loose coupling (which is great for research). Deeplearning4j is the bridge between research in the lab and applications in the real world.

The Deeplearning4j software distribution contains the following components:

  • Deeplearning4j: Neural network DSL (facilitates building neural networks integrated with data pipelines and Spark)

  • ND4J: N-dimensional arrays for Java. A tensor library, described as "Eclipse January with C code and wider scope". The goal is to provide tensor operations and optimized support for various hardware platforms.

  • DataVec: An ETL (extract, transform, load) library that vectorizes and "tensorizes" data. It supports connecting to various data sources and outputs n-dimensional arrays via a series of data transformations.

  • libnd4j: Pure C++ library for tensor operations, which works closely with the open-source library JavaCPP (JavaCPP was created and is maintained by a Skymind engineer, but it is not part of this project).

  • RL4J: Reinforcement learning on the JVM, integrated with Deeplearning4j. Includes deep Q-learning (DQN) and A3C (asynchronous advantage actor-critic).

  • Jumpy: A Python interface to the ND4J library, integrating with NumPy

  • Arbiter: Automatic tuning of neural networks via hyperparameter optimization, using grid search, random search, and Bayesian methods.
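
To make the idea of hyperparameter search concrete, here is a minimal grid search sketch in plain Java. It does not use the Arbiter API. The class and method names are hypothetical, the two hyperparameters (learning rate and dropout) are chosen only as examples, and the scoring function is a toy stand-in for training a network and measuring its validation loss.

```java
import java.util.function.DoubleBinaryOperator;

public class GridSearchSketch {
    /**
     * Evaluates every (learningRate, dropout) combination and returns
     * {bestLearningRate, bestDropout, bestScore}, where lower scores are better.
     * In a real setting, the score function would train and evaluate a model.
     */
    static double[] gridSearch(double[] learningRates, double[] dropouts,
                               DoubleBinaryOperator score) {
        double[] best = {Double.NaN, Double.NaN, Double.POSITIVE_INFINITY};
        for (double learningRate : learningRates) {
            for (double dropout : dropouts) {
                double s = score.applyAsDouble(learningRate, dropout);
                if (s < best[2]) {
                    best = new double[] {learningRate, dropout, s};
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy loss (an assumption for this sketch): smallest at lr = 0.01, dropout = 0.5.
        DoubleBinaryOperator toyLoss =
                (lr, d) -> Math.abs(lr - 0.01) + Math.abs(d - 0.5);
        double[] best = gridSearch(
                new double[] {0.1, 0.01, 0.001},
                new double[] {0.2, 0.5},
                toyLoss);
        System.out.printf("learningRate=%s dropout=%s score=%s%n",
                best[0], best[1], best[2]);
    }
}
```

Grid search evaluates every combination, so its cost grows multiplicatively with each added hyperparameter. Random search samples combinations instead, and Bayesian methods use earlier results to choose the next candidate.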