Hello H2O community,
there are many new changes in the H2O ecosystem, and we are working furiously to publish and share them with the community.
In this context, we are preparing a new H2O release, 3.12, with amazing features (e.g., AutoML, XGBoost support), and we are planning some changes that can affect existing code bases. This email is meant to inform you about these changes and to start a discussion about them.
The changes include:
- migrating from Java 6 to Java 7
- modularization of the code base with the help of the Java Service Provider Interface (SPI) instead of the reflections library

Reasons for migrating from Java 6 to Java 7:
- Public support for Java 6 ended in February 2013
- Lack of Java 6 compatible libraries (e.g., Jetty)
- Security concerns about using old libraries to keep compatibility with Java 6

Planned changes for the Java 7 migration:
- We will remove Java 6 support from the H2O build chain, including:
  - removal of artifact bytecode rewriting from Java 7 to Java 6
  - upgrading the Animal Sniffer signature to Java 7
- We are going to publish only Java 7 compatible binary artifacts to Maven Central.
- We are going to use only Java 7 compatible syntax in our source code base. The only exception is the `h2o-genmodel` module, which we will try to keep close to Java 6 syntax.
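
To make the syntax point concrete, here is a small illustrative sketch of language constructs introduced in Java 7: the diamond operator, try-with-resources, and strings in `switch`. It is not taken from the H2O code base, and the class and method names are made up for this example. None of these constructs compile at source level 1.6, so a module that uses them can no longer be built for Java 6.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not part of H2O.
class Java7SyntaxSketch {

  // Collects the non-empty lines of a text.
  static List<String> nonEmptyLines(String text) {
    List<String> out = new ArrayList<>();                 // diamond operator (Java 7)
    try (BufferedReader reader =
             new BufferedReader(new StringReader(text))) { // try-with-resources (Java 7)
      String line;
      while ((line = reader.readLine()) != null) {
        if (!line.isEmpty()) {
          out.add(line);
        }
      }
    } catch (IOException e) {
      // Reading from an in-memory string should not fail; rethrow unchecked.
      throw new IllegalStateException(e);
    }
    return out;
  }

  // Maps a Java specification version string to its major version number.
  static int majorVersion(String specVersion) {
    switch (specVersion) {                                 // strings in switch (Java 7)
      case "1.6": return 6;
      case "1.7": return 7;
      default: throw new IllegalArgumentException("unknown version: " + specVersion);
    }
  }

  public static void main(String[] args) {
    System.out.println(nonEmptyLines("a\n\nb\n"));
    System.out.println(majorVersion("1.7"));
  }
}
```

Code that has to stay close to Java 6 syntax would instead spell out the generic type arguments, close the reader in a `finally` block, and replace the string `switch` with `if`/`else` comparisons.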
If your stack is running on top of a Java 6 JVM (e.g., an old Hadoop distribution or proprietary tools), H2O will stop working. Please let us know!
- The change is implemented in PR-835
- The JIRA epic number is PUBDEV-4049

Motivation for the modularization:
- We would like to provide a more flexible system for extending H2O and plugging new tools into the H2O platform (e.g., XGBoost, TensorFlow, Sparkling Water).
- The current code base uses the reflections library to look up optional components, which brings several issues, including:
  - a limit on the package names an extension can use (only `water` and `hex` are allowed)
  - a forced traversal of the full classpath, which causes problems in systems with dynamic classloaders (e.g., Spark executors)

Planned changes for the modularization:
- We will remove the usage of the reflections library to find instances of `water.AbstractH2OExtension`, `water.api.AbstractRegister`, and `water.api.Schema`.
- The extensions (meaning the classes listed in the previous point) will be registered using the Java Service Provider Interface. In short, the concept relies on service files located in the `META-INF/services` directory. Each service file is named after the class it extends (e.g., `water.AbstractH2OExtension`) and contains a list of classes that extend the service class. For example, for the core H2O REST API we have a single file, `h2o-core/src/main/resources/META-INF/services/water.api.RestApiExtension`, which contains 3 REST API extensions implementing the interface `water.api.RestApiExtension`:
water.api.RegisterResourceRoots
water.api.RegisterV3Api
water.api.RegisterV4Api
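
For readers unfamiliar with the mechanism, the following self-contained sketch shows how a service file drives discovery through `java.util.ServiceLoader`. It is an illustration under stated assumptions, not H2O's actual loading code: `DemoExtension`, `RegisterV3`, and `RegisterV4` are hypothetical stand-ins for `water.api.RestApiExtension` and its implementations, and the service file is written to a temporary directory at runtime only so that the example runs without any build setup.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical example class; not part of H2O.
class SpiLookupSketch {

  /** Stand-in for an extension point such as water.api.RestApiExtension. */
  public interface DemoExtension {
    String name();
  }

  /** Providers must be public and have a public no-argument constructor. */
  public static class RegisterV3 implements DemoExtension {
    public String name() { return "V3"; }
  }

  public static class RegisterV4 implements DemoExtension {
    public String name() { return "V4"; }
  }

  /** Writes a service file, then lets ServiceLoader discover the providers it lists. */
  static List<String> discover() {
    try {
      // The service file is named after the service type and lists one provider per line.
      Path root = Files.createTempDirectory("spi-demo");
      Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
      String providers = RegisterV3.class.getName() + "\n" + RegisterV4.class.getName() + "\n";
      Files.write(services.resolve(DemoExtension.class.getName()),
                  providers.getBytes(StandardCharsets.UTF_8));

      // Put the temporary directory on the classpath of a child class loader.
      URL[] classpath = { root.toUri().toURL() };
      try (URLClassLoader loader =
               new URLClassLoader(classpath, SpiLookupSketch.class.getClassLoader())) {
        List<String> names = new ArrayList<>();
        // No classpath scanning: only the classes named in the service file are loaded.
        for (DemoExtension extension : ServiceLoader.load(DemoExtension.class, loader)) {
          names.add(extension.name());
        }
        return names;
      }
    } catch (IOException e) {
      throw new IllegalStateException("could not set up the demo service file", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(discover()); // prints [V3, V4]
  }
}
```

In a real module the service file would simply be checked in under `src/main/resources/META-INF/services/`, as in the `h2o-core` example above, instead of being generated at runtime.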
- We provide a capabilities REST endpoint that lists the registered core extensions, REST API extensions, and parsers (WIP).
- Within the H2O source code, we provide an optional `@AutoService` annotation to register extensions (see documentation).
- We do not modularize the R/Python/Flow clients. Each client is responsible for configuring itself based on information provided by the backend, and it fails gracefully if the user invokes an operation that the backend does not provide.
Note: the same concept is already used in H2O to register parsers and Rapids extensions.
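
The client-side behaviour described above (self-configuration and graceful failure) could look roughly like the following sketch. Everything in it is hypothetical: the class name, the extension names, and the way the list of capabilities is obtained (here it is just passed in as a set, standing in for the response of the capabilities endpoint). It only illustrates the idea of checking what the backend reports before invoking an operation, and failing with a clear message otherwise.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical client-side guard; not an actual H2O client class.
class CapabilityGuardSketch {

  private final Set<String> available;

  // 'reportedByBackend' stands in for the parsed response of the capabilities endpoint.
  CapabilityGuardSketch(Set<String> reportedByBackend) {
    this.available = Collections.unmodifiableSet(new HashSet<>(reportedByBackend));
  }

  boolean supports(String extension) {
    return available.contains(extension);
  }

  // Refuses early, with a readable message, if the backend lacks the extension.
  String train(String algorithm) {
    if (!supports(algorithm)) {
      throw new UnsupportedOperationException(
          "Backend does not provide '" + algorithm + "'; available: " + new TreeSet<>(available));
    }
    return "training with " + algorithm;
  }

  public static void main(String[] args) {
    // Illustrative extension names only.
    CapabilityGuardSketch client =
        new CapabilityGuardSketch(new HashSet<>(Arrays.asList("GBM", "XGBoost")));
    System.out.println(client.train("XGBoost"));
    try {
      client.train("TensorFlow");
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of the design is that the check happens in the client before any request is sent, so the user sees which extensions are available instead of an opaque server error.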

Impact on existing code:
- Code that registers new REST API calls by extending the `water.api.AbstractRegister` class will need to be updated by adding a service file as described above.
- Each class extending `water.api.Schema` needs to be registered as well, in the `water.api.Schema` service file.
- The change is implemented in PR-915
- The JIRA epic number is PUBDEV-4271