Java News from Thursday, December 11, 2003

Oracle has posted the second public review draft of Java Specification Request (JSR) 73, the Data Mining API. According to the draft spec:

The Java Data Mining (JDM) specification addresses the need for a pure Java API to facilitate development of data mining-enabled applications. JDM supports common data mining operations, as well as the creation, persistence, access, and maintenance of metadata supporting mining activities.

Currently, no existing Java platform specification provides a standard API for data mining systems. Existing APIs are vendor-proprietary. By using JDM, implementers of data mining applications can expose a single, standard API that will be understood by a wide variety of developers writing client applications and components running on the Java™ 2 Platform. Similarly, data mining clients can be coded against a single API that is independent of the underlying data mining system. JDM is targeted for the Java™ 2 Platform, Enterprise Edition (J2EE™) and Standard Edition (J2SE™).

In JDM, data mining [Mitchell1997, BL1997] includes the functional areas of classification, regression, attribute importance, clustering, and association. These are supported by such supervised and unsupervised learning algorithms as decision trees, neural networks, Naive Bayes, Support Vector Machine, K-Means, and Apriori, on structured data. Common operations include model build, test, and apply (score). A particular implementation of this specification may not necessarily support all interfaces and services defined by JDM. However, JDM provides a mechanism for client discovery of supported interfaces and capabilities.
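As a rough illustration of that discovery mechanism (not taken from the draft itself), a client might ask a provider whether it supports a given mining function before relying on it. The `MiningFunction` enum, the `MiningProvider` interface, and the `PartialProvider` class below are hypothetical names invented for this sketch; they are not the JSR 73 API.

```java
import java.util.EnumSet;
import java.util.Set;

public class CapabilityDiscoverySketch {

    /** The five functional areas listed in the draft; this enum itself is hypothetical. */
    public enum MiningFunction {
        CLASSIFICATION, REGRESSION, ATTRIBUTE_IMPORTANCE, CLUSTERING, ASSOCIATION
    }

    /** Hypothetical stand-in for a vendor's entry point; not a JSR 73 interface. */
    public interface MiningProvider {
        boolean supports(MiningFunction function);
    }

    /** Toy provider that implements only some of the functional areas. */
    public static final class PartialProvider implements MiningProvider {
        private final Set<MiningFunction> supported = EnumSet.noneOf(MiningFunction.class);

        public PartialProvider(Set<MiningFunction> functions) {
            supported.addAll(functions);
        }

        @Override
        public boolean supports(MiningFunction function) {
            return supported.contains(function);
        }
    }

    /** Client-side check: returns the first requested function the provider supports, or null. */
    public static MiningFunction firstSupported(MiningProvider provider, MiningFunction... wanted) {
        for (MiningFunction f : wanted) {
            if (provider.supports(f)) {
                return f;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MiningProvider provider = new PartialProvider(
                EnumSet.of(MiningFunction.CLUSTERING, MiningFunction.ASSOCIATION));
        // A client written against the neutral interface probes before it builds a model.
        for (MiningFunction f : MiningFunction.values()) {
            System.out.println(f + " supported: " + provider.supports(f));
        }
    }
}
```

The point of the design is that client code branches on what the provider reports rather than on which vendor it happens to be talking to.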

JDM is based on a generalized, object-oriented, data mining conceptual model leveraging emerging data mining standards such as the Object Management Group’s Common Warehouse Metadata (CWM), ISO’s SQL/MM for Data Mining, and the Data Mining Group’s Predictive Model Markup Language (PMML), as appropriate. Implementation details of JDM are delegated to each vendor. A vendor may decide to implement JDM as a native API of its data mining product. Others may opt to develop a driver/adapter that mediates between a core JDM layer and multiple vendor products. The JDM specification does not prescribe a particular implementation strategy, nor does it prescribe performance or accuracy of a given capability or algorithm.
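The driver/adapter option can be pictured with a small sketch, again not drawn from the draft. Here a vendor-neutral interface is implemented by a thin adapter that forwards to a vendor's own class. All names (`ClusteringModel`, `VendorKMeans`, `VendorKMeansAdapter`) are hypothetical, and the nearest-centroid lookup merely stands in for whatever a real product would do when applying a K-Means model.

```java
public class VendorAdapterSketch {

    /** Hypothetical vendor-neutral interface that client code is written against. */
    public interface ClusteringModel {
        int apply(double[] row);
    }

    /** Hypothetical vendor product with its own, non-standard method names. */
    public static final class VendorKMeans {
        private final double[][] centroids;

        public VendorKMeans(double[][] centroids) {
            this.centroids = centroids;
        }

        /** Returns the index of the centroid closest to the point (squared Euclidean distance). */
        public int nearestCentroidIndex(double[] point) {
            int best = -1;
            double bestDistance = Double.POSITIVE_INFINITY;
            for (int i = 0; i < centroids.length; i++) {
                double distance = 0.0;
                for (int j = 0; j < point.length; j++) {
                    double diff = point[j] - centroids[i][j];
                    distance += diff * diff;
                }
                if (distance < bestDistance) {
                    bestDistance = distance;
                    best = i;
                }
            }
            return best;
        }
    }

    /** Adapter: exposes the vendor class through the neutral interface. */
    public static final class VendorKMeansAdapter implements ClusteringModel {
        private final VendorKMeans delegate;

        public VendorKMeansAdapter(VendorKMeans delegate) {
            this.delegate = delegate;
        }

        @Override
        public int apply(double[] row) {
            return delegate.nearestCentroidIndex(row);
        }
    }

    public static void main(String[] args) {
        ClusteringModel model = new VendorKMeansAdapter(
                new VendorKMeans(new double[][] {{0.0, 0.0}, {10.0, 10.0}}));
        System.out.println("cluster for (1,1): " + model.apply(new double[] {1.0, 1.0}));
        System.out.println("cluster for (9,8): " + model.apply(new double[] {9.0, 8.0}));
    }
}
```

Swapping in a different vendor would mean writing another adapter; the client code holding a `ClusteringModel` would not change.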

To ensure J2EE™ compatibility and eliminate duplication of effort, JDM leverages existing specifications. In particular, JDM leverages the Java Connection Architecture [JSR16] to provide communication and resource management between applications and the services that implement the JDM API. JDM also reflects aspects of the Java Metadata Interface [JSR40] for the general interface specification.

Comments are due by March 8, 2004.

Version 1.3.5 of BlueJ, a free integrated development environment (IDE) for Java aimed at education, has been released. The major new feature in 1.3.5 is automatic reopening of the most recently used projects on startup. Numerous minor improvements and bug fixes have also been made.