This proposal has been approved and the Eclipse Daanse - Data and Analysis Services project has been created.
Visit the project page for the latest information and development.

Eclipse Daanse - Data and Analysis Services

Thursday, November 16, 2023 - 05:47 by Stefan Bischof
This proposal is in the Project Proposal Phase (as defined in the Eclipse Development Process) and is written to declare its intent and scope. We solicit additional participation and input from the community. Please log in and add your feedback in the comments section.
Parent Project
Proposal State
Created
Background

The "Smart City Project Jena" is an innovative project that aims to improve the quality of life and efficiency in the city of Jena through the use of state-of-the-art technologies. Its core is a Smart City system that optimises various aspects of urban life, such as energy efficiency, transport, waste management and public services.

A key software component of this project is the open source "Mondrian" framework, which was originally developed as an OLAP engine to enable interactive data analysis for business intelligence applications. The original intention was to work with the main developer of the Mondrian framework, but for various reasons this was not an option. Smart City Project Jena therefore decided to pursue independent development by creating a fork of the Mondrian code (https://github.com/pentaho/mondrian) and adapting it to the specific needs and requirements of the project.

As there was no way to continue working with the original developer, the only option for continued development of the modified code may be a contract agreement with a company such as Hitachi to obtain the resources and expertise needed to move the project forward. The Smart City Project Jena thus illustrates how innovative projects sometimes build on existing open source technologies, adapting and developing them to meet specific requirements and objectives. The decision to work with an external company allows the project to get the support it needs and achieve its goals more efficiently.

Scope

Eclipse Daanse (Data Analysis Services) allows users to document, publish, analyze and visualize large amounts of data and to extract valuable insights from it. It provides server-side services such as OLAP (Online Analytical Processing), APIs such as XMLA (XML for Analysis) and client-side services such as dashboards.

Description

Eclipse Data Analysis Services (Daanse) is open source software designed to analyze large amounts of data and extract valuable insights from it. With the steady rise of digital data production in business, science and technology, the importance of data analysis has grown tremendously. Data Analysis Services offers a wide range of tools, technologies and expertise to interpret data in meaningful ways and to help decision makers formulate strategies and identify patterns and trends. Data Analysis Services has several important features:

- Data cleansing and integration: One of the crucial steps in data analysis is the preparation of the data. Data are often flawed or incomplete. Data Analysis Services helps to clean, transform and bring data into a unified form to enable sound analysis.

- Data visualisation: Data can be complex, and numbers alone are sometimes not enough to convey information. Data Analysis Services uses visualisation tools to present data in the form of charts, graphs and interactive dashboards. This simplifies the communication of results and insights.

- Statistical analysis: Data Analysis Services uses statistical methods and models to analyse data and identify patterns, correlations or significance. This includes descriptive statistics, inferential statistics, regression analysis, time series analysis and more.

- Business intelligence: Data Analysis Services helps management to extract business-critical insights that improve business processes, optimise resources and identify new business opportunities.

- Big Data analysis: In a world characterised by huge amounts of data, Big Data analysis is an integral part of Data Analysis Services. Processing and analysing large and complex data sets requires specialised tools and technologies.

Eclipse Daanse combines the following technologies to cover different application scopes:

XMLA (XML for Analysis)

XMLA provides a detailed specification for accessing analytical data sources. This specification describes the structure of XMLA messages and the supported operations for accessing OLAP data sources. The Test and Compatibility Kit (TCK) can be used to check compliance of different implementations with the XMLA specification and to ensure interoperability between systems. Eclipse Daanse offers a Java API based on Jakarta XML Binding and flexibly customisable SOAP messages.
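As a rough illustration of what such an XMLA message looks like, the following sketch assembles a SOAP 1.1 envelope for an XMLA Execute request using plain string handling. The class XmlaExecuteSketch, its methods, and the cube, measure and catalog names are hypothetical and chosen for illustration only; they are not the Eclipse Daanse API, which works with Jakarta XML Binding rather than string concatenation.

```java
// Hypothetical helper for illustration; not part of the Eclipse Daanse API.
final class XmlaExecuteSketch {

    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String XMLA_NS = "urn:schemas-microsoft-com:xml-analysis";

    /** Escapes the characters that must not appear raw in XML element text. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    /** Builds a SOAP envelope carrying an XMLA Execute request for one MDX statement. */
    static String executeRequest(String mdx, String catalog) {
        return "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\">\n"
             + "  <soap:Body>\n"
             + "    <Execute xmlns=\"" + XMLA_NS + "\">\n"
             + "      <Command>\n"
             + "        <Statement>" + escapeXml(mdx) + "</Statement>\n"
             + "      </Command>\n"
             + "      <Properties>\n"
             + "        <PropertyList>\n"
             + "          <Catalog>" + escapeXml(catalog) + "</Catalog>\n"
             + "          <Format>Multidimensional</Format>\n"
             + "        </PropertyList>\n"
             + "      </Properties>\n"
             + "    </Execute>\n"
             + "  </soap:Body>\n"
             + "</soap:Envelope>\n";
    }

    public static void main(String[] args) {
        // Assumed cube and measure names, for illustration only.
        String mdx = "SELECT {[Measures].[Energy Use]} ON COLUMNS FROM [SmartCity]";
        System.out.println(executeRequest(mdx, "SmartCity"));
    }
}
```

A real client would send this envelope via HTTP POST to the XMLA endpoint of the server and parse the response. A typed binding layer avoids the manual escaping shown here.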

MDX (Multidimensional Expressions):

MDX is a query language for multidimensional data sources and is closely related to XMLA. The scope and delimitation of MDX are as follows: MDX strings can be converted into an implementation or API model using parsers.
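To make the first step of such a conversion concrete, here is a minimal sketch of a tokenizer that splits an MDX string into bracketed names, words and symbols. The class MdxLexerSketch is hypothetical and is not the Daanse MDX parser. It deliberately ignores parts of the MDX grammar such as string literals, comments and the "]]" escape inside bracketed names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tokenizer for illustration; not the Eclipse Daanse MDX parser.
final class MdxLexerSketch {

    /** Splits an MDX string into bracketed names, words and single-character symbols. */
    static List<String> tokenize(String mdx) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < mdx.length()) {
            char c = mdx.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace only separates tokens
            } else if (c == '[') {
                // Bracketed name such as [Measures]; the "]]" escape is not handled here.
                int end = mdx.indexOf(']', i);
                if (end < 0) {
                    throw new IllegalArgumentException("Unclosed '[' at position " + i);
                }
                tokens.add(mdx.substring(i, end + 1));
                i = end + 1;
            } else if (Character.isLetterOrDigit(c) || c == '_') {
                // Keyword, function name or number
                int start = i;
                while (i < mdx.length()
                        && (Character.isLetterOrDigit(mdx.charAt(i)) || mdx.charAt(i) == '_')) {
                    i++;
                }
                tokens.add(mdx.substring(start, i));
            } else {
                // Any other character becomes a one-character symbol token
                tokens.add(String.valueOf(c));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed cube and measure names, for illustration only.
        System.out.println(tokenize("SELECT {[Measures].[Energy Use]} ON COLUMNS FROM [SmartCity]"));
    }
}
```

A parser would then consume this token list and build the API model, for example a select statement object with axes and a cube reference.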



OLAP

OLAP (Online Analytical Processing) is a powerful technology widely used in data analysis and reporting. It is based on several important components:

- The DataCube Provider creates and provides multidimensional data cubes that offer efficient and flexible data organisation for complex analyses.

- The Access and Security Model regulates access rights to data sources and ensures that only authorised users can access the information they need, while sensitive data remains protected.

- The Calculation Model performs complex calculations and aggregations on the multidimensional data, which enables meaningful reports and analyses that provide deeper insights into the data.

- The Dynamic Function Model allows dynamic functions to be used to respond to changes in data sources or query results, so that users can customise analyses and include real-time data in their reports.

Parts of this implementation are a fork of the Pentaho Mondrian project.
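The kind of aggregation a calculation model performs can be sketched with a roll-up over one dimension. The example below is only an assumption-laden illustration: the class CubeRollupSketch, the Fact type and the dimensions district and year are invented here and do not reflect how Eclipse Daanse or Mondrian represent cubes internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory roll-up for illustration; not the Eclipse Daanse cube model.
final class CubeRollupSketch {

    /** One fact row: two dimension members (district, year) and one measure value. */
    static final class Fact {
        final String district;
        final String year;
        final double energyKwh;

        Fact(String district, String year, double energyKwh) {
            this.district = district;
            this.year = year;
            this.energyKwh = energyKwh;
        }
    }

    /** Sums the measure per district, i.e. rolls the data up over the year dimension. */
    static Map<String, Double> sumByDistrict(List<Fact> facts) {
        Map<String, Double> totals = new TreeMap<>();
        for (Fact fact : facts) {
            totals.merge(fact.district, fact.energyKwh, Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Invented sample data.
        List<Fact> facts = Arrays.asList(
                new Fact("North", "2023", 100.0),
                new Fact("North", "2024", 50.0),
                new Fact("South", "2023", 70.0));
        System.out.println(sumByDistrict(facts));
    }
}
```

An OLAP engine typically pushes such aggregations down to the database and caches the results instead of iterating over fact rows in memory.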

Databases:

Different databases can be integrated via JDBC (Java Database Connectivity), which gives developers a standardised method for connecting to and querying them. The JDBC Database Dialect Abstraction Layer makes it possible to work with different databases without having to worry about their specific syntax differences. OLAP data can thus also be accessed by means of an OLAP database-schema mapping.
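The idea behind a dialect abstraction can be sketched with identifier quoting, one of the syntax differences between databases: MySQL quotes identifiers with backticks by default, while standard SQL uses double quotes. The interface DialectSketch and its members are hypothetical and are not the Daanse dialect API. The table and column names are invented.

```java
// Hypothetical dialect abstraction for illustration; not the Eclipse Daanse dialect API.
interface DialectSketch {

    /** Quotes an identifier (table or column name) in the syntax of one database. */
    String quoteIdentifier(String name);

    /** Standard SQL: double quotes, with embedded double quotes doubled. */
    DialectSketch ANSI = name -> "\"" + name.replace("\"", "\"\"") + "\"";

    /** MySQL default mode: backticks, with embedded backticks doubled. */
    DialectSketch MYSQL = name -> "`" + name.replace("`", "``") + "`";

    /** Generates an aggregation query without knowing which database is targeted. */
    static String sumGroupedBy(DialectSketch dialect, String table, String dimension, String measure) {
        String quotedDimension = dialect.quoteIdentifier(dimension);
        return "SELECT " + quotedDimension + ", SUM(" + dialect.quoteIdentifier(measure) + ")"
             + " FROM " + dialect.quoteIdentifier(table)
             + " GROUP BY " + quotedDimension;
    }
}
```

Calling DialectSketch.sumGroupedBy(DialectSketch.MYSQL, "facts", "district", "energy") yields a statement with backtick-quoted names, while the same call with DialectSketch.ANSI uses double quotes. With a live connection, the quote string can also be read from the JDBC method DatabaseMetaData.getIdentifierQuoteString().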



Clients:

Eclipse Daanse enables access to analytical data sources from various client applications. In addition to integrating your own applications using the client libraries (available in several languages such as Java and TypeScript), Eclipse Daanse includes its own web client for tables, charts and maps as well as dashboards. Further reporting tools can be connected via adapters and templates.



Dashboards:

In addition to XMLA data sources, the Eclipse Daanse client's dashboard engine enables other data sources such as SensorThings API, REST and OGC (WMS, WFS) to be connected and merged with each other. These data sources can feed a variety of visualisations such as charts, tables, maps, infographics, texts and more. Visualisations can be combined into different widgets and designed interactively via input and control options such as fields, buttons, sliders and map events. In addition to classic dashboards, interactive maps, infographics or interfaces for e.g. building management systems can also be realised in this way.

Why Here?

The Eclipse Foundation's governance model is an integral part of the organisational structure, which ensures that the developer community and Foundation members work together in a transparent, collaborative and rules-based environment. The Foundation has clear guidelines for the licensing of code and ensures that the projects it hosts are compatible with open source licences that respect the freedom of developers and users. We see the Eclipse Foundation as the basis for providing users and developers with the free use of data analysis.

Future Work

Integrate use cases from smart cities and business intelligence, e.g.:

- EMF Model based datacubes

- additional DataProviders that are commonly used in smart cities and business intelligence

- Data security, data privacy and documentation

- Tutorials

- Remote/Distributed Datacubes

- SemanticWeb bridge

- Eclipse sensiNact event-based DataCubes

Project Scheduling

Q1/2024 - starting with clean, new and empty repos

Q1/2024 - initial contribution of existing code

Q2/2024 - stabilize SNAPSHOT releases

Q3/2024 - BETA releases XMLA and Dashboard Clients

Q4/2024 - BETA releases Server Components

Q1/2025 - STABLE releases

Project Leads
Interested Parties

City of Jena

DataInMotion

Stefan Bischof - individual

Initial Contribution

Fork and rework of parts of Pentaho Mondrian

New Components:

- XMLA APIs

- MDX Parser

- Clients XMLA and Dashboard Engine

Source Repository Type