Proposals

Eclipse TRAICE (Tracking Real-time AI Carbon Emission)

Tuesday, September 17, 2024 - 11:20 by Matthew Khouzam

Horizontal Federated Machine Learning (H-FML)

Federated learning (FL) enables AI/ML model training at the network nodes by exploiting large-scale distributed data and compute resources. Federated learning also restricts explicit data sharing so that the confidentiality and privacy associated with the use case are preserved. FL differs from classical AI/ML in four main domains: data privacy (no end-user data leaves the device, worker, node, or client), data distribution (data may be IID or non-IID), continual learning (the communication time between client and central server may be too long to provide a satisfactory user experience), and aggregation of data (some privacy notions and rules are violated when user data aggregation occurs in the central server) [4,5].

Federated learning requires $\mathcal{K}$ devices to iteratively upload and aggregate parameters to train the global model [6,7]. In such a scenario, distributed devices (mobile devices, workers) collaborate to train a common AI/ML model under the coordination of an access point (AP) or parameter server.

H-FML proceeds over multiple communication rounds (each comprising an upload cost and a download cost) and computation rounds. In each training round, a five-step process is repeated until the model converges.

-   In Step 1, FML starts when a training task is created by the server (coordinator), which initializes the parameters of the global model and sends them to each worker (client or participant), incurring the first download cost.

-   In Step 2, each worker k in $\mathcal{K}$ (participants) independently trains on its local dataset to minimize the loss on its local data distribution D_k of size n_k.

-   In Step 3, each worker submits its local model to the server (coordinator), incurring the upload cost.

-   In Step 4, the server consolidates the global model by aggregating the local models received from the workers.

-   In Step 5, the global model is dispatched back to the workers, incurring the second download cost. This updated global model is used by each worker in the next training round.

To achieve the goal of Step 2, FML trains a global model that minimizes the average loss across parties.
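
Formally, with $n = \sum_{k=1}^{K} n_k$ total samples, the standard federated objective is the size-weighted average of the local losses:

$$\min_{w} F(w) = \sum_{k=1}^{K} \frac{n_k}{n} F_k(w), \qquad F_k(w) = \frac{1}{n_k} \sum_{i \in D_k} \ell(w; x_i, y_i),$$

where $F_k$ is the empirical loss of worker $k$ over its local distribution $D_k$ and $\ell$ is the per-sample loss.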

In subsequent training iterations (Steps 2 through 5 form a single round of FL), the process is repeated until the training loss or the model converges, a time limit is exceeded, or the maximum number of iterations is reached.
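The five steps above can be sketched as a plain-NumPy federated averaging loop. This is an illustrative toy, not the TRAICE or Flower API: a least-squares objective stands in for a real model.

```python
import numpy as np

def local_update(weights, data, lr=0.1):
    # Step 2: one illustrative gradient step on the worker's local data
    # (a least-squares objective stands in for a real network).
    X, y = data
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def fedavg_round(global_w, datasets):
    # Steps 2-3: each worker trains locally and "uploads" its model.
    local_models = [local_update(global_w.copy(), d) for d in datasets]
    # Step 4: aggregate, weighting by local dataset size n_k; Step 5:
    # the returned model is redistributed at the start of the next round.
    n_total = sum(len(y) for _, y in datasets)
    return sum(len(y) / n_total * w
               for w, (_, y) in zip(local_models, datasets))

# Step 1: the server initializes the global model.
rng = np.random.default_rng(0)
w = np.zeros(3)
workers = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(3)]
for _ in range(50):  # repeat until convergence or an iteration limit
    w = fedavg_round(w, workers)
```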

ECLIPSE TRAICE

Eclipse TRAICE is proposed as an interactive visualization software designed for real-time monitoring of the carbon emissions of AI/ML systems. It is designed to handle horizontal federated machine learning (H-FML) algorithms and enables users to track their carbon footprint. TRAICE allows users to observe the environmental impact and the model training effectiveness simultaneously.

TRAICE is built on three main components:

-   A Python library

-   A server, and

-   A client application

The Python library binds to the code run by the nodes (workers) of an H-FML system via the TRAICE package and collects real-time data generated by the nodes. The library then sends these metrics to the server. It uses the codecarbon library [3] to measure the energy consumed and the carbon emitted by the nodes.

The server aggregates the generated data and computes several metrics on the real-time training session of the participating workers.

The client is the user's visualization tool, allowing users to monitor and interact with the framework.

The components communicate via WebSockets, ensuring real-time bidirectional communication between the workers and the server, as well as between the server and the client.

System Requirements

Eclipse TRAICE requires a Docker engine installed and running on the host machine. Python must be installed to run the scripts and components; version 3.10.x is recommended. The following requirements are also mandatory:

-   The library must be installed on all worker nodes of the H-FML system. The workers can run anywhere (as per user preference) and only need to be able to communicate with the server.

-   The server can be deployed anywhere but must expose port 3000 and be reachable by all services. Appropriate access should be enabled through proper routing and firewall configuration.

-   The database can be deployed anywhere and only needs to communicate with the server. Ideally, it should be on the same subnet as the server to minimize latency.

-   The client can be installed anywhere but must communicate with the server. In a typical client-server architecture, the client sends requests to the server and receives responses in return. This exchange takes place over socket connections.

ECLIPSE TRAICE System Components

The Eclipse TRAICE system consists of three parts, each deployed as a separate Docker container.

1)  TRAICE-frontend: The TRAICE client application for visualizing the emission graphs in real time.

2)  TRAICE-backend: The server responsible for receiving, aggregating, and sending the carbon emission data of the workers participating in the training to the TRAICE client.

3)  TRAICE-database: An SQL database used for storing training data.

Docker Compose facilitates the communication and network setup between the containers, ensuring they operate seamlessly as a unified system. The system components each expose different ports for communication.

Here are the exposed ports:

-   Server: port :3000

-   Client: port :4200

-   Database: port :8001 (the Docker container exposes port 8001 and redirects it to port 5432 inside the container to communicate with the PostgreSQL service).
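
Assuming the service names follow the component names above, this port mapping corresponds to a Docker Compose file along these lines (an illustrative sketch; the build contexts and PostgreSQL image tag are assumptions, not taken from the repository):

```yaml
services:
  traice-backend:
    build: ./backend        # build context assumed
    ports:
      - "3000:3000"
  traice-frontend:
    build: ./frontend       # build context assumed
    ports:
      - "4200:4200"
  traice-database:
    image: postgres:15      # image tag assumed
    ports:
      - "8001:5432"         # host 8001 -> container 5432 (PostgreSQL)
```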

Step-by-step installation of Eclipse TRAICE with Docker Compose

Three distinct bidirectional communications have been identified: the TRAICE library communicates with the TRAICE server, the TRAICE server communicates with the client, and the TRAICE server communicates with the database.

The installation steps are as follows:

-   Clone the repository and navigate to the project root.

-   Build the application images with Docker Compose: docker compose build

-   Run the application: docker compose up -d

Access the frontend by navigating to http://localhost:4200 in your local web browser. To stop all TRAICE-related containers: docker compose down

Installing Eclipse TRAICE library

To build the library from sources, follow the steps below to obtain the ".whl" file:

cd library/traice

pip install --upgrade setuptools

pip install --upgrade build

python -m build

The ".whl" file will be created in the "dist" folder; you can then install the package with "pip install <filename>.whl" (e.g. "pip install Traice-0.0.1-py3-none-any.whl").

Library Usage

The "example" folder contains usage examples. The library exposes a class "TraiceClient" that handles all the tracking and communication logic. Once a worker is instrumented, you should be able to see information about the training and its energy usage in the frontend!
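
As a pseudocode-style sketch only: the constructor arguments and method names below are assumptions, not the documented API; the "example" folder is authoritative. Instrumenting a worker might look like:

```python
# Hypothetical usage sketch -- argument and method names are assumed,
# not taken from the TRAICE documentation.
from traice import TraiceClient

client = TraiceClient(server_url="ws://localhost:3000", node_id=0)
client.start_tracking()   # begin codecarbon-based measurement
train_one_round()         # the worker's local H-FML training step
client.stop_tracking()    # flush energy/CO2 metrics to the server
```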

-   Typical example: cifar10

This example shows federated training on the CIFAR-10 dataset with 3 workers and carbon-emission tracking using TRAICE. The code is based on the Flower federated learning example available here:

[Flower Quickstart PyTorch](https://github.com/adap/flower/tree/main/examples/quickstart-pytorch).

To use it, first install the dependencies with "pip install -r requirements.txt" (if you do not have "TRAICE" installed, follow the instructions above to build and install it).

# 1. Start federated learning server

python server.py

# 2. Start TRAICE server (follow instructions in section c))

# 3. Start workers

python worker.py --node-id 0

python worker.py --node-id 1

python worker.py --node-id 2

You can now open the TRAICE client and access the visualization.

References

1.  K. Ahmad, A. Jafar, and K. Aljoumaa, "Customer churn prediction in telecom using machine learning in big data platform," Journal of Big Data, vol. 6, no. 1, pp. 1-24, 2019.

2.  L. Bariah, H. Zou, Q. Zhao, B. Mouhouche, F. Bader, and M. Debbah, "Understanding telecom language through large language models," in IEEE Global Communications Conference (GLOBECOM), 2023, pp. 6542-6547.

3.  S. Luccioni, "CodeCarbon: Track and reduce CO2 emissions from your computing," https://github.com/mlco2/codecarbon, 2013.

4.  Y. Chen et al., "Federated learning for privacy-preserving AI," Communications of the ACM, vol. 63, no. 12, pp. 33-36, 2020.

5.  Q. Yang et al., "Federated machine learning: Concept and applications," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 10, no. 2, pp. 12:1-12:19, 2019.

6.  D. Ye et al., "Federated learning in vehicular edge computing: A selective model aggregation approach," IEEE Access, vol. 8, pp. 23920-23935, 2020.

7.  Z. Chen and X. Huang, "End-to-end learning for lane keeping of self-driving cars," in IEEE Intelligent Vehicles Symposium (IV), 2017, pp. 1856-1860.
 

Eclipse TMLL (Trace Server Machine Learning Library)

Monday, September 16, 2024 - 13:49 by Matthew Khouzam

Eclipse TMLL provides users with pre-built, automated solutions that integrate general trace server analyses (e.g., CPU usage, memory, and interrupts) with machine learning models. This allows for more precise, efficient analysis without requiring deep knowledge in either trace server operations or ML. By streamlining the workflow, TMLL empowers users to identify anomalies, trends, and other performance insights without extensive technical expertise, significantly improving the usability of trace server data in real-world applications. 

Capabilities of TMLL 

  • Anomaly Detection: TMLL employs unsupervised machine learning techniques, such as clustering and density-based methods, alongside traditional statistical approaches like Z-score and IQR analysis, to automatically detect outliers and irregular patterns in system behavior. This helps users quickly identify potential anomalies, such as unexpected spikes in CPU usage or memory leaks.
  • Predictive Maintenance: Using time-series analysis, TMLL can forecast potential system failures or performance degradation. By analyzing historical data, the tool can predict when maintenance or adjustments will be necessary, helping users avoid costly downtime and improve system reliability.
  • Root Cause Analysis: TMLL leverages supervised learning techniques to identify the underlying causes of performance issues. By training models on labelled trace data, users can determine which factors contribute to problems such as bottlenecks or system crashes, leading to faster resolution and more effective troubleshooting.
  • Resource Optimization: Through a combination of classical optimization techniques and Reinforcement Learning (RL), TMLL helps users optimize system resources like CPU, memory, and disk I/O. This ensures efficient use of system resources and helps avoid unnecessary waste, while also adapting to changing workloads for better overall performance.
  • Performance Trend Analysis: TMLL provides comprehensive tools to analyze long-term performance trends. By evaluating historical data and identifying patterns, users can detect performance shifts, regressions, or improvements over time, providing valuable insights for ongoing system optimization and future planning. 
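
As an illustration of the Z-score rule mentioned under Anomaly Detection, here is a stdlib-only sketch of the statistical idea (not TMLL's actual API):

```python
import statistics

def zscore_outliers(samples, threshold=3.0):
    # Flag indices whose Z-score exceeds the threshold -- the classic
    # statistical rule, applied here to a CPU-usage series.
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)
    if std == 0:
        return []
    return [i for i, x in enumerate(samples)
            if abs(x - mean) / std > threshold]

cpu = [12.0, 11.5, 12.2, 11.8, 12.1, 95.0, 12.0, 11.9]  # % CPU per interval
print(zscore_outliers(cpu, threshold=2.0))  # [5]: the 95% spike
```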

Eclipse Wheel Speed Sensor Signal Packer

Wednesday, September 11, 2024 - 07:42 by Daniel Fischer

Making Eclipse Wheel Speed Sensor Signal Packer, a lossless packing SW module, available as FOSS will avoid a multitude of proprietary solutions. Instead, a generic packer SW module shall be made available which, for example, can be integrated by brake system suppliers as a default into their main-path brake SW.
A key purpose of the project is to establish an open industry standard for losslessly packed WSS signals for subsequent spectrum analysis.
It shall avoid competition restrictions which could otherwise arise through the choice of a certain proprietary packer solution by the brake system supplier, leading to incompatibility with applications from other sources selected by the vehicle manufacturer ("lock-in"). In the best case, interoperability enables the creation of an (admittedly small) ecosystem of WSS-spectrum-based functions from various, fairly competing players and promotes the creation of new and innovative solutions.

Eclipse Quneiform

Tuesday, September 10, 2024 - 11:05 by Blake Madden

Eclipse Quneiform offers support for analyzing C/C++, Java, and C# source code to identify internationalization (i18n) issues. Additionally, Eclipse Quneiform assists in reviewing and pseudo-translating translation catalogs, including gettext (.po files), Java .properties files, XLIFF files, and .NET formats like .resx and .xaml. We also provide support for reviewing Windows resource files (.rc).

Eclipse Open Collaboration Tools

Wednesday, September 4, 2024 - 04:43 by Miro Spönemann

Eclipse Open Collaboration Tools is a set of libraries, extensions and tools for integrating collaborative editing with multiple editing paradigms (textual, graphical, etc.) and in multiple IDEs or other applications.

The basic idea is simple: one person starts a collaboration session as host and invites others to join. The IDE extension distributes the contents of the hostʼs workspace and highlights text selections and cursor positions of other participants. In parallel, they get together in their favorite meeting or chat app for immediate discussion. All participants see what the others are looking at and what changes they propose in real-time. This way of remote collaboration reduces confusion and maximizes productivity.

The project includes the following components:

  • A protocol definition based on JSON messages, with a reference implementation in TypeScript
  • A Node.js based server for handling authentication and forwarding messages between participants of a collaboration session
  • A VS Code extension for collaborative text editing
  • Additional integrations: Eclipse IDE, Monaco Editor, and more to come

An integration with Eclipse Theia is already included in the Theia project.

Eclipse SKyBT

Friday, August 9, 2024 - 04:51 by Christian Claus

Eclipse SKyBT (Smart Keyword Based Testing)

The core idea of Eclipse SKyBT: based on our experience from numerous projects, the success factor of testing lies in the test design; everything else can and should be automated as much as possible.

With Eclipse SKyBT, test designers are able to describe the system under test as a model using a defined syntax and keywords. The model includes the definition and selection of the needed interfaces, logical relations as a state machine, and descriptions of the logical objects with keyword sentences.

Custom keywords can be defined as needed for the current project. Platform or communication definitions can be imported and then used as keywords. Keyword and syntax management is embedded.

From these keyword-based models of the system under test, the test designer is able to generate the needed testcases using the testcase generator.

Of course, the user is also able to write classic testcases based on the syntax and keywords.

The models and testcases can be combined with parameters and test data sets.

As most test departments already have a working test management system, testcases can be exported to the application lifecycle management (ALM) tools in use.

Coming from the ALM tool, the configured test suites can be executed in the test automation in use. Since the testcases are based on keywords, there is no need to implement or update the testcases. All testcases that are based on the already implemented keywords can be executed directly.

Jakarta Logging

Wednesday, July 31, 2024 - 17:06 by Christian Grobmeier

Both Log4j and SLF4J contain legacy elements that are outdated by modern standards. This project aims to distill the most effective features from both APIs. A new Jakarta Logging API will be modern, user-friendly, and efficient. The goal is to make upgrading to the Jakarta Logging API straightforward, ensuring that it feels familiar to current users while providing improved functionality and simplicity.

Eclipse eXtensible State Machine

Tuesday, July 23, 2024 - 07:43 by Carsten Pitz

Eclipse eXtensible State Machine (XSM) provides middleware for implementing state machines. It allows an existing state machine to be altered without altering the existing code. As middleware, it does not provide a service to the user directly but helps developers focus on business logic.

An example:

Imagine a simple LIN node. The standard behaviour might be that if it cannot serve due to internal issues, it signals "no service" to the gateway. A carmaker explicitly requires the LIN node to signal "operating" during the first 2 s after power-on. With Eclipse eXtensible State Machine you can add this custom behaviour without touching the existing code.
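
The LIN-node example can be sketched as a state machine whose transition rules are data, so new behaviour is registered rather than edited into existing logic. This is an illustrative Python sketch, not the XSM API:

```python
class StateMachine:
    # Minimal extensible state machine: transition rules are data,
    # so new behaviour is registered instead of edited into old code.
    def __init__(self, initial):
        self.state = initial
        self.rules = []  # ordered list of (predicate, target_state)

    def add_rule(self, predicate, target, first=False):
        pos = 0 if first else len(self.rules)
        self.rules.insert(pos, (predicate, target))

    def step(self, ctx):
        for predicate, target in self.rules:
            if predicate(ctx):
                self.state = target
                return

node = StateMachine("OPERATING")
# Standard behaviour: an internal fault signals "no service".
node.add_rule(lambda ctx: ctx["fault"], "NO_SERVICE")
# Carmaker extension, added without touching the rule above:
# keep signalling "operating" for the first 2 s after power-on.
node.add_rule(lambda ctx: ctx["uptime_s"] < 2.0, "OPERATING", first=True)

node.step({"fault": True, "uptime_s": 1.0})
print(node.state)  # OPERATING -- grace period takes priority
node.step({"fault": True, "uptime_s": 3.0})
print(node.state)  # NO_SERVICE
```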

Eclipse Ankaios Dashboard

Monday, July 15, 2024 - 05:51 by Felix Mölders

The Ankaios Dashboard is the UI for the Eclipse Ankaios project. It offers insights into a running Ankaios cluster, allows workloads to be created, modified, and deleted, and provides a dependency graph for easier understanding of the interdependencies between workloads. The dashboard is a powerful tool for understanding the functionality of Ankaios, especially for those who have not yet used the Ankaios CLI commands. Furthermore, if any issues arise during the execution of Ankaios workloads, it is a good starting point for debugging, especially for finding missing workloads that other workloads depend on.