
This proposal has been approved and the Eclipse Ditto project has been created.

Eclipse Ditto

Basics
This proposal is in the Project Proposal Phase (as defined in the Eclipse Development Process) and is written to declare its intent and scope. We solicit additional participation and input from the community. Please log in and add your feedback in the comments section.
Parent Project: 
Background: 

In the Internet of Things, two disciplines of software development come together: on the one hand, a hardware-influenced view of software for devices and protocols; on the other hand, the view of web/mobile/business application development.

To bring both sides closer together, the metaphor of Digital Twins can be used, where a Digital Twin is a kind of holistic view of all capabilities and aspects of a device/product asset, including its digital representation.

It is important to make these Digital Twins accessible in a convenient way, independent of the protocols used to integrate them and the current connectivity state of the devices. The core of a Digital Twin is information about the state of the device, i.e. its properties and metadata.

As of today, many IoT solutions implement functionality to manage something similar to this idea of Digital Twins. They do this by using simple databases covering some aspects of Digital Twins, which is fine up to a certain point.

Sooner or later, however, more complex topics come into play:

  • Reflecting changes of the physical world in the back-end representation
  • Notifying different application layers about relevant changes
  • Providing access to the Digital Twins via an API
  • Harmonizing persistence and access for multiple different device types
  • Providing access control for the Digital Twins
  • Providing a scalable, robust and high-performance implementation under increasing load
  • ...

Gradually, the initially good and simple starting point requires more and more work and demands investment in technical fundamentals instead of business-level features.

It would be helpful to have a domain-independent component that takes care of the aspects mentioned above and can be used out of the box in IoT solutions.

Beyond that, this component could be used in a "back end as a service" approach to move in the direction of serverless architectures. Without the need to develop and operate a custom back end, IoT solution developers can focus again on business requirements, on connecting devices to the cloud/back end and on implementing business applications.

Scope: 

Eclipse Ditto is the open-source project of Eclipse IoT that provides ready-to-use functionality to manage the state of Digital Twins. It provides access to them and mediates between the physical world and this digital representation.

To achieve this Ditto addresses the following aspects:

  • Device-as-a-Service
    Provide a higher abstraction level in the form of an API used to work with individual devices.
  • State management for Digital Twins
    Differentiate between the reported (last known), desired (target) and current (live) state of devices, including support for synchronization and publishing of state changes.
  • Organize your set of Digital Twins
    Support finding and selecting sets of Digital Twins by providing search functionality on metadata and state data.
Description: 

Device-as-a-Service

IoT solutions have to interact with a heterogeneous set of device types, device protocols and communication patterns.

To bring back simplicity for IoT developers, Eclipse Ditto exposes a unified resource-based API that can be used to interact with devices, abstracting away the complexity of different device types and how devices are connected. It helps to structure the devices into their distinct aspects of functionality and can optionally enforce data types and data validation based on a formal device meta model (Eclipse Vorto).
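
As a rough illustration of what "Device-as-a-Service" could feel like from an application developer's point of view, the following Java sketch reads the Digital Twin of a single device over plain HTTP. The host name, resource path, thing ID and the use of basic authentication are assumptions made only for this example; the concrete API is defined by the project.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class ReadTwinExample {

        public static void main(String[] args) throws Exception {
            // Hypothetical endpoint of a Ditto installation; host, path and thing ID are placeholders.
            URL url = new URL("https://ditto.example.com/api/1/things/com.example:smoke-detector-42");

            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("GET");
            connection.setRequestProperty("Accept", "application/json");
            // Authentication itself is delegated to an identity provider / proxy in front of Ditto;
            // basic auth is used here only to keep the sketch self-contained.
            String credentials = Base64.getEncoder()
                    .encodeToString("user:password".getBytes(StandardCharsets.UTF_8));
            connection.setRequestProperty("Authorization", "Basic " + credentials);

            // The response is expected to be a JSON representation of the twin,
            // e.g. its attributes and feature properties.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                reader.lines().forEach(System.out::println);
            }
        }
    }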

As devices are equipped with a public API (potentially public on the Internet), it is crucial to define on a device level which individuals are allowed to interact with the devices. Ditto ensures that access to the device API is only granted to authorized individuals. Authentication is not in the scope of Ditto and is delegated to existing identity providers.

With this approach your devices are turned into services with a hosted, always accessible and available API.
Devices managed by Ditto are as easy to use as any other service (like weather, maps, ...) within your application.

State management for Digital Twins

A digital representation of physical devices consists, at its heart, of the state of these devices.

For IoT solutions the following information regarding state is most relevant:

  • Device and sensor properties like temperature, location, level, fault information, etc.
  • Configuration properties of sensors and actuators like thresholds, intervals, ranges, toggles, limits, etc.

A good representation of this state in a Digital Twin should support different perspectives for these properties:

  • Reported property values based on the last transmission to the back end
  • Desired target property values for configuration properties
  • A live perspective reflecting the property values at the current point in time

The state management provides access to all three different perspectives and helps in synchronizing between them.
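
The following sketch shows how these perspectives might be addressed from client code: reading a reported sensor value and setting a desired target value for a configuration property. The URLs, the "perspective" query parameter and the feature/property names are purely illustrative assumptions, not a defined API.

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class StatePerspectivesExample {

        // Hypothetical base URL and thing ID; the real API is defined by the project.
        private static final String THING =
                "https://ditto.example.com/api/1/things/com.example:boiler-7";

        public static void main(String[] args) throws Exception {
            // Read the last reported temperature (illustrative "perspective" query parameter).
            HttpURLConnection read = (HttpURLConnection) new URL(
                    THING + "/features/temperature/properties/value?perspective=reported")
                    .openConnection();
            read.setRequestMethod("GET");
            System.out.println("reported temperature requested, HTTP " + read.getResponseCode());

            // Set a desired target value for a configuration property; the back end would
            // take care of synchronizing it to the device once the device is reachable.
            HttpURLConnection write = (HttpURLConnection) new URL(
                    THING + "/features/thermostat/properties/targetTemperature?perspective=desired")
                    .openConnection();
            write.setRequestMethod("PUT");
            write.setDoOutput(true);
            write.setRequestProperty("Content-Type", "application/json");
            try (OutputStream out = write.getOutputStream()) {
                out.write("21.5".getBytes(StandardCharsets.UTF_8));
            }
            System.out.println("desired target value set, HTTP " + write.getResponseCode());
        }
    }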

Organize your set of Digital Twins

When interacting with huge amounts of devices, it can get difficult to keep track of which devices and device types exist and how they are related (e.g. spatially).

It is wise to add metadata such as manufacturer, model information, static geographic location, serial number or software version to devices in order to find them again later.
This metadata, as well as the state data, is automatically indexed by Ditto, which leads to fast search responses provided by the search API, even when there are millions of devices to search in.
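
As an illustration, the following sketch issues such a search over HTTP, looking for all twins of a given manufacturer in a given country. The endpoint and the predicate syntax are assumptions for this example; the concrete query language is part of the project's design work.

    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLEncoder;

    public class SearchExample {

        public static void main(String[] args) throws Exception {
            // Hypothetical predicate-based query; endpoint and syntax are illustrative only.
            String filter =
                    "and(eq(attributes/manufacturer,\"ACME\"),eq(attributes/location/country,\"DE\"))";
            URL url = new URL("https://ditto.example.com/api/1/search/things?filter="
                    + URLEncoder.encode(filter, "UTF-8"));

            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("GET");
            connection.setRequestProperty("Accept", "application/json");

            // The response would contain the matching twins, answered from the search index.
            System.out.println("search executed, HTTP " + connection.getResponseCode());
        }
    }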

When working with sets of devices, authorization information is used to determine which devices are visible to an individual and to control who can change device data.

Features at a glance

The described functionality is offered by Ditto in the form of the following features:

  • Unified resource-based HTTP JSON API representing devices
  • Definition of a simple "Digital Twin State Management Protocol" using JSON for command- and event-based interaction with devices
  • WebSocket API based on this protocol in addition to the resource-based HTTP API
  • Managing device metadata via APIs
  • Optionally configure and enforce a schema for device state (via Eclipse Vorto)
  • Schema evolution support for schema based device state
  • Accessing and setting different state perspectives
    • Live
    • Reported
    • Desired
  • Notification about changes of device resources via HTTP Server-Sent Events (SSE); see the sketch after this list
  • Authorization/access control at the device API, enforcing that only allowed individuals may read/write
  • Search HTTP API accepting a predicate-based query language
  • Emit events resulting from state changes of devices which can be used
    • for building up a "transaction log" (e.g. using Apache Kafka)
    • as source for stream processing (e.g. via Apache Spark Streaming)
    • for building up additional persistence representations (e.g. into an InfluxDB providing optimized access to the history of a device's state properties)
    • for transmitting data into data analytic tools (e.g. into an Apache Hadoop HDFS)
  • Out-of-the-box integration with Eclipse Hono for communication with devices using standard or custom device protocols
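
Since Server-Sent Events are delivered over a plain, long-lived HTTP response, the change notifications mentioned above can be consumed by any HTTP-capable client, not only by web browsers. A minimal Java sketch follows; the endpoint URL is an assumption for illustration.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class SseNotificationExample {

        public static void main(String[] args) throws Exception {
            // Hypothetical SSE endpoint emitting change events for all twins
            // the authenticated subject is allowed to read.
            URL url = new URL("https://ditto.example.com/api/1/things");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setRequestProperty("Accept", "text/event-stream");

            // Each event arrives as one or more "data:" lines on the open connection,
            // so a simple line-based reader is sufficient.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (line.startsWith("data:")) {
                        System.out.println("change event: " + line.substring(5).trim());
                    }
                }
            }
        }
    }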

Integrate Ditto into an IoT landscape

Ditto is especially useful in the context of a larger IoT landscape. In this landscape, the other important aspects of an IoT solution, like device communication or data analytics, are covered by distinct components.

The following diagram shows a typical landscape for IoT solutions where Ditto brings in its functionality:

The most important integration aspect is the device communication. This is essential for the Digital Twins to really be twins of real-world physical devices.

In addition to the option of providing a custom device communication layer, Ditto will provide an out-of-the-box integration with Eclipse Hono to support device communication.

In this scenario Ditto uses Eclipse Hono in order to receive messages (e.g. state changes) from devices and to send messages (e.g. configuration changes) to devices.

Why Here?: 

The Eclipse IoT community already provides a lot of very useful technologies and components for IoT applications. As the scope of Ditto has a strong focus on the management of Digital Twins, it is essential for it to integrate with these other components to provide a helpful end-to-end platform for IoT application developers.

That is why Ditto explicitly does not cover aspects like connectivity and communication with devices, communication protocols, the development of (embedded) software for devices, and the modeling of devices.

To support this end-to-end platform idea, the following projects deal with relevant aspects that are of interest to Ditto:

  • Eclipse Hono for the message exchange with devices
  • Eclipse Vorto for the modeling of device structures reflected by the Digital Twins
  • Eclipse hawkBit for rolling out software updates based on metadata of the Digital Twins
  • Eclipse Kapua as an integration framework and easy quick start for end-to-end IoT solutions leveraging the Digital Twin approach

A potential orchestration of Ditto with these Eclipse IoT projects could result in the following end-to-end platform:

Project Scheduling: 

We aim to provide the initial contribution in Q2/2017.

Future Work: 
  • Alternative search index implementation based on Elasticsearch
  • Support for geolocation information within search and notifications (e.g. for geofencing)
  • Integration of business logic execution using Function-as-a-Service providers (e.g. OpenWhisk)
  • In general: extending the scope towards more aspects of the Digital Twin metaphor
    • seamless integration of Digital Twin history data
    • integration with technologies covering high-level aspects of Digital Twins like semantics, simulation and orchestration/federation
People
Source Code
Initial Contribution: 

The initial contribution of Ditto will contain several ready-to-run microservices bundled as Docker images.

The contributed source code is structured as a Maven multi-module build, providing services as well as the libraries on which the services are based.

Libraries

  • ditto-json
    • JSON library inspired by minimal-json (https://github.com/ralfstx/minimal-json)
    • provides convenient access to JSON fields, JsonPointer-based operations (get/set), definitions of JSON fields, etc.
    • uses minimal-json as the default JSON serializer/deserializer
  • ditto-model
    • includes domain model classes for Ditto, e.g. "Thing" and "Feature"
    • all domain model classes are serializable to JSON and deserializable from JSON
  • ditto-commands-events
    • includes command and event classes used for the CQRS pattern, event sourcing and event publishing
    • all commands and events are serializable to JSON and deserializable from JSON
  • ditto-protocol-adapter
    • provides a definition of the "Digital Twin State Management Protocol" including an adapter of the command and event objects to JSON format

Microservices

The services are implemented in Java and make use of the Akka Toolkit (http://akka.io/).

The services use an externally provided MongoDB as database (https://www.mongodb.org). This is a prerequisite that has to be provided by the user and is not part of Ditto.

  • ditto-api
    • provides a REST-like HTTP API for CRUD and search operations on Things
    • provides a WebSocket API for the same CRUD operations and for receiving events
  • ditto-things
    • responsible for persisting/restoring Things to/from MongoDB
    • applies the event-sourcing pattern: modifying commands result in events, which are persisted and published (a simplified sketch follows after this list)
  • ditto-search-updater
    • subscribes to events published by the "ditto-things" service and updates its own search index in MongoDB
    • re-synchronizes if events for specific Things were missed, in order to maintain eventual consistency
  • ditto-search
    • responsible for answering search queries directly from the optimized search index in MongoDB
  • ditto-amqp-bridge
    • consumes messages from an AMQP 1.0 interface and forwards the received "State Management Protocol" messages to the responsible services
    • emits published events from "ditto-things" towards the AMQP 1.0 interface via the defined "State Management Protocol"
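
The following self-contained sketch illustrates the event-sourcing flow described for "ditto-things" above: a modifying command results in an event which is appended to a journal and published to subscribers (such as the search updater), and the current state can be recovered by replaying the journal. All class names are illustrative only and do not reflect the contributed code; the in-memory lists merely stand in for MongoDB and the cluster-internal event publishing.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.function.Consumer;

    public class EventSourcingSketch {

        static final class ModifyAttributeCommand {
            final String thingId; final String name; final String value;
            ModifyAttributeCommand(String thingId, String name, String value) {
                this.thingId = thingId; this.name = name; this.value = value;
            }
        }

        static final class AttributeModifiedEvent {
            final String thingId; final String name; final String value;
            AttributeModifiedEvent(String thingId, String name, String value) {
                this.thingId = thingId; this.name = name; this.value = value;
            }
        }

        // Stands in for the MongoDB journal and for the event bus between the services.
        static final List<AttributeModifiedEvent> journal = new ArrayList<>();
        static final List<Consumer<AttributeModifiedEvent>> subscribers = new ArrayList<>();

        static void handle(ModifyAttributeCommand command) {
            // A modifying command results in an event ...
            AttributeModifiedEvent event =
                    new AttributeModifiedEvent(command.thingId, command.name, command.value);
            journal.add(event);                        // ... which is persisted ...
            subscribers.forEach(s -> s.accept(event)); // ... and published, e.g. to the search updater.
        }

        public static void main(String[] args) {
            subscribers.add(e -> System.out.println("update search index for " + e.thingId));
            handle(new ModifyAttributeCommand("com.example:boiler-7", "location", "basement"));

            // The current state of a Thing can be restored by replaying its journal.
            journal.forEach(e -> System.out.println("replay: " + e.name + " = " + e.value));
        }
    }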

Source Repository Type: 
Ed Stevens:

Hello

I'm very interested in the concept of Digital Twins, especially in the context of managing industrial assets.

Currently every major platform is rushing to get a 'Digital Twin' offering on the market, including GE, IBM, Siemens, PTC, ...
There's a lot of literature on it, but most seem to be from a marketing perspective, and from the point of view of their own platform.
People are even starting to talk about it as a coming war between these industrial platforms.

Let's take a typical example of a mixing/blending/reactor vessel in a plant. (Applicable to chemicals, food, pharma, ...)

So around this Vessel Unit you could define multiple possible digital twins:

  • The Vessel Unit including all the connected pipes, valves, motors, measurements, ...
  • The vessel itself, so only the mechanical steel structure
  • Another part of the Vessel Unit, for example the Bottom Valve, or a Temperature Transmitter
  • The Plant of which the Vessel Unit is a part
  • The Company that owns the Plant (there could be 10-100 plants worldwide for any bigger sized company)
  • Something less physical: a Production Order running on this Vessel Unit

Some questions/thoughts related to this list:

1. Could all of these have their own digital twin?
I think yes. For example the digital twin of that valve might contain Technical Data Sheets, an event history of that valve that logs each time the valve opens or closes, a maintenance log, ...
The production order probably also yes: already today, in plants that do track & trace, the POs are digitally managed and stored, including the fact that they were produced on this particular vessel.
The Vessel Unit also yes: it will have some properties by itself, but also a collection of assets that each have their own digital twin, like that bottom valve.
The Plant also yes: a collection of all the assets in that plant.
And finally the Company as a collection of all the assets it owns.

2. Will all of these digital twins run on the same platform? For example all on Predix?
I think that will be almost impossible. It might be possible inside one plant, but even there it would be difficult to keep it limited to one platform.
At the plant level it's clear that there will be different platforms: companies often buy entire plants, and those might be using another platform for their digital twins.

3. So to make the Digital Twin concept work we'll need one, or at least a limited number of, open APIs to access Digital Twins running on different platforms.

4. The actual status of one Digital Twin will probably be distributed over different systems. For example the actual temperature of the Temperature Transmitter might be in the DCS, while the historical time series data of that temperature might be in a Historian, and the Technical Data Sheet of the transmitter might be on the website of the vendor. So you'll probably be able to access the Digital Twin at one system with an API, and then there it connects to other systems to get the latest information.

5. Somehow we'll probably want to incorporate semantics in the API when linking to other assets. For example:

  • Parent
  • Child
  • Upstream Asset
  • Downstream Asset
  • ...

6. Will Digital Twins all be managed in the cloud? Probably there will also be Digital Twins contained only on Edge devices. For example a Package Unit might come pre-configured with its own Digital Twin.

7. Access Permissions: with the API you'll probably be able to browse to connected Digital Twins. Somehow we'll need to have a system to give and limit access to particular users/groups. Also, even when access is granted you'll probably have different access levels for one Digital Twin.

8. We'll probably want to have a query language that allows us to browse Digital Twins and their data. For example:

  • List all Vessel Units in the Company that produce a certain type of PO.
  • Get all the time series data of any measurement related to those Vessel Units, and use them to predict (and optimize) another variable.
  • List all events of the Bottom Valve, and compare differences between Vessel Units, Plants or brands of that valve.
  • ...

 

What do you think? Will Digital Twin technology be like this or different?

And is this the scope of Eclipse Ditto?

 

Ed

 

Gerald Glocker:

Hi Ed,

thanks for this description and your questions.

They do - to a large degree - match our ideas in the context of digital twins.
And they also match in parts with our proposal of Eclipse Ditto.

I'll try to address the topics along the lines of your questions.

  1. We think most of the mentioned elements could have their own digital twin. The concept of Eclipse Ditto is quite open regarding the types of state information you manage within your twins. Regarding production orders, we are not so sure if they also match this approach because of their quite short lifecycle compared to other assets. Normally we think that digital twins are more long-lived entities.
     
  2. No - we don't think that all digital twins will run on the same single platform. Maybe some of them live in the same platform, many others live in the same type of platform, but some selected twins may run on specialized systems. To support this kind of heterogeneity, we think Eclipse Ditto can help a lot, first by proposing a structure of APIs/patterns/models for working with digital twins. Second, it can be a reference implementation used for a lot of standard forms of digital twins. And last, we already mention under "Future Work" that we are thinking about topics like orchestration/federation, which could help with a distributed management of digital twins.
     
  3. Yes - we hope that Eclipse Ditto helps to propose this kind of API/model. If some open standardization on this emerges, we would of course like Eclipse Ditto to follow that direction.
     
  4. These are aspects which could be addressed with approaches like Function-as-a-Service that are leveraged in the digital twin API, and also aspects that could be worked on in federation scenarios - where one or multiple digital twins work together in a federated way across different runtimes/platforms. The initial scope of the project does not cover these approaches, but we are hoping to work on this within the project later on.
     
  5. Some of these mentioned topics are already supported by the proposed model for digital twins that Eclipse Ditto will introduce. We also have some ideas that could help organizing additional information about relationships/topologies of multiple digital twins. For more "semantics" related topics we think about integrations with sophisticated systems in this area.
     
  6. We would tend not to call a digital twin that lives on an edge device a digital twin at all. There is of course a "software" representation of the physical sensors/actuators within the chipset of the edge device. But the physical sensors/actuators combined with this "software" are actually the real device and not a twin of a device. In contrast to this, digital twins could reside on gateway controllers which are used as a kind of relay or local back end for multiple devices. These gateways act as servers for their devices and could serve as a runtime for digital twins. In this case the cloud would probably manage "twins of twins" - but this would still make sense.
     
  7. Access control on digital twins is at least as important as access control on real devices. Eclipse Ditto will allow defining who is allowed to read/write/access which digital twin. More fine-grained access control could also be integrated on sub-elements of digital twins - but this is not in the initial scope of the project.
     
  8. Eclipse Ditto will try to provide support for searching and querying the managed digital twins. This will be possible by using a simple predicate-based query language. At first we think of support for querying the twins themselves. Later on there should also be support for querying relationships between twins.


In summary, I think we really see a good match between the scope you describe and the scope that Eclipse Ditto tries to address.
So it would be great to continue this conversation.

As we have just started the Incubation process, all discussions are welcome and helpful to shape this project.

Regards,
Gerald

Ed Stevens:

Thanks for the feedback Gerald

Indeed very much in line with my expectations!

 

Do you have any more info about the timeline?

When will you have a version ready for some initial testing?

 

Ed Stevens:

I also found this interesting related conversation on the Vorto forum:

https://www.eclipse.org/forums/index.php/t/1068541/

 

Paolo Patierno:

Hi,

here are some questions, since I can't see the current code :-)

I can imagine that Ditto is just another consumer application on the "right" side of Eclipse Hono, acquiring messages from the telemetry & event channels as well as using the command & control channel for sending messages. Is that right?

I read that "In this scenario Ditto uses Eclipse Hono in order to receive messages (e.g. state changes) from devices and to send messages (e.g. configuration changes) to devices" so because Eclipse Hono is payload agnostic, does Ditto define a specific payload format that devices have to send for notifying state changes or for receiving configuration changes ? Or you can configure that ?

I read that change notifications are available through HTTP Server-Sent Events (SSE), which is part of HTML5, so I assume that only a web application can leverage that. What if I want to develop a "desktop" (non-web-based) application and receive such notifications? Do I have to use HTTP polling on the Ditto API? If yes, would it make sense to provide an API based on AMQP, for example (like the Hono one), that provides a push feature?
 
What is the running model? Is Ditto something like microservices ready for scalability, failover and so on?
 
Thanks,
Paolo.
Thomas Jaeckle:

 

Hi Paolo.
 
Thank you for your interest in the proposal and the follow up questions.
 
You're right: from the perspective of Hono, Ditto is a consumer application, and Hono does not need to know anything about Ditto consuming telemetry and event data. Command/control messages are also sent via Hono to the connected devices.
From the perspective of a user (or developer) working with the Ditto API, Hono is just another device connectivity component. Normally the Ditto user should not need to be aware of how the devices they are working with are connected.
 
You also assumed correctly that Ditto defines a payload (in the proposal we called this "Digital Twin State Management Protocol"), as Hono is payload agnostic. We are however aware that we will need a mapping/transformation concept for devices which, for good reasons, should not be aware of Ditto and its specific payload. We're thinking about custom functions (in a "Function as a Service" manner) which allow transforming raw data coming from devices and also translating command/control messages towards devices. Whether this feature is better located in Hono or Ditto (or even in both) is something we're happy to discuss in the community.
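
Purely as an illustration of this idea (and explicitly not part of the proposal or the initial contribution), such a mapping function might look roughly like the following sketch; the raw payload format and the protocol-style JSON it produces are assumptions:

    import java.util.function.Function;

    // Illustrative payload mapping: a raw, device-specific telemetry payload is turned
    // into a (hypothetical) "Digital Twin State Management Protocol" message.
    public class PayloadMappingSketch {

        static final Function<String, String> temperatureMapper = raw -> {
            // Example raw payload from a device: "23.5" (degrees Celsius as plain text).
            double celsius = Double.parseDouble(raw.trim());
            // Map it to a protocol-style JSON message modifying the twin's temperature property.
            return "{"
                    + "\"topic\":\"com.example/boiler-7/things/commands/modify\","
                    + "\"path\":\"/features/temperature/properties/value\","
                    + "\"value\":" + celsius
                    + "}";
        };

        public static void main(String[] args) {
            System.out.println(temperatureMapper.apply("23.5"));
        }
    }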
 
Regarding the notifications: SSE came up as an HTML5 standard, but can (similar to HTTP) also be used outside of web browsers. You can consume Server-Sent Events in a Java "desktop" application. In addition to SSE, Ditto will include a WebSocket channel for notifications which can also be used in desktop applications. So for the push feature alone we see no need to provide an AMQP API. But for other features that only a message broker can deliver (e.g. message retention in order not to miss important notifications), connecting to an existing AMQP 1.0 broker seems natural.
 
The running model of Ditto is independent microservices forming a cluster which runs in a Docker Swarm overlay network. Each microservice can be scaled independently and failover to other instances is done if service instances crash or are no longer reachable in the cluster.

I hope that you have a better picture of Ditto and the proposal now; if you have more questions, don't hesitate to ask them :)

Regards
Thomas
Paolo Patierno:

Hi Thomas,

thanks for your clarification !

Regarding the payload transformation from device needs to Ditto needs, I think that this feature should be something in Ditto; from my point of view it's something like the "protocol adapters" we have in Hono, but in this case we could call them "payload adapters" somehow :-) Are you thinking about a specific framework for implementing this feature as "Function as a Service"?

Also regarding the payload for sending, for example, notifications from devices to Ditto: is it incremental/partial, or does the device always need to send the entire state? I mean, if only one property changes on the device and it wants to send a notification to Ditto, does the device need to send the entire payload with all the other properties, or just what has changed?

Is there a maximum payload size defined by Ditto (other than the maximum size from Hono)?

Did you think about versioning for the Ditto payload?

What is the authentication and authorization mechanism used on the API for accessing a twin, in order to modify device state for example?

Sorry if some questions might seem stupid, but without taking a look at the code ... it's the only way to know more :-)

Thanks,

Paolo

Thomas Jaeckle:

Hi Paolo.

We have no concrete plan or frameworks in mind for the mapping feature - the "Function as a Service" approach is also only a first idea. As I wrote, the initial contribution will not address the mapping problem.

Devices can send their state incrementally - or as a whole, as the device likes or as the protocol adapter sends it. When sending the state incrementally, the "field" to update is specified as part of the "Digital Twin State Management Protocol".

We have no explicit maximum payload size. But as our persistence is currently MongoDB, we have a limitation of 16 MB per document in MongoDB, so that would be a "hard" maximum for the messages which get persisted. The real maximum is lower, currently at 4 MB for a "Twin" state (as we have an optimized search index, also based on MongoDB, which stores fields redundantly in order to be able to apply a search index on arbitrary state data).

Yes, we did think about versioning the Ditto payload.

Authentication is not part of Ditto (in the proposal we mentioned "OpenID Connect", which we want to utilize for authentication via JWTs). Authorization is part of Ditto and makes use of an "authenticated subject" (e.g. a JWT "subject") which is determined e.g. in an API management or nginx proxy layer in front of Ditto.
Regarding the authorization: Ditto enforces that only subjects which are entitled to modify state can modify the state - that information is persisted and managed by Ditto itself. The same goes for retrieving the state.

 

Those are no stupid questions at all; they are very detailed for a project still in the proposal phase.

Regards
Thomas