Creation Review

Type
Creation
State
Successful
End Date of the Review Period

Reviews run for a minimum of one week. The outcome of the review is decided on this date. This is the last day to make comments or ask questions about this review.

Project
Proposal

Eclipse CROSSMETER

Friday, September 15, 2017 - 15:29 by Davide Di Ruscio
This proposal is in the Project Proposal Phase (as defined in the Eclipse Development Process) and is written to declare its intent and scope. We solicit additional participation and input from the community.
Proposal State
Created
Background

Open-source software (OSS) is very often developed in a public, collaborative, and loosely coordinated manner. This has several implications for the quality of OSS projects as well as for the level of support that OSS communities provide to users of the software they produce. On the one hand, there are several high-quality and mature OSS projects that deliver stable and well-documented products. Such projects typically also foster a vibrant expert and user community that provides remarkable levels of support, both in answering user questions and in repairing reported defects (bugs) in the provided software. On the other hand, there is also a substantial number of OSS projects that are dysfunctional in one or more of the following ways:

  • The development team behind the OSS project invests little time in its development, maintenance and support;
  • The development of the project has been altogether discontinued due to lack of commitment or motivation;
  • The documentation of the produced software is limited and/or of poor quality;
  • The source code contains few or low-quality comments, which makes studying and maintaining it challenging;
  • The community around the project is limited, and as such, questions asked by users receive late or no responses and identified defects either get repaired very slowly or are ignored altogether.

Consequently, developing new software systems by reusing existing open-source components raises challenges related to at least the following activities:

  • searching for candidate components;
  • evaluating a set of retrieved candidate components to find the most suitable one;
  • understanding how to use the selected components;
  • adapting the selected components to fit the specific requirements.
Scope

Eclipse CROSSMETER collects data from open-source repositories (code version management systems, issue trackers, continuous integration systems and natural-language discussion forums). This data is analyzed qualitatively and then stored in a knowledge base. The knowledge base is queried for specific answers when the programmer is confronted with (a) a design decision or (b) a code or design smell. CROSSMETER extends the Eclipse IDE with new interactions for code suggestions and problem detection, intended for several languages and starting with Java.

To this end, the following tools are in scope:

  • source code analysis tools to extract and store actionable knowledge from the source code of a collection of open-source projects;
  • natural language analysis tools to extract quality metrics related to the communication channels and bug tracking systems of OSS projects by using Natural Language Processing and text mining techniques;
  • system configuration analysis tools to gather and analyse system configuration artefacts and data to provide an integrated DevOps-level view of a considered open source project;
  • workflow-based knowledge extractors to simplify the development of bespoke analysis and knowledge extraction tools by contributing a framework that will shield engineers from technological issues and allow them to concentrate on the core analysis tasks instead;
  • cross-project relationship analysis tools to specify and manage, in a homogeneous manner, a wide range of open-source project relationships, such as dependencies and conflicts. The outcomes of the different CROSSMETER analysis tools will contribute to the definition of a knowledge base that supports multidimensional classifications of projects and enables applications such as the automated identification of complementary and competing projects, the detection of project incompatibilities, and the prediction of the future of given projects based on the evolution of other projects that had similar characteristics in the past;
  • extensions for the Eclipse IDE that will allow developers to use the CROSSMETER knowledge base and analysis tools directly from the development environment. The IDE extensions will also include features for monitoring the activity of developers while they work on a given OSS project. The IDE will thus issue alerts or recommendations and collect user feedback, which will help developers to improve their productivity. Depending on the context, recommendations can include suggested code snippets, patterns, automatic fixes for coding issues, suggestions to use alternative APIs or components, etc.
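
As an illustration of the cross-project relationship analysis listed above, the following is a minimal Python sketch of how dependency and conflict relations could be recorded and queried for incompatibilities. It is not the CROSSMETER knowledge base: the class `RelationshipBase`, its methods, and the project names are all hypothetical and chosen only for this example.

```python
from collections import defaultdict


class RelationshipBase:
    """Toy in-memory store of project relationships (hypothetical)."""

    def __init__(self):
        self.depends_on = defaultdict(set)  # project -> direct dependencies
        self.conflicts = set()              # unordered pairs of conflicting projects

    def add_dependency(self, project, dependency):
        self.depends_on[project].add(dependency)

    def add_conflict(self, a, b):
        self.conflicts.add(frozenset((a, b)))

    def transitive_dependencies(self, project):
        """Return every project reachable through dependency edges."""
        seen = set()
        stack = [project]
        while stack:
            current = stack.pop()
            for dep in self.depends_on.get(current, ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen

    def incompatibilities(self, project):
        """List conflicting pairs that both occur in the project's dependency closure."""
        closure = self.transitive_dependencies(project) | {project}
        return sorted(tuple(sorted(pair)) for pair in self.conflicts if pair <= closure)


kb = RelationshipBase()
kb.add_dependency("app", "lib-a")
kb.add_dependency("app", "lib-b")
kb.add_dependency("lib-a", "log-1")
kb.add_dependency("lib-b", "log-2")
kb.add_conflict("log-1", "log-2")
print(kb.incompatibilities("app"))
```

In this sketch, "app" never depends on the two logging libraries directly; the conflict only becomes visible once the transitive closure of its dependencies is computed, which is the kind of relationship a cross-project analysis is meant to surface.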
Description

Software engineers spend most of their time learning to understand the software they maintain or depend on (or will depend on). The goal of this learning process is to support decision-making. In this project, we focus on the increasing dependence on open-source software (OSS) over recent years and on the decisions related to depending on it. Eclipse CROSSMETER will support efficient and effective decision-making regarding dependence on OSS projects and components thereof. This entails decisions both on the architecture level (deciding to select an OSS project) and on the code level (designing the use of the OSS project). In particular, CROSSMETER will provide techniques and tools for extracting knowledge from existing open-source components, and will use such knowledge to properly select and reuse existing software to develop new systems. The activity of the developer will be continuously monitored in order to raise alerts related to the quality of the selected OSS projects and to give suggestions that can reduce the development effort and increase the quality of the final software products.

Figure 1 shows a high-level overview of the CROSSMETER approach. It comprises two major use cases and two minor user channels, which are implemented using two architectural stages: online and offline. The common use case features software engineers using their normal IDE, which is enhanced with decision-supporting information mined from OSS projects. The second use case is an advanced tool engineer developing bespoke analysis workflows that can make use of already available and mined data. Next to these two major IDE use cases, Figure 1 also features the release of the mined information via two online channels. The first is a normal analytics dashboard on the Web that discloses mined information to stakeholders other than the software engineers (such as project managers). The second online channel is the GitHub API, to which we will push information rather than pull it. OSS projects on GitHub can be tagged with useful information (e.g., the number of succeeded builds and tests reported by a continuous integration toolkit). CROSSMETER will publish the results of mining qualitative and quantitative information as GitHub project tags as well.

We describe the two major use cases here in some detail to clarify what CROSSMETER as a whole entails. We first explain the exceptional case of tool engineers extending the platform, and then the normal case of a software engineer using the CROSSMETER-enabled IDE:

  • In step 1, the tool engineers of Use case II use a special (graphical) editor in their IDE to compose new workflows of data sources and computations. This functionality is commonly available in big data analytics suites; here we specialize it for typical OSS project analysis tasks. This leads to the installation of a new bespoke analysis into the set of existing mining and analysis tools (step 2).
  • In step 2, mining tools will run incrementally, possibly on a remote server, to extract relevant information from a pre-configured set of projects and from a list of projects configured by the software engineers of Use case I.
  • In step 3, the software engineers of Use case I use a wizard to configure CROSSMETER with a rich set of requirements, which includes not only registering a set of projects of interest but also expressing preferences regarding the algorithms and processes used to project the mined information into the IDE. This configuration is an important step towards making meaningful assessment possible later, since it makes the context and preferences of the engineer explicit to the platform in terms of technological, quality, configuration, and licensing aspects.
  • Finally, in step 4, the acquired information is put into action, actively supporting the engineers via the IDE, managers via the website, and the open-source community via GitHub integration.

Figure 1: CROSSMETER approach at a glance
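
Steps 1 and 2 above (composing a workflow from a data source and computations, then running it incrementally over configured projects) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Workflow`, `IncrementalMiner`, the revision-based cache, and the sample commit data are hypothetical and do not reflect the actual CROSSMETER workflow tooling.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Workflow:
    """A data source followed by a chain of computations (hypothetical)."""
    source: Callable[[str], Any]
    steps: List[Callable[[Any], Any]] = field(default_factory=list)

    def then(self, step):
        self.steps.append(step)
        return self  # allow chaining: Workflow(...).then(a).then(b)

    def run(self, project):
        data = self.source(project)
        for step in self.steps:
            data = step(data)
        return data


@dataclass
class IncrementalMiner:
    """Re-runs a workflow only for projects whose revision has changed."""
    workflow: Workflow
    revision_of: Callable[[str], str]
    cache: Dict[str, Tuple[str, Any]] = field(default_factory=dict)
    runs: int = 0

    def mine(self, projects):
        results = {}
        for project in projects:
            revision = self.revision_of(project)
            cached = self.cache.get(project)
            if cached is None or cached[0] != revision:
                self.runs += 1
                cached = (revision, self.workflow.run(project))
                self.cache[project] = cached
            results[project] = cached[1]
        return results


# Made-up sample data standing in for a version control system.
commits = {"alpha": ["fix npe", "add feature"], "beta": ["fix typo", "fix build"]}
revisions = {"alpha": "r1", "beta": "r1"}

# Bespoke analysis: count commits whose message mentions "fix".
count_fixes = (
    Workflow(source=lambda project: commits[project])
    .then(lambda messages: [m for m in messages if "fix" in m])
    .then(len)
)
miner = IncrementalMiner(count_fixes, revision_of=lambda project: revisions[project])
print(miner.mine(["alpha", "beta"]))
```

Calling `miner.mine` a second time without changing `revisions` returns the cached results without re-running the workflow, which is one simple way to read "incrementally" in step 2.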

Typical example IDE services which may be introduced or enhanced using this architecture are:

  • Code assist - propose relevant code snippets, ranked by relevance and quality and informed by the earlier configuration;
  • Infer/Fix project setup - retrieve a list of ranked relevant reusable components, then set up relevant projects in the IDE and configure dependent projects to use them;
  • Monitoring of development activities of the engineers, who will be notified of relevant facts pertaining to their current task context.
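
To make the code assist service more concrete, here is a small Python sketch of ranking candidate snippets by relevance and quality while honouring the engineer's earlier configuration. Everything in it is an assumption made for illustration: the `Snippet` and `Preferences` types, the weighted-sum score, the default weights, and the license filter are not taken from CROSSMETER.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Snippet:
    code: str
    relevance: float  # assumed to be in [0, 1], e.g. similarity to the editing context
    quality: float    # assumed to be in [0, 1], e.g. derived from mined project metrics
    license: str


@dataclass(frozen=True)
class Preferences:
    """Stand-in for the configuration captured by the wizard (hypothetical)."""
    allowed_licenses: FrozenSet[str]
    relevance_weight: float = 0.6
    quality_weight: float = 0.4


def rank_snippets(snippets: List[Snippet], prefs: Preferences, limit: int = 3) -> List[Snippet]:
    """Drop snippets with disallowed licenses, then sort by weighted score."""
    eligible = [s for s in snippets if s.license in prefs.allowed_licenses]

    def score(s: Snippet) -> float:
        return prefs.relevance_weight * s.relevance + prefs.quality_weight * s.quality

    return sorted(eligible, key=score, reverse=True)[:limit]


candidates = [
    Snippet("a", relevance=0.9, quality=0.2, license="EPL-2.0"),
    Snippet("b", relevance=0.6, quality=0.9, license="MIT"),
    Snippet("c", relevance=1.0, quality=1.0, license="GPL-3.0"),
]
prefs = Preferences(allowed_licenses=frozenset({"EPL-2.0", "MIT"}))
print([s.code for s in rank_snippets(candidates, prefs)])
```

A weighted sum is only one possible scoring choice; the point of the sketch is that licensing and quality preferences from the configuration step can filter and reorder suggestions before they reach the editor.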

In short, CROSSMETER analyzes OSS projects offline, in the background, and employs the mined information online to support engineers directly in their decision-making tasks, through otherwise unobtrusive IDE features that are highly configurable and extensible.

Why Here?

Eclipse Foundation Europe GmbH is one of the consortium members of the EU H2020 CROSSMINER project (https://www.crossminer.org/). Together with the Eclipse representatives in the project, we will investigate how to apply the CROSSMINER analysis techniques to the Eclipse projects and repositories. Moreover, hosting CROSSMETER as an Eclipse project will help to make it more sustainable, especially once the EU funding ends. In particular, we believe that having CROSSMETER as an Eclipse project as soon as possible will contribute to creating a community of (Eclipse) users, which in turn will benefit the adoption of the envisioned technologies and their novelties.

Future Work

Future work includes any facilities related to the scope of the project, including, but not limited to:

  • Source code analysis
  • Configuration analysis
  • Natural language processing techniques applied to project communication channels

Project Scheduling

M12: Eclipse CROSSMETER platform and methodology - initial version

Description: Interim versions of the dependency inference and analysis component, of the text representation system, and of the workflow modeling component have been delivered. Initial versions of the Eclipse-based IDE, of the Web-based dashboards and of the tool for the unsupervised classification of OSS projects have also been developed.

M18: Eclipse CROSSMETER platform and methodology - interim version

Description: Initial versions of the tools for analysing natural language sources, of the knowledge base, and of the dependency inference and analysis component have been delivered. Interim versions of the developer activity monitoring, IDE integration services, system configuration analysis tools, and of the CROSSMETER platform have been delivered. Interim versions of the workflow development tool and of the engine for supporting parallel and distributed workflows have also been developed.

M24: Eclipse CROSSMETER platform and methodology - second interim version

Description: Final versions of the dependency inference and analysis component, the text representation system, and the tools for analysing natural language sources and for the unsupervised classification of OSS projects have been developed. Interim versions of all use-case demonstrators have been developed and the respective evaluation reports have been delivered.

M30: Eclipse CROSSMETER platform and methodology - second version

Description: The API analysis component, the tool for mining documentation and code snippets, the DevOps Dashboard, and the integration with GitHub have been developed. Final versions of the workflow development tools, the workflow execution engine, the Eclipse-based IDE, the Web-based dashboards, and the knowledge base have been delivered.

Committers
Philippe Krief (This committer does not have an Eclipse Account)
Jose Manrique Lopez de la Fuente (This committer does not have an Eclipse Account)
Boris Baldassari (This committer does not have an Eclipse Account)
Interested Parties

Software Heritage, The Open Group (UK), University of York (UK), University of L’Aquila (Italy), Edge Hill University (UK), Centrum Wiskunde & Informatica (Netherlands), Athens University of Economics and Business (Greece), Unparallel (Portugal), Softeam (France), Frontendart (Hungary), Bitergia (Spain), OW2 (France)