This proposal has been approved and the ChemClipse project has been created.

ChemClipse

Basics
This proposal is in the Project Proposal Phase (as defined in the Eclipse Development Process) and is written to declare its intent and scope. We solicit additional participation and input from the community. Please log in and add your feedback in the comments section.
Parent Project: 
Background: 

Software has become an important part of evaluating scientific data, and open source software plays a vital role because it supports collaborative work. Over the years, several software projects have appeared that address problems of a specific scope, in fields ranging from biology, geology, bioinformatics and physics to linguistics and epidemiology. Several applications focus on chemistry, but only a few have taken on the challenge of handling data from analytical instruments. Among these instruments are systems for chromatography and mass spectrometry, which are most commonly used in analytical chemistry. The two techniques are often used in combination, e.g. in forensics, quality control or medical research. A bottleneck for scientific discovery is the variety of instruments in use: each instrument vendor offers its own software package and data format. That makes it hard, and in most cases impossible, to evaluate data sets independently of the vendor software. It also prevents new insights from being drawn from existing data records and prevents the measured data from being handled in a uniform way.

Scope: 

Chromatography and mass spectrometry are key technologies in almost every field of analytical chemistry, for example quality control and forensic science. ChemClipse addresses the management of data sets from chromatography and mass spectrometry systems. It does not offer instrument control capabilities; rather, it provides supporting software to import the recorded raw data, optimize and run evaluations, and export the results in various formats. Its flexible and modular approach adds new functionality to the hardware. Moreover, it offers a rich graphical user interface that aims to make data evaluation as intuitive as possible for scientists in these fields.

Description: 

ChemClipse helps users analyse data acquired from systems used in analytical chemistry. In particular, gas chromatography coupled with mass spectrometry (GC/MS) or flame-ionization detectors (GC/FID) is used to identify and/or monitor chemical substances. This is an important task, e.g. in quality control. Groceries, for example, are under strict control: producers, traders and retailers try to ensure that groceries do not contain harmful chemical substances. The presence or absence of such substances is determined using, among other techniques, GC/MS or GC/FID. Nevertheless, evaluating the data sets recorded by the instruments requires some experience. Hence, ChemClipse supports chemists in evaluating the analytical data sets and creating reports. Moreover, it offers a rich set of functions for editing the data sets as well as an easy-to-use GUI. ChemClipse is based on the Eclipse Application Platform. It currently uses Eclipse 4.5M3 in mixed mode and is built using Maven/Tycho 0.21.0. Its main functionality is listed below:

 

  • Converter (import and/or export of raw data sets)

  • Classifier (non-destructive methods to extract characteristic values)

  • Filter (destructive methods to optimize the data sets)

  • Peak detection (finding peaks – each peak is a chemical substance)

  • Chromatogram/Peak integration (calculation of the chromatogram/peak area)

  • Identification (identification of each peak mass spectrum)

  • Quantitation (use the data for calibration purposes)

  • Reporting (report the results for further analytical steps)

  • Processing (automation of the data handling)
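
To make the peak detection and peak integration steps in the list above more concrete, here is a minimal sketch in plain Java. It is not ChemClipse code and does not reflect its actual algorithms: the class PeakSketch, its methods, and the simple threshold rule are assumptions chosen for illustration. A peak is taken to be a contiguous run of points above a fixed threshold, and its area is estimated with the trapezoidal rule, without baseline correction.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; names and logic are not taken from ChemClipse. */
final class PeakSketch {

    /** Inclusive index range of a detected peak within the signal arrays. */
    static final class Peak {
        final int start;
        final int stop;

        Peak(int start, int stop) {
            this.start = start;
            this.stop = stop;
        }
    }

    /** Assumed rule: a peak is a contiguous run of points strictly above the threshold. */
    static List<Peak> detectPeaks(double[] intensity, double threshold) {
        List<Peak> peaks = new ArrayList<>();
        int start = -1; // -1 means "currently not inside a peak"
        for (int i = 0; i < intensity.length; i++) {
            boolean above = intensity[i] > threshold;
            if (above && start < 0) {
                start = i;
            } else if (!above && start >= 0) {
                peaks.add(new Peak(start, i - 1));
                start = -1;
            }
        }
        if (start >= 0) {
            // The signal ended while still above the threshold.
            peaks.add(new Peak(start, intensity.length - 1));
        }
        return peaks;
    }

    /** Trapezoidal estimate of the area under the signal between peak start and stop. */
    static double integrate(double[] time, double[] intensity, Peak peak) {
        double area = 0.0;
        for (int i = peak.start; i < peak.stop; i++) {
            area += 0.5 * (intensity[i] + intensity[i + 1]) * (time[i + 1] - time[i]);
        }
        return area;
    }

    public static void main(String[] args) {
        // Synthetic data: one triangular peak centred at time 3.
        double[] time = {0, 1, 2, 3, 4, 5, 6, 7};
        double[] intensity = {0, 0, 2, 4, 2, 0, 0, 0};
        for (Peak peak : detectPeaks(intensity, 1.0)) {
            System.out.println("peak " + peak.start + ".." + peak.stop
                    + " area=" + integrate(time, intensity, peak));
        }
    }
}
```

Real chromatographic data would normally call for noise handling and baseline correction before integration; the sketch leaves both out to keep the two steps visible.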

 

Due to its flexible approach, each of these functions can be extended by plug-ins. For this, ChemClipse uses the Eclipse extension point mechanism, which makes it well suited for scientists, students and other interested people who want to write their own extensions. The data model is designed so that extension authors can focus on the methods needed for their own extension. Moreover, the graphical user interface can be extended with additional UI parts as well.
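
The idea behind such an extension can be sketched in plain Java. In a real Eclipse plug-in the contribution would be declared in plugin.xml and looked up through the platform's extension registry. The stand-in below replaces that registry with a simple map so that the example runs on its own. All names here (FilterExtensionSketch, ChromatogramFilter, FilterRegistry, MinimumBaselineFilter, the id "baseline.min") are hypothetical and are not part of the ChemClipse API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these types exist in ChemClipse. */
final class FilterExtensionSketch {

    /** Assumed contract that a filter contribution would implement. */
    interface ChromatogramFilter {
        double[] apply(double[] intensity);
    }

    /** Stand-in for the extension registry: maps a contribution id to its implementation. */
    static final class FilterRegistry {
        private final Map<String, ChromatogramFilter> filters = new LinkedHashMap<>();

        void register(String id, ChromatogramFilter filter) {
            if (filters.putIfAbsent(id, filter) != null) {
                throw new IllegalArgumentException("Duplicate filter id: " + id);
            }
        }

        double[] run(String id, double[] intensity) {
            ChromatogramFilter filter = filters.get(id);
            if (filter == null) {
                throw new IllegalArgumentException("Unknown filter id: " + id);
            }
            // Hand the filter a copy so the caller's array is left untouched.
            return filter.apply(intensity.clone());
        }
    }

    /** Example contribution: subtracts the minimum intensity as a crude baseline. */
    static final class MinimumBaselineFilter implements ChromatogramFilter {
        @Override
        public double[] apply(double[] intensity) {
            double min = Double.POSITIVE_INFINITY;
            for (double value : intensity) {
                min = Math.min(min, value);
            }
            double[] result = new double[intensity.length];
            for (int i = 0; i < intensity.length; i++) {
                result[i] = intensity[i] - min;
            }
            return result;
        }
    }

    public static void main(String[] args) {
        FilterRegistry registry = new FilterRegistry();
        registry.register("baseline.min", new MinimumBaselineFilter());
        double[] filtered = registry.run("baseline.min", new double[] {5.0, 7.0, 6.0});
        System.out.println(java.util.Arrays.toString(filtered));
    }
}
```

The point of the design is that the host application only knows the interface and an id; new filters can be added without changing the code that runs them.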

Figure 1 – An overview of a chromatogram data set recorded with a mass selective detector.

 

Figure 2 – Running a principal component analysis (PCA) on chromatographic data.

 

Why Here?: 

The Eclipse Foundation is the right place for ChemClipse to collaborate because of the Science Working Group. The Eclipse Foundation in general, and the Science Working Group in particular, offer great opportunities to collaborate with other projects and to find new approaches to data evaluation. Moreover, combining software from different scientific fields offers chances to make serendipitous discoveries.

Project Scheduling: 

The initial contribution will be made in quarter one of 2015 and the first release will happen by the end of quarter two of 2015.

Future Work: 

Planned work:

1. Re-implementation of mzXML, mzML and mzData converters

2. Improvements of the data handling for Triple-Quad and QTOF instruments

3. Improvements for the data handling of high-resolution mass spectrometry systems

4. Improvements of the data handling for flame-ionization data (FID)

5. Support for new detector types like diode-array detectors (DAD)

6. Performance improvements

People
Project Leads: 
Committers: 
Janos Binder
Interested Parties: 

Science Working Group

Lablicate UG (haftungsgeschränkt)

Source Code
Initial Contribution: 

The source code is currently hosted at SourceForge. It will be migrated to the Eclipse Foundation after acceptance of this proposal.

 

http://sourceforge.net/projects/chemclipse3rdpl

http://sourceforge.net/projects/chemclipse

http://sourceforge.net/projects/openchromplugins

Source Repository Type: 

Comment from Jay Billings:

I strongly support this project and recommend that it be accepted. It will make a fine addition to the Science Working Group. I have two small suggestions. First, you should search Nebula and Orbit to see how many of your dependencies are already available. For example, SWT XY GRAPH is in Nebula and many Apache Commons packages are in Orbit. Doing this will reduce the amount of time that the IP team needs to review your initial contribution and help you hit your proposed release schedule. Second, how does this project interact with other projects? For example, are you collaborating with DAWNSci in some way? This information isn't required, but it would be good to see so that we can track collaborations and reuse.