Word processors and spreadsheets remain the most widely used tools in many organizations. The documents they produce often describe business data that needs to be manipulated in other contexts. Examples of data contained in such documents include the following:
Because these tools often produce plain text documents, developing a dedicated parser that turns them into output more amenable to further manipulation is typically complex and time-consuming.
Some organizations are currently investing effort in publishing specifications of open file formats, for example Office Open XML (e.g., docx, xlsx...) and OpenDocument (e.g., odt, odf...), to facilitate widespread adoption and easier consumption.
In fact, most business documents organize their data in a common way (top down, for example, in text documents) that can be recognized through text styles, regular expressions, and column numbering. It is therefore possible to provide a generic solution that parses those documents and transforms the business data into EMF models, using XML parsing and EMF's reflective capabilities.
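To make the idea concrete, here is a minimal sketch of style- and regex-driven extraction, written in Python with only the standard library. It is an illustration under stated assumptions, not the actual implementation: the sample XML fragment, the rule names (`OBJECT_STYLE`, `ATTRIBUTE_RE`), and the helper functions are all hypothetical, and plain dictionaries stand in for the EMF objects that a real solution would populate reflectively.

```python
import re
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by Office Open XML text documents.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical fragment of word/document.xml; a real file would be read
# from the .docx ZIP archive.
SAMPLE = f"""
<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Customer</w:t></w:r></w:p>
    <w:p><w:r><w:t>Name: Alice Martin</w:t></w:r></w:p>
    <w:p><w:r><w:t>Country: France</w:t></w:r></w:p>
    <w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>Customer</w:t></w:r></w:p>
    <w:p><w:r><w:t>Name: Bob </w:t></w:r><w:r><w:t>Jones</w:t></w:r></w:p>
  </w:body>
</w:document>
""".strip()

# Hypothetical mapping rules: a paragraph style starts a new object, and a
# regular expression turns a body paragraph into an attribute/value pair.
OBJECT_STYLE = "Heading1"
ATTRIBUTE_RE = re.compile(r"^(?P<name>\w+):\s*(?P<value>.+)$")


def paragraph_style(p):
    """Return the paragraph's style name, or None if it has no explicit style."""
    style = p.find("w:pPr/w:pStyle", NS)
    return style.get(f"{{{W_NS}}}val") if style is not None else None


def paragraph_text(p):
    """Join the text of all runs, since one paragraph may span several runs."""
    return "".join(t.text or "" for t in p.iterfind(".//w:t", NS))


def extract_objects(xml_text):
    """Read paragraphs top down and build generic objects from them."""
    objects = []
    current = None
    for p in ET.fromstring(xml_text).iterfind(".//w:p", NS):
        text = paragraph_text(p).strip()
        if paragraph_style(p) == OBJECT_STYLE:
            # A styled paragraph opens a new object of the named type.
            current = {"type": text, "attributes": {}}
            objects.append(current)
            continue
        match = ATTRIBUTE_RE.match(text)
        if current is not None and match:
            # Dictionary assignment stands in for a reflective set on a
            # model object (an assumption made for this sketch).
            current["attributes"][match["name"]] = match["value"]
    return objects
```

Calling `extract_objects(SAMPLE)` walks the paragraphs in document order, so each attribute is attached to the most recent heading-styled paragraph. Keeping the rules (style name and regular expression) separate from the traversal is what would let the same parser be reused across document layouts.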