AsciiDoc has gained traction as a preferred choice for technical writing because it’s expressive, author-friendly, and tool-agnostic. The AsciiDoc community has asserted that a specification for AsciiDoc is needed to solidify the ecosystem’s current foundation. We anticipate a specification will also provide pathways for new capabilities that adapt the language to the ever-changing technology landscape. The goal of this project is to produce that specification and its artifacts.
The AsciiDoc Language project defines and maintains the AsciiDoc Language Specification and Technology Compatibility Kit (TCK), its artifacts, and the corresponding language and API documentation. The AsciiDoc Language Specification describes the syntax and grammar, Abstract Semantic Graph (ASG), Document Object Model (DOM), referencing system, and APIs for processing, converting, and extending the language. The TCK is used to verify and certify that an AsciiDoc processor implementation is compatible with this specification.
Specifically, the project scope includes the:
- AsciiDoc language syntax and grammar (e.g., EBNF)
- doctype structure and objects
- ASG: namely the encoded form for use in the TCK (e.g., JSON)
- TCK: Technology Compatibility Kit for the AsciiDoc language
- DOM API: in-memory semantic representation of the encoded information
- Processor API (load, convert)
- Converter API
- Extension API
- Extended syntax processors (e.g., custom block or macro)
- Resolvers (e.g., path and attribute resolvers, ID generator)
- Parse events and lifecycle interceptors (e.g., input processor, output processor, tree processor)
- Integration adapters: syntax highlighter, STEM, bibliography, docinfo
- Expected converter behaviors (e.g., toc, ID generation, icon type, safe mode)
- Internal and external referencing system (e.g., xrefs, includes, images)
- Reference converter and output format (e.g., HTML w/ reference stylesheet, DocBook)
- Built-in attributes and reserved attribute namespaces
- AsciiDoc media type (MIME) and .adoc file extension
The project also provides the:
- AsciiDoc language documentation for writers
- AsciiDoc API documentation
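To make the role of the ASG and TCK in the scope above more concrete, here is a minimal Python sketch of how a compatibility check could work: a processor emits the ASG for an input document as JSON, and a harness compares it with the expected ASG. The JSON shape (`name`, `type`, `blocks`, `inlines`, `value`) and the function `check_compatible` are assumptions made purely for illustration; this proposal does not define them.

```python
import json

# Hypothetical expected ASG for the input "Hello, AsciiDoc!".
# The node and field names are illustrative assumptions only.
EXPECTED_ASG = json.loads("""
{
  "name": "document",
  "type": "block",
  "blocks": [
    {
      "name": "paragraph",
      "type": "block",
      "inlines": [{"name": "text", "type": "string", "value": "Hello, AsciiDoc!"}]
    }
  ]
}
""")


def check_compatible(actual_json: str, expected: dict) -> bool:
    """Return True if the processor's JSON-encoded ASG equals the expected ASG."""
    return json.loads(actual_json) == expected


# Simulated output of a processor under test (key order does not matter).
actual = json.dumps({
    "type": "block",
    "name": "document",
    "blocks": [{
        "type": "block",
        "name": "paragraph",
        "inlines": [{"type": "string", "name": "text", "value": "Hello, AsciiDoc!"}],
    }],
})

print(check_compatible(actual, EXPECTED_ASG))
```

Comparing parsed structures rather than raw strings keeps the check independent of key order and whitespace in the processor's output.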
AsciiDoc is a comprehensive, semantic markup language for producing a variety of presentation-rich output formats from content encoded in a concise, human-readable, plain text format. It also includes a set of APIs for transforming the encoded content, extending the syntax/grammar and processor lifecycle, and integrating with tools and publishing platforms. Teams and individuals use AsciiDoc to write product documentation, technical specifications, architectural guides, scientific and analytical reports, academic courses and training materials, books, and other technical communication.
The AsciiDoc Language isn’t coupled to the output format it produces. Software that implements the AsciiDoc Language Specification can parse and comprehend AsciiDoc and convert the parsed document structure to one or more output formats, such as HTML, PDF, EPUB, man page, and DocBook. The ability to produce multiple output formats allows AsciiDoc to be used in static site generators, IDEs, git tools and services, CI/CD systems, and other software.
AsciiDoc bridges the gap between ease of writing and the rigorous requirements of technical authoring and publishing.
AsciiDoc is used across a spectrum of industries and communities, many of which are associated with or members of the Eclipse Foundation. Being co-located with so many groups that are invested in AsciiDoc will provide a neutral and diverse forum for collaborating on and improving the language, its software, and related initiatives. Additionally, the Eclipse Foundation’s values of open source, transparency, and vendor neutrality are of the utmost importance to AsciiDoc and its community.
The .adoc extension was once associated with a now obsolete file format. Otherwise, we don’t know of any legal issues at this time.
The initial contributions are expected to be ready in Q2 2020. Once the initial contributions are accepted and the project infrastructure and team process are established, the plan is to iterate on the specification and TCK in coordination with the compatible implementation project(s). The goal of the first stable version of the specification is to match the AsciiDoc Language as described by Asciidoctor 2.0.x as closely as possible, to minimize syntax and structure impacts on active AsciiDoc documents, but without propagating deprecations.
Future functionality and activities will be driven by community feedback and their requirements. Proposed specification advancements could include:
- defining syntax patterns for common, stable content models (e.g., tabbed blocks)
- exploring accessibility functionality
- improving integration with compatible tooling
- adapting to the latest output format specifications and related web browser and output standards
- providing additional doctypes to accommodate the needs of other types of technical writing
- implementing language server protocol support for AsciiDoc
Comments
Here's my take.
Submitted by Philippe Proulx on Mon, 2020-04-27 22:32
Here's my take.
I use AsciiDoc to document:
This is my bias.
Missing semantics
Graeme Smecher wrote on the mailing list:
Having this impedance reduced is also my principal ambition.
I want AsciiDoc to offer as much semantic markup as possible while remaining as lightweight as possible (otherwise I'd just write DocBook directly).
Considering this, here's the list of DocBook tags for which equivalent markup is missing from AsciiDoc (as far as I know) for my use cases:
abbrev
acronym
date
firstterm
replaceable
see
seealso
termdef
wordasword
revdescription
revhistory
revisiondocument
revnumber
revremark
procedure
result
step
stepalternatives
substeps
task
taskprerequisites
taskrelated
tasksummary
citerefentry
manvolnum
refentrytitle
command
database
envar
errorcode
errorname
errortext
errortype
filename
markup
menuchoice
msg
msgaud
msgentry
msgexplan
msginfo
msglevel
msgmain
msgorig
msgrel
msgset
msgsub
msgtext
optional
package
prompt
property
screenshot
synopsis
systemitem
uri
userinput
accel
guiicon
guilabel
mousebutton
arg
cmdsynopsis
command
option
sbr
classname
classsynopsis
classsynopsisinfo
constant
constructorsynopsis
exceptionname
fieldsynopsis
funcdef
funcparams
funcprototype
funcsynopsis
funcsynopsisinfo
function
initializer
interfacename
methodname
methodparam
methodsynopsis
modifier
ooexception
oointerface
paramdef
parameter
returnvalue
symbol
tag
token
type
varargs
variablelist
varname
void
I get that for many inline elements, you can use hash symbols with a custom class:
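As a rough illustration of that idea (not the commenter's own example), the Python sketch below assumes the inline form `[.role]#text#` and shows how a converter could map such a role to a DocBook-style element. The regular expression, the function name `roles_to_elements`, and the set of recognized roles are hypothetical.

```python
import re

# Matches the inline form [.role]#text#; a simplified pattern for illustration.
ROLE_SPAN = re.compile(r"\[\.([A-Za-z][\w-]*)\]#([^#]+)#")

# Hypothetical subset of roles treated as semantic elements.
SEMANTIC_ROLES = {"filename", "command", "envar"}


def roles_to_elements(line: str) -> str:
    """Rewrite [.role]#text# spans as <role>text</role> for known roles."""
    def replace(match: re.Match) -> str:
        role, text = match.group(1), match.group(2)
        if role in SEMANTIC_ROLES:
            return f"<{role}>{text}</{role}>"
        return match.group(0)  # leave unknown roles untouched
    return ROLE_SPAN.sub(replace, line)


print(roles_to_elements("Edit [.filename]#/etc/hosts# and run [.command]#ping#."))
```

Because the role is only a class name, the mapping to a semantic element lives entirely in the converter, which is the gap the comment points at.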
Is this the intention? If so, it's still not specified and is left up to the writer. I suggest formalizing this with a syntax other than the class attribute, for example:
Syntax improvements
Here are a few syntax improvement suggestions, in order of importance for me.
List item continuation
As a tech writer, what I use the most outside paragraphs are lists: unordered, ordered, and description.
Those lists often contain items which can get rather complex. I've always had a hard time dealing with list item continuation in AsciiDoc. I find the + syntax is annoying at best. Sure you can use open blocks, but you can't nest them:
Also, I find the "unnesting" syntax, where the number of newlines above the following + on a single line indicates how many levels to go back, very confusing:
Is this readable to you?
This issue (for me at least) includes the syntax to nest lists, where you use more *, more ., or more : depending on the list type when not using open blocks:
I know AsciiDoc does not usually rely on indentation, but what I'm suggesting is to make an exception here, at least optionally, to nest lists and to continue list items, just like Markdown does. There might be limitations, but as far as I know there are none.
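The indentation idea above can be sketched in Python as a small preprocessor that turns an indentation-nested unordered list into the existing marker-depth form. It assumes two spaces per level and `-` as the indented list marker; both choices, and the function name `indent_to_markers`, are hypothetical and not part of any AsciiDoc processor.

```python
def indent_to_markers(lines: list[str], indent: int = 2) -> list[str]:
    """Convert '- item' lines nested by indentation into '*', '**', ... markers."""
    result = []
    for line in lines:
        stripped = line.lstrip(" ")
        if not stripped.startswith("- "):
            result.append(line)  # not a list item; pass through unchanged
            continue
        # Depth 1 for no indentation, one more level per `indent` spaces.
        depth = (len(line) - len(stripped)) // indent + 1
        result.append("*" * depth + " " + stripped[2:])
    return result


source = [
    "- Install the package",
    "  - Download the archive",
    "    - Verify the checksum",
    "- Run the tool",
]
print("\n".join(indent_to_markers(source)))
```

This only covers nesting of unordered items; continuing an item with further blocks by indentation would need the parser itself to track indentation.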
Here are the two previous examples reformatted to use indentation to nest and continue items:
Nested open blocks
As mentioned above, you can't nest open blocks.
The suggested solution in the GitHub issue is to use ~~~~ to delimit open blocks, adding more ~ to nest them:
While this at least provides a solution, why not use a dedicated closing delimiter instead?
Here's an example, reusing the -- delimiter we know to begin an open block:
There might be forms that are more visually appealing. For example, using < to open and > to close on single lines:
In fact, why not use this strategy for any nestable block?
Block title
According to Title:
Example:
Sometimes the block title can be long, especially for example titles.
I therefore suggest having a way to continue the title on the following line(s). For example, using a single space on the following lines:
Dedicated non-breaking space and hyphen shorthands
I often need non-breaking spaces. I use them between:
and more.
You can use {nbsp} to write a non-breaking space and ‑ to write a non-breaking hyphen.
I suggest having built-in shorthands for both of them. LaTeX uses ~ for a non-breaking space.
Macros
AsciiDoc (Python) has macros and attributes while Asciidoctor has extensions (Ruby/Java/JavaScript) and attributes.
Should the AsciiDoc specification include an official macro language?
What I mean by macro is a template of AsciiDoc content with variable placeholders. The expanded macro can become block content or inline content.
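A minimal Python sketch of this idea, using the standard library's `string.Template` for placeholder substitution. The macro names `product` and `since`, their placeholders, and the `expand_macro` helper are hypothetical; the proposal does not define a macro language.

```python
from string import Template

# Hypothetical macro definitions: macro name -> template of AsciiDoc content.
MACROS = {
    # Expands to inline content (bold name followed by a version).
    "product": Template("*$name* $version"),
    # Expands to block content (an admonition paragraph).
    "since": Template("NOTE: Available since version $version."),
}


def expand_macro(macro_name: str, params: dict[str, str]) -> str:
    """Expand the named macro, substituting its placeholders from params."""
    return MACROS[macro_name].substitute(params)


print(expand_macro("product", {"name": "Widget", "version": "2.0"}))
print(expand_macro("since", {"version": "2.0"}))
```

`substitute` raises `KeyError` when a placeholder has no value, which is one possible answer to how a missing macro argument should be handled.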
Here's a fictitious example:
Here's another example for block content:
Re: Here's my take.
Submitted by Philippe Proulx on Thu, 2020-04-30 16:45
To add to this: I thought I was commenting on the specification proposal here, but now I understand these are supposed to be project proposal comments.
So I might post this comment again at the appropriate location when the specification draft takes form.
Re: Re: Here's my take.
Submitted by Dan Allen on Thu, 2020-05-28 18:16
Thank you for taking the time to share this input. Indeed, these points are best suited for the AsciiDoc specification list once this proposal is approved and the mailing list is up and running.
I do want to emphasize that the focus of this spec is not on creating a new language with new syntax, but rather to standardize and evolve (within reason) the existing syntax. We can and should address matters of semantics, but we're not aiming to fundamentally alter the syntax, such as changing the fences for delimited blocks (aside from the open block issue). An AsciiDoc document written today should still continue to work with the standard processor. Just something to keep in mind when we discuss enhancements to the syntax.
Scope ideas from an IDE perspective
Submitted by Alexander Schwartz on Thu, 2020-04-30 17:46
I'm the current maintainer of the AsciiDoc IntelliJ plugin, and I'm taking the perspective of an IDE-plugin developer for this comment.
Please reply and let me know if you second any of these ideas for the scope of the proposals, or if you consider them part of the existing proposal.
From an IDE perspective I'd like to see the following elements to be part of the scope:
ad 1: The meta-information at runtime should include built-in and active extensions for a list of available macros and attributes. Each macro and attribute should provide a textual self-description for a (human) writer. Each macro (extension or built-in) should provide a list of supported attributes. Each attribute should provide a sample and default value and possibly also a type so that the IDE can trigger auto-complete for file names, IDs, etc.
ad 2: The Asciidoctor HTML output already implements it by adding additional data attributes to some HTML tags, but doesn't attach it to all tags. It is currently based on line level. Future implementations could provide line and column information. Sourcemaps would be a method of implementation, but might depend on the output.
ad 3: I assume this is covered by either "Internal and external referencing system" or the Extension API "(path) resolvers"; I just want to make sure either of them can be used for this.
ad 4: When retrofitting some Antora style behavior to Asciidoctor Ruby I used "prepend" to monkey-patch some of the necessary functionality. With a mechanism as described above this would not have been necessary.
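The runtime meta-information described in "ad 1" could look roughly like the following Python sketch. Every class, field, and value here (`AttributeInfo`, `MacroInfo`, `complete_attribute_names`, the descriptions and samples) is a hypothetical illustration of the idea, not an existing Asciidoctor or AsciiDoc API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AttributeInfo:
    name: str
    description: str                # textual self-description for a human writer
    sample: str                     # sample value to show in the IDE
    default: Optional[str] = None   # default value, if any
    value_type: str = "string"      # e.g. "string", "file", "id"; selects a completer


@dataclass
class MacroInfo:
    name: str
    description: str
    builtin: bool                   # True for built-ins, False for extensions
    attributes: list[AttributeInfo] = field(default_factory=list)


def complete_attribute_names(macro: MacroInfo, prefix: str) -> list[str]:
    """Return the macro's attribute names that start with the typed prefix."""
    return sorted(a.name for a in macro.attributes if a.name.startswith(prefix))


# Illustrative entry for the image macro; descriptions and samples are made up.
image = MacroInfo(
    name="image",
    description="Inserts an image.",
    builtin=True,
    attributes=[
        AttributeInfo("alt", "Alternative text.", sample="A sunset"),
        AttributeInfo("align", "Horizontal alignment.", sample="center"),
        AttributeInfo("width", "Display width.", sample="300"),
    ],
)
print(complete_attribute_names(image, "a"))
```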
Re: Scope ideas from an IDE perspective
Submitted by Dan Allen on Thu, 2020-05-28 18:30
Yes, I consider this part of the AST / DOM. Unlike the existing AsciiDoc processors, a standard processor should be able to capture and make available all the information about the parsed document. We'll need to work out how all that is stored, but it needs to be in there somewhere.
Yes, source-level information will be available for each parsed node, and perhaps even lower-level than that.
Great idea. That will come into play when we get into extensions. (We're not sure yet whether extensions will be in the main language spec or a supplemental spec).
This is definitely a detail that the extensions part of the spec (or supplemental spec) will need to address. There are two concerns here: one is about extension hierarchy and one is about extension ordering relative to one another. Asciidoctor gives us some direction here, though we need to address where it leaves ambiguity.
In general, just having source information for each node will help a lot here. I do like the idea that nodes self identify as having content that should be considered / processed by a spell checker...or perhaps something more high level. Certainly a great idea to consider.
All in all, the main point to keep in mind is that one of the key goals is to parse the language fully. When AsciiDoc began, it was a streaming processor that offered no access to a parsed document. Asciidoctor introduced a document model and parsed down to the block level. The standard language will require that mapping to be complete down to the lowest reasonable level, certainly inline nodes and maybe even characters.
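To illustrate what a mapping down to inline nodes with source information could offer a tool, here is a small Python sketch. The `Node` and `Location` types, the node names, and `find_node_at` are assumptions for illustration only; no such model had been specified at the time of this discussion.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Location:
    line: int       # 1-based line in the source
    col_start: int  # 1-based, inclusive
    col_end: int    # 1-based, inclusive


@dataclass
class Node:
    name: str
    location: Location  # simplified: each node spans a single line
    children: list["Node"] = field(default_factory=list)


def find_node_at(node: Node, line: int, col: int) -> Optional[Node]:
    """Return the deepest node whose location covers the given position."""
    loc = node.location
    if loc.line != line or not (loc.col_start <= col <= loc.col_end):
        return None
    for child in node.children:
        hit = find_node_at(child, line, col)
        if hit is not None:
            return hit
    return node


# Hypothetical parse of the single source line: Use *bold* text.
paragraph = Node("paragraph", Location(1, 1, 16), [
    Node("text", Location(1, 1, 4)),
    Node("strong", Location(1, 5, 10)),
    Node("text", Location(1, 11, 16)),
])
print(find_node_at(paragraph, 1, 7).name)
```

With block-level information only, the same lookup could return nothing more specific than the paragraph; inline-level locations are what let an editor map a cursor position to the exact construct.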
One request
Submitted by Phil Beauvoir on Fri, 2020-07-03 03:56
Just please don't rename it to "Eclipse AsciiDocium".