=====================================
 Docutils Frequently Asked Questions
=====================================

:Date: $Date: 2003/06/09 21:26:29 $
:Web site: http://docutils.sourceforge.net/
:Copyright: This document has been placed in the public domain.

.. Please note that until there's a Q&A-specific construct available,
   this FAQ will use section titles for questions.  Therefore
   questions must fit on one line.  The title may be a summary of the
   question, with the full question in the section body.


.. contents::
.. sectnum::


This is a work in progress.  Please feel free to ask questions and/or
provide answers; `send email`__ to the `Docutils-Users mailing
list`__.  Project members should feel free to edit the source text
file directly.

.. _let us know:
__ mailto:docutils-users@lists.sourceforge.net
__ http://lists.sourceforge.net/lists/listinfo/docutils-users


Docutils
========

What is Docutils?
-----------------

Docutils_ is a system for processing plaintext documentation into
useful formats, such as HTML, XML, and TeX.  It supports multiple
types of input, such as standalone files (implemented), inline
documentation from Python modules and packages (under development),
`PEPs (Python Enhancement Proposals)`_ (implemented), and others as
discovered.

For an overview of the Docutils project implementation, see `PEP
258`_, "Docutils Design Specification".

Docutils is implemented in Python_.

.. _Docutils: http://docutils.sourceforge.net/
.. _PEPs (Python Enhancement Proposals):
   http://www.python.org/peps/pep-0012.html
.. _PEP 258: spec/pep-0258.html
.. _Python: http://www.python.org/


Why is it called "Docutils"?
----------------------------

Docutils is short for "Python Documentation Utilities".  The name
"Docutils" was inspired by "Distutils", the Python Distribution
Utilities architected by Greg Ward, a component of Python's standard
library.

The earliest known use of the term "docutils" in a Python context was
a `fleeting reference`__ in a message by Fred Drake on 1999-12-02 in
the Python Doc-SIG mailing list.  It was suggested `as a project
name`__ on 2000-11-27 on Doc-SIG, again by Fred Drake, in response to
a question from Tony "Tibs" Ibbs: "What do we want to *call* this
thing?".  This was shortly after David Goodger first `announced
reStructuredText`__ on Doc-SIG.

Tibs used the name "Docutils" for `his effort`__ "to document what the
Python docutils package should support, with a particular emphasis on
documentation strings".  Tibs joined the current project (and its
predecessors) and graciously donated the name.

For more history of reStructuredText and the Docutils project, see `An
Introduction to reStructuredText`_.

Please note that the name is "Docutils", not "DocUtils" or "Doc-Utils"
or any other variation.

.. _An Introduction to reStructuredText: spec/rst/introduction.html
__ http://mail.python.org/pipermail/doc-sig/1999-December/000878.html
__ http://mail.python.org/pipermail/doc-sig/2000-November/001252.html
__ http://mail.python.org/pipermail/doc-sig/2000-November/001239.html
__ http://homepage.ntlworld.com/tibsnjoan/docutils/STpy.html


Is there a GUI authoring environment for Docutils?
--------------------------------------------------

DocFactory_ is under development.  It uses wxPython and looks very
promising.

.. _DocFactory:
   http://docutils.sf.net/sandbox/gschwant/docfactory/doc/


What is the status of the Docutils project?
-------------------------------------------

Although useful and relatively stable, Docutils is experimental code,
with APIs and architecture subject to change.

Our highest priority is to fix bugs as they are reported.  So the
latest code from CVS (or `development snapshots`_) is almost always
the most stable (bug-free) as well as the most featureful.


What is the Docutils project release policy?
--------------------------------------------

It ought to be "release early & often", but official releases are a
significant effort and aren't done that often.  We have
automatically-generated `development snapshots`_ which always contain
the latest code from CVS.  As the project matures, we may formalize a
stable/development-branch scheme, but we're not using anything like
that yet.

If anyone would like to volunteer as a release coordinator, please
`contact the project coordinator`_.

.. _development snapshots:
   http://docutils.sf.net/#development-snapshots

.. _contact the project coordinator:
   mailto:goodger@python.org


reStructuredText
================

What is reStructuredText?
-------------------------

reStructuredText_ is an easy-to-read, what-you-see-is-what-you-get
plaintext markup syntax and parser system.  The reStructuredText
parser is a component of Docutils_.  reStructuredText is a revision
and reinterpretation of the StructuredText_ and Setext_ lightweight
markup systems.

If you are reading this on the web, you can see for yourself.  `The
source for this FAQ <FAQ.txt>`_ is written in reStructuredText; open
it in another window and compare them side by side.

`A ReStructuredText Primer <docs/rst/quickstart.html>`_ and the `Quick
reStructuredText <docs/rst/quickref.html>`_ user reference are good
places to start.  The `reStructuredText Markup Specification
<spec/rst/reStructuredText.html>`_ is a detailed technical
specification.

.. _reStructuredText: http://docutils.sourceforge.net/rst.html
.. _StructuredText:
   http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage/
.. _Setext: mirror/setext.html


Why is it called "reStructuredText"?
------------------------------------

The name came from a combination of "StructuredText", one of
reStructuredText's predecessors, with "re": "revised", "reworked", and
"reinterpreted", and as in the ``re.py`` regular expression module.
For a detailed history of reStructuredText and the Docutils project,
see `An Introduction to reStructuredText`_.


What's the standard abbreviation for "reStructuredText"?
--------------------------------------------------------

"RST" and "ReST" (or "reST") are both acceptable.  Care should be
taken with capitalization, to avoid confusion with "REST__", an
acronym for "Representational State Transfer".

The abbreviations "reSTX" and "rSTX"/"rstx" should **not** be used;
they overemphasize reStructuredText's predecessor, Zope's
StructuredText.

__ http://www.xml.com/pub/a/2002/02/06/rest.html


What's the standard filename extension for a reStructuredText file?
-------------------------------------------------------------------

It's ".txt".  Some people would like to use ".rest" or ".rst" or
".restx", but why bother?  ReStructuredText source files are meant to
be readable as plaintext, and most operating systems already associate
".txt" with text files.  Using a specialized filename extension would
require that users alter their OS settings, which is something that
many users will not be willing or able to do.


Are there any reStructuredText editor extensions?
-------------------------------------------------

There is `some code under development for Emacs`__.

Extensions for other editors are welcome.

__ http://docutils.sf.net/tools/editors/emacs/


How can I indicate the document title?  Subtitle?
-------------------------------------------------

A uniquely-adorned section title at the beginning of a document is
treated specially, as the document title.  Similarly, a
uniquely-adorned section title immediately after the document title
becomes the document subtitle.  For example::

    This is the Document Title
    ==========================

    This is the Document Subtitle
    -----------------------------

    Here's an ordinary paragraph.

Counterexample::

    Here's an ordinary paragraph.

    This is *not* a Document Title
    ==============================

    The "ordinary paragraph" above the section title
    prevents it from becoming the document title.


How can I represent esoteric characters (e.g. character entities) in a document?
--------------------------------------------------------------------------------

For example, say you want an em-dash (XML character entity &mdash;,
Unicode character ``\u2014``) in your document: use a real em-dash.
Insert concrete characters (e.g. type a *real* em-dash) into your
input file, using whatever encoding suits your application, and tell
Docutils the input encoding.  Docutils uses Unicode internally, so the
em-dash character is a real em-dash internally.

ReStructuredText has no character entity subsystem; it doesn't know
anything about XML charents.  "&mdash;" in input text is 7 discrete
characters to Docutils; no interpretation happens.  When writing HTML,
the "&" is converted to "&amp;", so in the output you'd see
"&amp;mdash;".  There's no difference in interpretation for text
inside or outside inline literals or literal blocks -- no character
entity interpretation in either case.

If you can't use a Unicode-compatible encoding and must rely on 7-bit
ASCII, there is a workaround, albeit an ugly one.  David Priest
developed a substitution table for character entities; see
<http://article.gmane.org/gmane.comp.python.documentation/432> and
David's other March Doc-SIG posts.  Incorporating this into Docutils
is on the `to-do list <spec/notes.html#to-do>`_.

If you insist on using XML-style charents, you'll have to implement a
pre-processing system to convert to UTF-8 or something.  That opens a
can of worms though; you can no longer *write* about charents
naturally; you'd have to write "&amp;mdash;".
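
A pre-processing step of this kind could be sketched as follows.  This
is not part of Docutils; the function name ``expand_charents`` is
hypothetical, and the sketch assumes Python 3 with the standard
``html.entities`` table::

    import re
    from html.entities import name2codepoint

    # Matches named entities such as "&mdash;"; numeric references
    # and anything that is not a known name are left untouched.
    _CHARENT = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

    def expand_charents(text):
        """Replace known named character entities with real characters."""
        def replace(match):
            codepoint = name2codepoint.get(match.group(1))
            if codepoint is None:
                return match.group(0)  # unknown name: keep as written
            return chr(codepoint)
        return _CHARENT.sub(replace, text)

Because the substitution is a single pass, ``&amp;mdash;`` comes out as
the literal text ``&mdash;``, which is how you would then have to write
about charents.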


How can I generate backticks using a Scandinavian keyboard?
-----------------------------------------------------------

The use of backticks in reStructuredText is a bit awkward with
Scandinavian keyboards, where the backtick is a "dead" key.  To get
one ` character one must press SHIFT-` + SPACE.

Unfortunately, with all the variations out there, there's no way to
please everyone.  For Scandinavian programmers and technical writers,
this is not limited to reStructuredText but affects many languages and
environments.

Possible solutions include:

* If you have to input a lot of backticks, simply type one in the
  normal/awkward way, select it, copy and then paste the rest (CTRL-V
  is a lot faster than SHIFT-` + SPACE).

* Use keyboard macros.

* Remap the keyboard.  The Scandinavian keyboard layout is awkward for
  other programming/technical characters too; for example, []{}
  etc. are a bit awkward compared to US keyboards.

If anyone knows of other/better solutions, please `let us know`_.


Are there any tools for HTML/XML-to-reStructuredText?  (Round-tripping)
-----------------------------------------------------------------------

People have tossed the idea around, but little if any actual work has
ever been done.  There's no reason why reStructuredText should not be
round-trippable to/from XML; any technicalities which prevent
round-tripping would be considered bugs.  Whitespace would not be
identical, but paragraphs shouldn't suffer.  The tricky parts would be
the smaller details, like links and IDs and other bookkeeping.

For HTML, true round-tripping may not be possible.  Even adding lots
of extra "class" attributes may not be enough.  A "simple HTML" to RST
filter is possible -- for some definition of "simple HTML" -- but HTML
is used as dumb formatting so much that such a filter may not be
particularly useful.  No general-purpose filter exists.  An 80/20
approach should work though: build a tool that does 80% of the work
automatically, leaving the other 20% for manual tweaks.
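
As a rough illustration of the 80/20 idea, here is a minimal sketch of
a "simple HTML" filter built on Python 3's standard ``html.parser``
module.  It is not an existing Docutils tool; the names
``SimpleHTMLToRST`` and ``html_to_rst`` are hypothetical, and the
supported subset (``h1``, ``h2``, ``p``, ``em``, ``strong``) is an
arbitrary choice.  Everything else is silently dropped and left for
manual tweaking::

    from html.parser import HTMLParser

    class SimpleHTMLToRST(HTMLParser):
        """Convert a tiny HTML subset to reStructuredText."""

        UNDERLINES = {"h1": "=", "h2": "-"}
        INLINE = {"em": "*", "strong": "**"}

        def __init__(self):
            super().__init__()
            self.blocks = []   # finished block-level elements
            self.current = []  # text fragments of the block in progress

        def handle_starttag(self, tag, attrs):
            if tag in self.INLINE:
                self.current.append(self.INLINE[tag])

        def handle_data(self, data):
            self.current.append(data)

        def handle_endtag(self, tag):
            if tag in self.INLINE:
                self.current.append(self.INLINE[tag])
            elif tag in self.UNDERLINES or tag == "p":
                # Collapse whitespace, as a browser would.
                text = " ".join("".join(self.current).split())
                self.current = []
                if tag in self.UNDERLINES:
                    text += "\n" + self.UNDERLINES[tag] * len(text)
                self.blocks.append(text)

    def html_to_rst(html):
        parser = SimpleHTMLToRST()
        parser.feed(html)
        parser.close()
        return "\n\n".join(parser.blocks) + "\n"

Links, lists, tables, and nested structure are exactly the "other 20%"
that such a filter leaves to the author.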


Are there any Wikis that use reStructuredText syntax?
-----------------------------------------------------

There are several, with various degrees of completeness.  With no
implied endorsement or recommendation, and in no particular order:

* `Ian Bicking's experimental code <sandbox/ianb/wiki/WikiPage.py>`__
* `MoinMoin <http://moin.sf.net>`__ has some support; `here's a sample
  <http://twistedmatrix.com/users/jh.twistd/moin/moin.cgi/RestSample>`__
* Zope-based `Zwiki <http://zwiki.org/>`__
* `StikiWiki <http://mithrandr.moria.org/code/stikiwiki/>`__

Please `let us know`_ of any other reStructuredText Wikis.

The example application for the `Web Framework Shootout
<http://colorstudy.com/docs/shootout.html>`__ article is a Wiki using
reStructuredText.


Are there any Weblog (Blog) projects that use reStructuredText syntax?
----------------------------------------------------------------------

With no implied endorsement or recommendation, and in no particular
order:

* `Python Desktop Server <http://pyds.muensterland.org/>`__
* `PyBlosxom <http://roughingit.subtlehints.net/pyblosxom>`__

Please `let us know`_ of any other reStructuredText Blogs.


HTML Writer
===========

What is the status of the HTML Writer?
--------------------------------------

The HTML Writer module, ``docutils/writers/html4css1.py``, is a
proof-of-concept reference implementation.  While it is a complete
implementation, some aspects of the HTML it produces may be
incompatible with older browsers or specialized applications (such as
web templating).  Alternate implementations are welcome.


What kind of HTML does it produce?
----------------------------------

It produces XHTML compatible with the `HTML 4.01`_ and `XHTML 1.0`_
specifications.  A cascading style sheet ("default.css" by default) is
required for proper viewing with a modern graphical browser.  Correct
rendering of the HTML produced depends on the CSS support of the
browser.

.. _HTML 4.01: http://www.w3.org/TR/html4/
.. _XHTML 1.0: http://www.w3.org/TR/xhtml1/


What browsers are supported?
----------------------------

No specific browser is targeted; all modern graphical browsers should
work.  Some older browsers, text-only browsers, and browsers without
full CSS support are known to produce inferior results.  Mozilla
(version 1.0 and up) and MS Internet Explorer (version 5.0 and up) are
known to give good results.  Reports of experiences with other
browsers are welcome.


Unexpected results from tools/html.py: H1, H1 instead of H1, H2.  Why?
----------------------------------------------------------------------

Here's the question in full:

    I have this text::

        Heading 1
        =========

        All my life, I wanted to be H1.

        Heading 1.1
        -----------

        But along came H1, and so shouldn't I be H2?
        No!  I'm H1!

        Heading 1.1.1
        *************

        Yeah, imagine me, I'm stuck at H3!  No?!?

    When I run it through tools/html.py, I get unexpected results
    (below).  I was expecting H1, H2, then H3; instead, I get H1, H1,
    H2::

        ...
        <html lang="en">
        <head>
        ...
        <title>Heading 1</title>
        <link rel="stylesheet" href="default.css" type="text/css" />
        </head>
        <body>
        <div class="document" id="heading-1">
        <h1 class="title">Heading 1</h1>                <-- first H1
        <p>All my life, I wanted to be H1.</p>
        <div class="section" id="heading-1-1">
        <h1><a name="heading-1-1">Heading 1.1</a></h1>        <-- H1
        <p>But along came H1, and so now I must be H2.</p>
        <div class="section" id="heading-1-1-1">
        <h2><a name="heading-1-1-1">Heading 1.1.1</a></h2>
        <p>Yeah, imagine me, I'm stuck at H3!</p>
        ...

    What gives?

Check the "class" attribute on the H1 tags, and you will see a
difference.  The first H1 is actually ``<h1 class="title">``; this is
the document title, and the default stylesheet renders it centered.
There can also be an ``<h2 class="subtitle">`` for the document
subtitle.

If there's only one highest-level section title at the beginning of a
document, it is treated specially, as the document title.  (Similarly,
a lone second-highest-level section title may become the document
subtitle.)  Rather than use a plain H1 for that, we use ``<h1
class="title">`` so that we can use H1 again within the document.  Why
do we do this?  HTML only has H1-H6, so by making H1 do double duty,
we effectively reserve these tags to provide 6 levels of heading
beyond the single document title.
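
The numbering scheme just described can be sketched in Python.  This is
only an illustration, not the HTML Writer's actual code; the function
name ``heading_tag`` and the clamping at H6 are assumptions::

    def heading_tag(depth):
        """Return the opening tag for a title at the given depth.

        Depth 0 is the document title; depth 1 is a top-level section.
        """
        if depth == 0:
            # The document title shares H1 but carries a class.
            return '<h1 class="title">'
        # Assumption: sections nested deeper than 6 are clamped to H6.
        return "<h%d>" % min(depth, 6)

With this mapping, "Heading 1.1" in the question (depth 1) gets a plain
``<h1>`` and "Heading 1.1.1" (depth 2) gets ``<h2>``, matching the
output shown.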

Here, HTML is used as dumb formatting, for nothing but final display.
A stylesheet *is required*, and one is provided:
tools/stylesheets/default.css.  Of course, you're welcome to roll your
own.

(Thanks to Mark McEahern for the question and much of the answer.)


Why do enumerated lists only use numbers (no letters or roman numerals)?
------------------------------------------------------------------------

The rendering of enumerators (the numbers or letters acting as list
markers) is completely governed by the stylesheet.  If you see only
numbers, then either the browser can't find the stylesheet (try using
the "--embed-stylesheet" option), or the browser can't understand it
(try a recent Mozilla or MSIE).


Python Source Reader
====================

Can I use Docutils for Python auto-documentation?
-------------------------------------------------

Docstring extraction is still under development.  Most of a source
code parsing module exists in docutils/readers/python/moduleparser.py.
I (David Goodger) haven't worked on it in a while, but I do plan to
finish it eventually.  Ian Bicking wrote an initial front end for my
moduleparser module, in sandbox/ianb/extractor/extractor.py.


Miscellaneous
=============

Is the Docutils document model based on any existing XML models?
----------------------------------------------------------------

Not directly, no.  It borrows bits from DocBook, HTML, and others.  I
(David Goodger) have designed several document models over the years,
and have my own biases.  The Docutils document model is designed for
simplicity and extensibility, and has been influenced by the needs of
the reStructuredText markup.
