Set up Sphinx sitemaps

Generate a sitemap for your documentation with the sphinx-sitemap extension.

Read the Docs generated sitemaps

Read the Docs (RTD) generates a basic sitemap that points only to the index page and relies on crawlers to discover the rest of the site. This is sufficient for some projects, but RTD does not generate sitemaps for subprojects.

This means any project under the Ubuntu documentation library project must generate its own sitemap.

Sitemap prerequisites

Ensure sphinx-sitemap has been added to your requirements.txt file.

Add sphinx_sitemap to extensions in your configuration file (docs/conf.py):

extensions = ['sphinx_sitemap']

Required sitemap configuration

sphinx-sitemap requires html_baseurl to be set for the project in your configuration file. For example, in docs/conf.py:

html_baseurl = 'https://canonical-starter-pack.readthedocs-hosted.com/'

Note

Sitemap configuration is included in the Starter pack’s default configuration file.

Optional sitemap configuration

sphinx-sitemap uses a configurable URL scheme to add language and version components to the URLs in your sitemap. The default configuration provided by the starter pack uses:

sitemap_url_scheme = "{link}"

You can add a version either manually or by reading it from the RTD build environment. To set the version manually, replace <version> with your version string:

sitemap_url_scheme = "<version>/{link}"

Or, if the version is set with the version key in your configuration file:

sitemap_url_scheme = "{version}{link}"

To read from the provided RTD environment variable:

import os

# RTD sets READTHEDOCS_VERSION during builds; use a manual value elsewhere.
if 'READTHEDOCS_VERSION' in os.environ:
    version = os.environ["READTHEDOCS_VERSION"]
    sitemap_url_scheme = '{version}{link}'
else:
    sitemap_url_scheme = 'MANUAL/{link}'
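
To make the effect of each scheme concrete, the following minimal sketch expands a scheme template into a full URL. It is an illustration only, not the extension's source code: the build_sitemap_url helper is hypothetical, and it assumes that a non-empty version is given a trailing slash before substitution, which is why "{version}{link}" needs no separator of its own.

```python
# Illustrative only: mimics how a sitemap_url_scheme-style template
# expands into the URLs listed in a sitemap.

def build_sitemap_url(base_url: str, scheme: str, link: str,
                      version: str = "", lang: str = "") -> str:
    """Expand the scheme template and join the result to the base URL."""
    # Assumption: non-empty version and lang values end with a slash.
    if version and not version.endswith("/"):
        version += "/"
    if lang and not lang.endswith("/"):
        lang += "/"
    path = scheme.format(version=version, lang=lang, link=link)
    return base_url.rstrip("/") + "/" + path.lstrip("/")


base = "https://canonical-starter-pack.readthedocs-hosted.com/"

# Default starter pack scheme: no version component.
print(build_sitemap_url(base, "{link}", "how-to/index.html"))

# Versioned scheme, with the version read from the build environment.
print(build_sitemap_url(base, "{version}{link}", "how-to/index.html",
                        version="latest"))
```

Comparing the two printed URLs shows where the version component lands relative to html_baseurl.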

Note

If you implement a sitemap on an RTD instance that is not a subproject, and it uses {link} for the sitemap_url_scheme, RTD replaces your sitemap with its own.

This is a known bug. The only current workaround is to use a different sitemap name and a custom robots.txt pointing to it.

Validating your sitemap

A sitemap will be available at different locations, depending on how it is generated.

Sitemaps generated by Read the Docs are available at the base domain of a project, while sitemaps generated by this extension are placed at the base of the URL scheme in use.
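
One basic check is that every URL in a generated sitemap sits under the expected base URL. The sketch below uses only the Python standard library; the urls_outside_base helper and the sample XML are hypothetical, and in practice you would load the sitemap.xml file from your build output instead.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_outside_base(sitemap_xml: str, base_url: str) -> list[str]:
    """Return every <loc> value that does not start with base_url."""
    root = ET.fromstring(sitemap_xml)
    locs = [loc.text.strip()
            for loc in root.findall(".//sm:loc", SITEMAP_NS)
            if loc.text]
    return [url for url in locs if not url.startswith(base_url)]


# Hypothetical sitemap content used only for this example.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://canonical-starter-pack.readthedocs-hosted.com/latest/index.html</loc></url>
  <url><loc>https://example.com/stray.html</loc></url>
</urlset>"""

strays = urls_outside_base(
    sample, "https://canonical-starter-pack.readthedocs-hosted.com/")
print(strays)
```

An empty list means every entry matches the configured base URL.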

For example, the sphinx-sitemap documentation is hosted on RTD and therefore has two sitemaps: one generated by RTD at the base domain, and one generated by the extension at the base of its URL scheme.

How to specify a sitemap

A robots.txt file tells crawlers which sitemap to use when indexing a website. You can use a custom robots.txt file by creating your own and adding it to html_extra_path in your configuration file, so that it is copied to the root of the built site. An example can be found in the Ubuntu documentation library project.
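
To confirm which sitemap a robots.txt file advertises, you can parse it with the Python standard library. This is a minimal sketch: the sitemaps_in_robots helper is hypothetical, the sample content is an assumption for illustration, and RobotFileParser.site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt: str) -> list[str]:
    """Return the URLs listed on Sitemap lines in robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Hypothetical robots.txt content used only for this example.
sample_robots = """User-agent: *
Disallow:
Sitemap: https://canonical-starter-pack.readthedocs-hosted.com/latest/sitemapindex.xml
"""

print(sitemaps_in_robots(sample_robots))
```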

Supporting multiple versions

sphinx-sitemap does not support multiple versions by default. Setting an appropriate version in each versioned build may be sufficient, because search engines and other automated tools crawl websites to index them. However, if you want comprehensive sitemaps for your documentation and all its versions, you need to deploy your own robots.txt file and sitemap index.

For example, for the starter pack with three versions (1.0, 2.0, 3.0) and the RTD URL scheme {version}{link}:

  1. Ensure each version of your documentation has a sitemap generated by this extension with the appropriate version.

  2. Create a robots.txt file, in the same directory as your configuration file, pointing to a custom sitemapindex.xml file:

    User-agent: *
    
    Disallow: # Allow everything
    
    Sitemap: https://canonical-starter-pack.readthedocs-hosted.com/latest/sitemapindex.xml
    
  3. Create a sitemapindex.xml file, in the same directory as your configuration file, which points to the sitemap files of each of your documentation sets:

    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <sitemap>
    <loc>https://canonical-starter-pack.readthedocs-hosted.com/latest/sitemap.xml</loc>
    <lastmod>2025-04-30</lastmod>
    </sitemap>
    <sitemap>
    <loc>https://canonical-starter-pack.readthedocs-hosted.com/3.0/sitemap.xml</loc>
    <lastmod>2025-04-30</lastmod>
    </sitemap>
    <sitemap>
    <loc>https://canonical-starter-pack.readthedocs-hosted.com/2.0/sitemap.xml</loc>
    <lastmod>2025-04-30</lastmod>
    </sitemap>
    <sitemap>
    <loc>https://canonical-starter-pack.readthedocs-hosted.com/1.0/sitemap.xml</loc>
    <lastmod>2025-04-30</lastmod>
    </sitemap>
    </sitemapindex>
    
  4. Add robots.txt and sitemapindex.xml to html_extra_path in your configuration file:

html_extra_path = ["sitemapindex.xml", "robots.txt"]

Note

You may want to automate the generation of the sitemapindex.xml file. To see how this is done for the Ubuntu documentation library project, which generates a sitemap containing subproject sitemaps, see the script here.

This will provide a sitemapindex.xml file which points to the sphinx-sitemap generated sitemap for each version.
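
As a rough idea of what such automation could look like, the sketch below builds a sitemap index for a list of versions. It is not the script used by the Ubuntu documentation library project: the build_sitemap_index helper and the hard-coded version list are assumptions for illustration. The output follows the sitemaps.org sitemap index format, which uses a sitemapindex root element with one sitemap entry per file. ET.indent requires Python 3.9 or later.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(base_url: str, versions: list[str],
                        lastmod: str) -> str:
    """Return sitemap index XML with one <sitemap> entry per version."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for version in versions:
        entry = ET.SubElement(root, "sitemap")
        loc = ET.SubElement(entry, "loc")
        loc.text = f"{base_url.rstrip('/')}/{version}/sitemap.xml"
        ET.SubElement(entry, "lastmod").text = lastmod
    ET.indent(root)  # pretty-print; Python 3.9+
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


# Hypothetical version list; adjust it to match your published versions.
index_xml = build_sitemap_index(
    "https://canonical-starter-pack.readthedocs-hosted.com/",
    ["latest", "3.0", "2.0", "1.0"],
    date.today().isoformat(),
)
print(index_xml)
```

Writing index_xml to sitemapindex.xml (encoded as UTF-8) before the Sphinx build lets html_extra_path pick it up like a hand-written file.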