Generate Sitemap

Generate an XML sitemap for a GitHub Pages site using GitHub Actions

v1.8.0

[1.8.0] - 2021-06-28

Added

  • Added an option to omit .html from the sitemap URLs of html files. GitHub Pages automatically serves the corresponding html file when a user browses to a URL with no file extension, so this option lets your sitemap match that behavior if you prefer extension-less URLs. A new action input, drop-html-extension, controls this behavior.
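
As a rough illustration of the effect of drop-html-extension on a single URL, here is a minimal Python sketch. The function name sitemap_url and its parameters are hypothetical and are not taken from the action's source. The sketch does not handle index.html specially.

```python
def sitemap_url(base_url: str, path: str, drop_html_extension: bool = False) -> str:
    """Build a sitemap URL for a site-relative path such as 'docs/page.html'.

    Hypothetical helper for illustration only.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    # Only a trailing ".html" is removed; other extensions are left alone.
    if drop_html_extension and url.endswith(".html"):
        url = url[: -len(".html")]
    return url
```

With the option off, docs/page.html keeps its extension. With it on, the URL ends in docs/page, which GitHub Pages resolves to the same file.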

Changed

  • Use the major release tag when pulling the base Docker image, so that non-breaking changes to the base image (e.g., bug fixes) are picked up automatically without updating the Dockerfile.

v1.7.2

[1.7.2] - 2021-05-13

Changed

  • Switched tag used to pull base Docker image from latest to the specific release that is the current latest, to enable testing against base image updates prior to releases. This is a purely non-functional change.

Fixed

  • Missing lastmod dates for website files created by the workflow but not yet committed. These are now set to the current date and time.
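
The fallback described above can be sketched as follows. This is a minimal sketch, assuming the last commit date arrives as a string that is empty when the file has no commits yet. The function name lastmod_for and the now parameter are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def lastmod_for(commit_date: Optional[str], now: Optional[datetime] = None) -> str:
    """Return the commit date if there is one, else the current time.

    Hypothetical helper: `commit_date` stands in for whatever the git
    history reports for a file; it is empty or None for uncommitted files.
    """
    if commit_date and commit_date.strip():
        return commit_date.strip()
    # Assumption: fall back to the current UTC time in ISO 8601 form.
    current = now or datetime.now(timezone.utc)
    return current.replace(microsecond=0).isoformat()
```

The now parameter exists only so the fallback can be tested deterministically.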

v1.7.1

[1.7.1] - 2021-05-06

Changed

  • Refactored to improve code maintainability.

CI/CD

  • Introduced major version tag.

v1.7.0

[1.7.0] - 2021-04-26

Added

  • New action input, additional-extensions, that enables adding other indexable file types to the sitemap.
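
One way to picture how additional-extensions widens the set of included files is the following Python sketch. The function name and the include_html and include_pdf parameters are hypothetical stand-ins for the action's file-type inputs. The actual implementation may differ.

```python
import os
from typing import Iterable


def is_sitemap_candidate(
    path: str,
    include_html: bool = True,
    include_pdf: bool = True,
    additional_extensions: Iterable[str] = (),
) -> bool:
    """Decide whether a file belongs in the sitemap based on its extension.

    Hypothetical helper for illustration only.
    """
    allowed = set()
    if include_html:
        allowed.add("html")
    if include_pdf:
        allowed.add("pdf")
    # Normalize user-supplied extensions: ".DOCX" and "docx" are treated alike.
    allowed.update(ext.lstrip(".").lower() for ext in additional_extensions)
    extension = os.path.splitext(path)[1].lstrip(".").lower()
    return extension in allowed
```

By default only html and pdf files pass; passing ["docx"] as additional_extensions would also admit files such as notes/report.docx.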

CI/CD

  • Enabled CodeQL code scanning on all push/pull-request events.

v1.6.2

[1.6.2] - 2021-03-10

Changed

  • Improved the documentation (otherwise, this release is functionally equivalent to the previous release).

v1.6.1

[1.6.1] - 2020-09-24

Fixed

  • Bug in generating URLs for files whose names end in "index.html" but are not exactly "index.html", such as "aindex.html". The previous version incorrectly truncated such a name to just "a", dropping the "index.html". This version correctly identifies "index.html" files.
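
The distinction behind this fix can be illustrated with a short Python sketch. The function name strip_index_html is hypothetical; this shows one correct way to recognize index.html files, not necessarily the check the action uses.

```python
def strip_index_html(path: str) -> str:
    """Drop a trailing 'index.html' only when it is the whole file name.

    A naive `path.endswith("index.html")` check would also match
    'aindex.html' and truncate it to 'a', which is the bug described above.
    """
    if path == "index.html" or path.endswith("/index.html"):
        return path[: -len("index.html")]
    return path
```

Requiring either an exact match or a preceding "/" ensures that only a file actually named index.html is shortened.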

v1.6.0

[1.6.0] - 2020-09-21

Added

  • Support for robots.txt: In addition to the previous functionality of excluding html URLs that contain <meta name="robots" content="noindex"> directives, the generate-sitemap GitHub action now parses a robots.txt file, if present at the root of the website, and excludes from the sitemap any URLs that match Disallow: rules for User-agent: *.
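
The robots.txt handling described above can be approximated with the sketch below. It is a deliberately simplified illustration, assuming one User-agent line per group and plain prefix matching with no wildcard support. The function names are hypothetical and the action's own parser may behave differently.

```python
from typing import List


def disallowed_prefixes(robots_txt: str) -> List[str]:
    """Collect Disallow paths that apply to 'User-agent: *'.

    Simplified sketch: each User-agent line starts a new group, and an
    empty Disallow value (which blocks nothing) is ignored.
    """
    prefixes = []
    applies = False
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            applies = value == "*"
        elif field == "disallow" and applies and value:
            prefixes.append(value)
    return prefixes


def is_blocked(url_path: str, prefixes: List[str]) -> bool:
    """Return True if the site-relative path starts with a disallowed prefix."""
    return any(url_path.startswith(prefix) for prefix in prefixes)
```

A page would then be left out of the sitemap when is_blocked returns True for its path.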

v1.5.0

generate-sitemap, v1.5.0

This action generates a sitemap for a website hosted on GitHub Pages. It supports both xml and txt sitemaps. When generating an xml sitemap, it uses the last commit date of each file to generate the <lastmod> tag in the sitemap entry. It can include html as well as pdf files in the sitemap, and has inputs to control the included file types (defaults include both html and pdf files in the sitemap). It skips html files that contain <meta name="robots" content="noindex">. It otherwise does not currently attempt to respect a robots.txt file. The sitemap entries are sorted in a consistent order (primary sort is by depth of page in site, and URLs at same depth are then sorted alphabetically).
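
To make the noindex check concrete, here is a minimal Python sketch that uses the standard library's html.parser. The class and function names are hypothetical, and this is only one plausible way to detect the directive, not a description of the action's code.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Record whether a <meta name="robots"> tag carries 'noindex'."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None for valueless attributes.
        attributes = {name: (value or "") for name, value in attrs}
        if (
            attributes.get("name", "").lower() == "robots"
            and "noindex" in attributes.get("content", "").lower()
        ):
            self.noindex = True


def has_noindex(html_text: str) -> bool:
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.noindex
```

Files for which has_noindex returns True would be skipped when building the sitemap.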

[1.5.0] - 2020-09-14

Changed

  • Minor refactoring of the Python code, and optimized action load time by using a prebuilt base Docker image that includes exactly what is needed (git and Python).

v1.4.0

generate-sitemap, v1.4.0

This action generates a sitemap for a website hosted on GitHub Pages. It supports both xml and txt sitemaps. When generating an xml sitemap, it uses the last commit date of each file to generate the <lastmod> tag in the sitemap entry. It can include html as well as pdf files in the sitemap, and has inputs to control the included file types (defaults include both html and pdf files in the sitemap). It skips over html files that contain <meta name="robots" content="noindex">. It otherwise does not currently attempt to respect a robots.txt file. The sitemap entries are sorted in a consistent order (primary sort is by depth of page in site, and URLs at same depth are then sorted alphabetically).

[1.4.0] - 2020-09-11

Changed

  • Completely re-implemented in Python to enable more easily adding planned future functionality.

v1.3.0

generate-sitemap, v1.3.0

This action generates a sitemap for a website hosted on GitHub Pages. It supports both xml and txt sitemaps. When generating an xml sitemap, it uses the last commit date of each file to generate the <lastmod> tag in the sitemap entry. It can include html as well as pdf files in the sitemap, and has inputs to control the included file types (defaults include both html and pdf files in the sitemap). It skips over html files that contain <meta name="robots" content="noindex">. It otherwise does not currently attempt to respect a robots.txt file. The sitemap entries are sorted in a consistent order (primary sort is by depth of page in site, and URLs at same depth are then sorted alphabetically).

[1.3.0] - 2020-09-09

Changed

  • URL sort order updated (primary sort is by depth of page in site, and URLs at the same depth are then sorted alphabetically).
  • URL sorting and URL filtering (skipping html files with meta robots noindex directives) are now implemented in Python.
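
The sort order described above can be sketched in a few lines of Python. This sketch assumes that depth can be approximated by counting "/" characters in the URL. The function name sort_urls is hypothetical.

```python
from typing import Iterable, List


def sort_urls(urls: Iterable[str]) -> List[str]:
    """Sort by depth first, then alphabetically within the same depth.

    Assumption: the number of '/' characters is used as the depth measure.
    """
    return sorted(urls, key=lambda url: (url.count("/"), url))
```

Sorting on a (depth, url) tuple keeps shallow pages ahead of deeper ones and gives a stable, repeatable order from run to run.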