Can we automatically check that a set of pages has valid HTML markup? This can be useful as part of a CI process or for an automated audit.
The W3C provides an online validation tool. It turns out that it is also packaged as a Docker container, and there are Ruby bindings to communicate with this service. Here is an example.
- Run the W3C validator as a docker container
docker run -it --rm -p 8888:8888 validator/validator:latest
- Install the w3c_validators Ruby gem.

    # Gemfile
    source "https://rubygems.org"
    gem 'w3c_validators'
- Run the audit. In my case, the list of URLs I wanted to check came from a YAML file. I do not use the results.errors array here, but it contains the error messages and their locations.

    # audit.rb
    require 'w3c_validators'
    require 'yaml'
    include W3CValidators

    # Point the client at the local validator container started in the first step.
    validator = NuValidator.new(:validator_uri => 'http://localhost:8888/')
    url_db = YAML.load_file('data/urls.yml')

    # Collect every URL for which the validator reports at least one error.
    invalid = []
    url_db['urls'].each do |se|
      uri = se['url']
      results = validator.validate_uri(uri)
      if results.errors.length > 0
        invalid << uri
      end
    end
    puts invalid
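If you also want the error messages rather than just the list of failing URLs, something along these lines could work. This is only a sketch: `collect_errors` is a hypothetical helper name, and it assumes nothing more than a validator object that responds to `validate_uri` and returns a result exposing `errors`, as in the audit script. Each error is converted with `to_s` so that the helper does not depend on the exact shape of the message objects.

```ruby
# Hypothetical helper (not part of the w3c_validators gem).
# Returns a hash mapping each invalid URL to its error messages as strings.
# `validator` is any object responding to validate_uri(url) whose result
# exposes #errors, such as the NuValidator instance in audit.rb.
def collect_errors(validator, urls)
  urls.each_with_object({}) do |url, report|
    errors = validator.validate_uri(url).errors
    # Only keep URLs that actually have errors.
    report[url] = errors.map(&:to_s) unless errors.empty?
  end
end

# Possible usage with the objects from audit.rb (left commented out because
# it needs the gem and a running validator container):
#
#   urls = url_db['urls'].map { |se| se['url'] }
#   collect_errors(validator, urls).each do |url, messages|
#     puts url
#     messages.each { |m| puts "  #{m}" }
#   end
```

Passing the validator in as an argument keeps the helper easy to exercise with a stand-in object, without a running container.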
Running this by hand is not especially convenient, but all of it can be automated as part of a test suite in a CI pipeline.
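One way to plug the audit into CI is to turn its result into a process exit status, since a CI step typically fails when a command exits with a non-zero code. The sketch below is an illustration under assumptions: `run_audit` is a hypothetical name, and the validator is again any object that responds to `validate_uri` and returns a result exposing `errors`.

```ruby
# Hypothetical CI entry point (the name run_audit is an assumption).
# Prints each invalid URL and returns 0 when every page is valid, 1 otherwise.
def run_audit(validator, urls, out: $stdout)
  # Keep only the URLs for which the validator reports at least one error.
  invalid = urls.reject { |url| validator.validate_uri(url).errors.empty? }
  invalid.each { |url| out.puts "INVALID #{url}" }
  invalid.empty? ? 0 : 1
end

# Possible usage at the end of audit.rb (commented out because it needs the
# gem and a running validator container):
#
#   urls = url_db['urls'].map { |se| se['url'] }
#   exit run_audit(validator, urls)
```

Returning the status instead of calling `exit` inside the method leaves the decision to the caller, which also makes the method straightforward to cover in a test suite.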