@injms
Created July 21, 2020 11:07
require 'nokogiri'
require 'open-uri'
homepage = Nokogiri::HTML(URI.open('https://www.gov.uk/'))
organisations = Nokogiri::HTML(URI.open('https://www.gov.uk/government/organisations/'))
# Selectors to find the number of agencies and other public bodies on the
# homepage and on the organisations page
agencies_on_homepage_selector = '[href="/government/organisations#agencies_and_other_public_bodies"] .home-numbers__large'
agencies_on_organisation_page_selector = '#agencies_and_other_public_bodies + .organisations__department-count-wrapper .js-department-count'
# Selectors to find the number of ministerial departments on the homepage and
# on the organisations page
ministerial_departments_on_homepage_selector = '[href="/government/organisations#ministerial_departments"] .home-numbers__large'
ministerial_departments_on_organisation_page_selector = '#ministerial_departments + .organisations__department-count-wrapper .js-department-count'
# Queue up errors to try and avoid 'fix error, push, find new error' loop.
error_queue = []
def check_for_errors(errors)
  return if errors.empty?

  error_title = errors.count == 1 ? 'Error' : 'Errors'
  # A bare `raise` with a message raises RuntimeError, which a plain `rescue`
  # can catch (unlike Exception).
  raise "#{error_title}:\n\n - " + errors.join("\n\n - ")
end
number_of_agencies_on_homepage = homepage.css(agencies_on_homepage_selector)
unless number_of_agencies_on_homepage.count == 1
  error_queue.push(
    "Incorrect CSS selector used to find number of agencies on homepage." +
    "\n " +
    "Selector used '#{agencies_on_homepage_selector}'"
  )
end
number_of_agencies_on_organisations_page = organisations
  .css(agencies_on_organisation_page_selector)
unless number_of_agencies_on_organisations_page.count == 1
  error_queue.push(
    "Incorrect CSS selector used to find number of agencies on organisations page." +
    "\n " +
    "Selector used '#{agencies_on_organisation_page_selector}'"
  )
end
number_of_ministerial_deptartments_on_homepage = homepage.css(ministerial_departments_on_homepage_selector)
unless number_of_ministerial_deptartments_on_homepage.count == 1
  error_queue.push(
    "Incorrect CSS selector used to find number of ministerial departments on homepage." +
    "\n " +
    "Selector used '#{ministerial_departments_on_homepage_selector}'"
  )
end
number_of_ministerial_deptartments_on_organisations_page = organisations.css(ministerial_departments_on_organisation_page_selector)
unless number_of_ministerial_deptartments_on_organisations_page.count == 1
  error_queue.push(
    "Incorrect CSS selector used to find number of ministerial departments on organisations page." +
    "\n " +
    "Selector used '#{ministerial_departments_on_organisation_page_selector}'"
  )
end
# Highlight any errors from incorrect CSS selectors being used.
check_for_errors(error_queue)
# Compare the displayed counts as text; each value is true when the homepage
# and the organisations page agree.
check_numbers = {
  "other_agencies_and_public_bodies" => number_of_agencies_on_homepage.first.content == number_of_agencies_on_organisations_page.first.content,
  "ministerial_departments" => number_of_ministerial_deptartments_on_homepage.first.content == number_of_ministerial_deptartments_on_organisations_page.first.content,
}
check_numbers.each do |key, match|
  unless match
    error_queue.push(
      "Number of '#{key.gsub('_', ' ').capitalize}' does not match on homepage and organisations page."
    )
  end
end
# Highlight any mismatch between the numbers on the homepage and organisation
# pages.
check_for_errors(error_queue)
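
The four selector checks above repeat the same "find exactly one node, otherwise queue a message" pattern. A minimal sketch of how that pattern could be pulled into one helper follows. The names `find_single` and `FakePage` are hypothetical and not part of the original script, and `FakePage` is only a stand-in for a parsed Nokogiri document so the sketch runs without network access.

```ruby
# Hypothetical helper: looks up a selector that should match exactly one node.
# `document` is anything that responds to #css (for example a parsed
# Nokogiri::HTML document). On a miss, a message is queued instead of raised,
# so every broken selector can be reported in a single run.
def find_single(document, selector, description, errors)
  matches = document.css(selector)
  return matches.first if matches.count == 1

  errors.push(
    "Incorrect CSS selector used to find #{description}.\n " \
    "Selector used '#{selector}'"
  )
  nil
end

# Stand-in for a parsed page (an assumption made for illustration only).
class FakePage
  def initialize(nodes_by_selector)
    @nodes_by_selector = nodes_by_selector
  end

  # Returns the nodes registered for the selector, or an empty list.
  def css(selector)
    @nodes_by_selector.fetch(selector, [])
  end
end

page = FakePage.new({ '.count' => ['23'] })
errors = []

found = find_single(page, '.count', 'number of ministerial departments', errors)
missing = find_single(page, '.typo', 'number of agencies', errors)
```

With a helper like this, each of the four lookups would shrink to one call, and `check_for_errors` could still be run once afterwards against the shared queue.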