@denissellu
Last active January 21, 2018 22:55
# Requires Ruby 2.5+ (for URI.open)
# gem install nokogiri
require "nokogiri"
require "open-uri"
require "csv"
require "date"

CSV.open("top-posts.csv", "wb") do |csv|
  csv << ["Article Name", "Article Link", "Article Claps", "Trending Date"]
  # Date range to scrape the articles for
  Date.new(2017, 1, 1).upto(Date.new(2018, 1, 21)) do |date|
    year = date.year
    month = date.strftime("%B").downcase
    day = date.strftime("%d")
    trending_date = "#{day}-#{month}-#{year}"
    url = "https://medium.com/browse/top/#{month}-#{day}-#{year}"
    # Kernel#open no longer fetches URLs on Ruby 3.0+, so call URI.open explicitly
    doc = Nokogiri::HTML(URI.open(url))
    puts "Scraping #{day} #{month} #{year}"
    puts url
    doc.css('div.postArticle').each do |article|
      # In case there is no title
      title_node = article.css('.postArticle-content a > h3').first
      article_name = title_node ? title_node.content : ""
      # Guard against a missing link or clap counter so one odd post does not abort the run
      link_node = article.css('.postArticle-content a').first
      article_link = link_node ? link_node['href'] : ""
      claps_node = article.css('.js-multirecommendCountButton').first
      article_claps = claps_node ? claps_node.content : ""
      csv << [article_name, article_link, article_claps, trending_date]
    end
  end
end
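
The URL pattern used by the script can be checked without touching the network. The sketch below pulls the date formatting into a small helper; `top_posts_url` is a hypothetical name that does not appear in the original script, and the pattern is simply the one the script already uses, so it is only as valid as that pattern was when the gist was written.

```ruby
require "date"

# Hypothetical helper (not part of the original script): builds the archive
# URL for a single day using the same pattern as the scraper.
def top_posts_url(date)
  month = date.strftime("%B").downcase # full month name, lowercased
  day = date.strftime("%d")            # zero-padded day of month
  "https://medium.com/browse/top/#{month}-#{day}-#{date.year}"
end

puts top_posts_url(Date.new(2017, 1, 5))
```

Printing a few URLs this way before a full run may help confirm that the month name and zero-padded day come out as expected.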