@turizoft
Last active August 29, 2015 14:01
Download a CSV song collection using tinysong
require 'csv'
require 'i18n'
require 'nokogiri'
require 'open-uri'
require 'watir-webdriver'

# Reads a CSV file of song and artist names and downloads each song.
# The first column is the song name and the second is the artist name.
# Such a CSV can be generated with, for example, http://groovebackup.com/
# if your collection is hosted on Grooveshark.
# Songs are downloaded from http://www.mp3xd.com/
# Before running, ensure the i18n, watir-webdriver and nokogiri gems are installed.

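csv_path = 'songs.csv'

# Outside Rails, i18n may reject :en unless it is declared as available
I18n.available_locales = [:en]

# Read song list
puts 'reading song list'
csv_text = File.read(csv_path)
csv = CSV.parse(csv_text, headers: true)

# Start a browser
browser = Watir::Browser.new :chrome
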
csv.each do |row|
  puts "-- downloading #{row[0]} --"

  # Format query: strip accents and punctuation, then collapse repeated spaces
  query = I18n.transliterate("#{row[1]} #{row[0]}").downcase
  query = query.gsub(/[^0-9a-z ]/, '').squeeze(' ').strip

  # Generate url
  param = query.tr(' ', '-')
  url = "http://www.mp3xd.com/descargar-mp3/#{param}-1.html"

  # Download source from search page
  # URI.open (open-uri, Ruby 2.5+) replaces the bare open, which no longer accepts URLs on Ruby 3
  doc = Nokogiri::HTML(URI.open(url))

  # Scan document to search for the first download link
  link = doc.at_css('.song a.button-icon-descargar')
  if link
    dl_url = "http://www.mp3xd.com#{link['href']}"

    # Url needs to be opened in a real browser with JavaScript support
    browser.goto(dl_url)
    browser.iframe(id: 'dlframe').wait_until_present
    song_url = browser.iframe(id: 'dlframe').p(id: 'url').a.href

    # Download song
    song_name = I18n.transliterate(row[0]).squeeze(' ')
    artist_name = I18n.transliterate(row[1]).squeeze(' ')
    path = "#{artist_name} - #{song_name}.mp3" # Note: override this as needed
    File.open(path, 'wb') do |file|
      file << URI.open(song_url).read
    end
  else
    puts "#{row[0]} not found"
  end
end

browser.close
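
The query-formatting step in the script can be tried out on its own, without a browser or network access. The sketch below is a minimal illustration, not the script's own code: `search_slug` and `search_url` are hypothetical helper names, and accent stripping is done with Ruby's built-in Unicode normalization as a stand-in for `I18n.transliterate`, so results may differ for characters that `I18n.transliterate` maps specially. The URL pattern is the one used in the script above.

```ruby
# Hypothetical helpers (not part of the original script) that mirror its
# "format query" and "generate url" steps using only the standard library.

# Build the dash-separated search term from an artist and a song title.
def search_slug(artist, title)
  text = "#{artist} #{title}".unicode_normalize(:nfd) # split letters from their accents
  text = text.gsub(/\p{Mn}/, '')                      # drop the combining accent marks
  text = text.downcase.gsub(/[^0-9a-z ]/, '')         # keep ASCII letters, digits and spaces
  text.squeeze(' ').strip.tr(' ', '-')                # collapse spaces, then join with dashes
end

# Build the search page URL in the same shape the script uses.
def search_url(artist, title)
  "http://www.mp3xd.com/descargar-mp3/#{search_slug(artist, title)}-1.html"
end

puts search_url('Café Tacvba', 'Eres')
```

Removing punctuation before collapsing spaces matters here: a name such as `Simon & Garfunkel` would otherwise leave two adjacent spaces behind and produce a double dash in the URL.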