Created
January 5, 2012 01:44
idclark/1563255
Scraping results pages from the 2011 Chicago Marathon.
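rm(list = ls())
library(XML)
library(ggplot2)
library(reshape)
# Result pages to scrape
page_numbers <- 1:1430
# URL template; %d is replaced by the page number
weburl <- "http://results.public.chicagomarathon.com/2011/index.php?page=%d&content=list&lang=EN&num_results=25&pid=list&search_sort_order=ASC&top_results=3&type=list"
# One URL per results page
pages <- sprintf(weburl, page_numbers)
# For each page, read every HTML table and keep the one with the most rows
# (assumed to be the results table); tables that fail to parse are dropped
tables <- lapply(pages, function(url) {
  page_tables <- Filter(Negate(is.null), readHTMLTable(url, stringsAsFactors = FALSE))
  n.rows <- unlist(lapply(page_tables, function(t) dim(t)[1]))
  page_tables[[which.max(n.rows)]]
})
# Stack the per-page tables into a single data frame
times <- do.call(rbind, tables)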
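The "pick the table with the most rows" step can be tried offline before pointing the script at the live site. The sketch below is only an illustration: the HTML string, the runner names, and the finish times are made up, and `url_template` is a shortened hypothetical URL, not the real results address. It assumes `htmlParse()` and `readHTMLTable()` from the XML package behave on an in-memory string the same way they do on a downloaded page.

```r
library(XML)

# Hypothetical URL template (not the real site); %d becomes the page number
url_template <- "http://example.com/results?page=%d&num_results=25"
urls <- sprintf(url_template, 1:3)  # sprintf is vectorised: one URL per page

# Made-up stand-in for one results page: a small layout table followed by
# the larger table of finishers
html <- '<html><body>
<table><tr><td>menu</td></tr></table>
<table>
<tr><th>Place</th><th>Name</th><th>Finish</th></tr>
<tr><td>1</td><td>Runner A</td><td>3:00:00</td></tr>
<tr><td>2</td><td>Runner B</td><td>3:01:00</td></tr>
</table>
</body></html>'

# Parse the string instead of fetching a URL
doc <- htmlParse(html, asText = TRUE)

# readHTMLTable returns a list with one entry per <table> in the document
page_tables <- readHTMLTable(doc, stringsAsFactors = FALSE)

# Count rows per table; NROW() gives 0 for a table that failed to parse
n.rows <- vapply(page_tables, NROW, integer(1))

# Keep the largest table, assumed here to be the results listing
results <- page_tables[[which.max(n.rows)]]
print(results)
```

Using `NROW()` rather than `dim(t)[1]` keeps the row counts aligned with the list even when an entry is `NULL`, so `which.max()` still indexes the right table.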
Hey, I am working on a data project for a Coursera class and wanted to base it on marathon data. I don't have any programming experience and was wondering if you could help me understand how to run your code to get the Chicago Marathon data.