Created May 11, 2018 17:25
Scraper that selects the post titles of a Jekyll blog
# Required libraries
import requests
import bs4

# GET request for the page to analyze
fonte = requests.get('https://tvieiragoncalves.github.io/genesis/')

# Turn the response into a soup object and choose the parser to use
# ('lxml' requires the lxml package to be installed)
soup = bs4.BeautifulSoup(fonte.text, 'lxml')

# Select all the text under the same tag and print it
for paragrafo in soup.find_all('h3'):
    print(paragrafo.text)
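
For comparison, the same extraction step can be sketched without third-party packages, using the standard library's `html.parser` in place of BeautifulSoup. This is a minimal sketch, not the gist's method: the names `HeadingCollector` and `extract_titles` and the `sample` HTML string are hypothetical, and it assumes, as the scraper above does, that post titles sit in `<h3>` tags.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every element with a given tag (hypothetical helper)."""

    def __init__(self, tag="h3"):
        super().__init__()
        self.tag = tag
        self.depth = 0     # greater than 0 while inside a target tag
        self.buffer = []   # text fragments of the heading being read
        self.titles = []   # finished heading texts

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Heading closed: join its fragments into one title
                self.titles.append("".join(self.buffer).strip())
                self.buffer = []

    def handle_data(self, data):
        # Keep text only while inside a target tag (including nested links)
        if self.depth > 0:
            self.buffer.append(data)


def extract_titles(html, tag="h3"):
    """Return the text of each `tag` element found in an HTML string."""
    parser = HeadingCollector(tag)
    parser.feed(html)
    parser.close()
    return parser.titles


# Assumed sample markup, standing in for the downloaded page text
sample = '<h3><a href="/post-1">First post</a></h3><p>intro</p><h3>Second post</h3>'
print(extract_titles(sample))
```

To use it on the real page, the downloaded HTML (`fonte.text` in the scraper above) would be passed in place of `sample`. BeautifulSoup remains the more forgiving choice for malformed markup; this variant only removes the dependency.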