A collection of knowledge graph application examples, gathered from around the web and updated irregularly.
import pandas as pd
import nltk
from tqdm import tnrange
import re
import gdelt
# Version 2 queries
gd2 = gdelt.gdelt(version=2)
# days
!pip install gdelt  # make sure gdelt is installed
import pandas as pd, numpy as np, matplotlib.pyplot as plt, gdelt, os, datetime, warnings  # imports
gd = gdelt.gdelt(version=1)  # instantiate object to pull gdelt files
os.makedirs("data", exist_ok=True)  # create the data folder if it does not exist yet
cur_date = datetime.datetime(2019, 10, 7) - datetime.timedelta(days=60)  # start pulling from 60 days prior to 10/7
while cur_date < datetime.datetime(2019, 10, 7):  # pull until 10/7
    if not os.path.exists("data/%s-%s-%s.pkl" % (cur_date.year, cur_date.month, cur_date.day)):  #if don't have
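The preview above is cut off inside the loop, so the actual GDELT call is not visible. The following is only a minimal sketch of the same "pull one day at a time, skip days already cached on disk" pattern. The helper names `cache_path` and `pull_range`, and the `fetch_day` callable that stands in for the real GDELT query, are assumptions made for illustration and do not come from the gist.

```python
import datetime
import os
import pickle


def cache_path(day, data_dir="data"):
    """Build the per-day cache file name, e.g. data/2019-8-8.pkl."""
    return os.path.join(data_dir, "%s-%s-%s.pkl" % (day.year, day.month, day.day))


def pull_range(fetch_day, start, end, data_dir="data"):
    """Fetch every day in [start, end) that is not cached yet.

    fetch_day is a hypothetical callable that takes a datetime and returns
    a picklable object; it stands in for the real GDELT query.
    Returns the list of days that were actually fetched.
    """
    os.makedirs(data_dir, exist_ok=True)
    fetched = []
    cur_date = start
    while cur_date < end:
        path = cache_path(cur_date, data_dir)
        if not os.path.exists(path):  # only pull days we do not have yet
            with open(path, "wb") as f:
                pickle.dump(fetch_day(cur_date), f)
            fetched.append(cur_date)
        cur_date += datetime.timedelta(days=1)
    return fetched
```

Because each day lands in its own file, an interrupted run can simply be restarted and only the missing days are requested again.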
pip install --upgrade -e git+https://github.com/twintproject/twint.git@origin/master#egg=twint
pip install confluent-kafka[avro]
import json
import spacy
nlp = spacy.load("en_core_web_lg")
with open("scraped.json", "r") as f:
    news = json.load(f)
news = [i['body'] for i in news]
processed_docs = list(nlp.pipe(news))
verb_list = ["launch", "begin", "initiate", "start"]
dobj_list = ["attack", "offensive", "operation", "assault"]
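The preview stops right after the two word lists, so how the gist uses them is not shown. One plausible reading is that they are matched against the dependency parse to find phrases such as "launch an attack". The sketch below assumes exactly that; the function name `find_verb_dobj_pairs` is hypothetical. It relies only on the `lemma_`, `dep_`, and `head` attributes that spaCy tokens expose, and on the `dobj` label used by spaCy's English models.

```python
verb_list = ["launch", "begin", "initiate", "start"]
dobj_list = ["attack", "offensive", "operation", "assault"]


def find_verb_dobj_pairs(doc, verbs=verb_list, dobjs=dobj_list):
    """Return (verb lemma, object lemma) pairs found in one parsed document.

    doc is any iterable of tokens exposing lemma_, dep_ and head, which is
    the interface of spaCy's Token objects.
    """
    pairs = []
    for token in doc:
        # A direct object hangs off its governing verb via the "dobj" relation.
        if token.dep_ == "dobj" and token.lemma_ in dobjs:
            if token.head.lemma_ in verbs:
                pairs.append((token.head.lemma_, token.lemma_))
    return pairs
```

With the parsed documents from above, this would be called once per document, for example `[find_verb_dobj_pairs(doc) for doc in processed_docs]`.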
# Author: Linwood Creekmore
# Email: valinvescap@gmail.com
# Description: Python script to pull content from a website (works on news stories).
# Licensed under GNU GPLv3; see https://choosealicense.com/licenses/lgpl-3.0/ for details
# Notes
"""
23 Oct 2017: updated to include readability based on PyCon talk: https://github.com/DistrictDataLabs/PyCon2016/blob/master/notebooks/tutorial/Working%20with%20Text%20Corpora.ipynb
18 Jul 2018: added keywords and summary
- Sorting segments within a stacked bar chart: http://kb.tableau.com/articles/howto/sorting-segments-within-stacked-bars-by-value
- Pareto 2-part area chart: http://www.vizwiz.com/2016/08/tableau-tip-tuesday-how-to-create-two.html
- Doughnut charts: http://www.evolytics.com/blog/tableau-201-how-to-make-donut-charts/
- Hex maps: http://sirvizalot.blogspot.com/2015/11/hex-tile-maps-in-tableau.html
- Tile maps: http://www.bfongdata.com/2015/11/periodic-table-map.html
- Small multiple-tile maps: http://sirvizalot.blogspot.com/2016/05/how-to-small-multiple-tile-map-in.html
- Date Functions:
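A Byte of Python (Chinese edition): http://woodpecker.org.cn/abyteofpython_cn/chinese/
Worth reading straight through once at the start. It is the simplest and clearest Python tutorial, and the best fit for getting a quick overview.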