
Feng Wang (Felix) hezila

@xrstf
xrstf / setup.md
Last active October 3, 2022 13:30
Nutch 2.3 + ElasticSearch 1.4 + HBase 0.94 Setup

Info

This guide sets up a non-clustered Nutch crawler that stores its data in HBase. We will not learn how to set up Hadoop et al., only the bare minimum needed to crawl and index websites on a single machine.

Terms

  • Nutch - the crawler (fetches and parses websites)
  • HBase - filesystem storage for Nutch (Hadoop component, basically)
@tonyseek
tonyseek / formula.py
Last active June 28, 2022 09:13
render latex formula with matplotlib and pillow.
try:
    from StringIO import StringIO as BytesIO
except ImportError:
    from io import BytesIO
import matplotlib.pyplot as plt
def render_latex(formula, fontsize=12, dpi=300, format_='svg'):
    """Renders LaTeX formula into image."""
@stephanetimmermans
stephanetimmermans / ubuntu-docker
Created November 10, 2014 12:21
Install Docker on Ubuntu
sudo apt-get update
sudo apt-get install docker.io
source /etc/bash_completion.d/docker.io
sudo docker run -i -t ubuntu /bin/bash
@debasishg
debasishg / gist:b4df1648d3f1776abdff
Last active June 20, 2025 13:59
another attempt to organize my ML readings ..
  1. Feature Learning
  1. Deep Learning
@eliasah
eliasah / knn.sh
Last active July 7, 2016 21:33
[elasticsearch] compute K-nearest neighbor for training a classifier purposes
##########################################################################################
# use case: training a classifier
#
# Many systems classify documents by assigning “tag” or “category” fields. Classifying
# documents can be a tedious manual process and so in this example we will train a classifier
# to automatically spot keywords in new documents that suggest a suitable category.
curl -XGET "http://localhost:9200/products_fr/_search" -d'
{
"query": {
@ricardo-rossi
ricardo-rossi / ElasticSearch.sh
Last active February 25, 2025 22:09
Installing ElasticSearch on Ubuntu 14.04
#!/bin/bash
### USAGE
###
### ./ElasticSearch.sh 1.7 will install Elasticsearch 1.7
### ./ElasticSearch.sh will fail because no version was specified (exit code 1)
###
### CLI options Contributed by @janpieper
### Check http://www.elasticsearch.org/download/ for latest version of ElasticSearch
@jyemin
jyemin / MongoDBWithJodaExample.java
Created July 16, 2014 16:57
An example showing how to integrate support for Joda DateTime into the MongoDB Java driver.
import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;
import org.bson.BSON;
import org.bson.Transformer;
import org.joda.time.DateTime;
import java.net.UnknownHostException;
import java.util.Date;
@typehorror
typehorror / Flask-SQLAlchemy Caching.md
Last active February 15, 2024 14:44
Flask SQLAlchemy Caching

Flask-SQLAlchemy Caching

The following gist is an extract of the article Flask-SQLAlchemy Caching. It provides simple, automated query caching and event-driven invalidation of cached relations, among other features.

Usage

retrieve one object

# pulling one User object

user = User.query.get(1)
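
The general idea of caching a lookup and invalidating it when the underlying row changes can be sketched in plain Python. This is a minimal illustration, not the gist's implementation and not the Flask-SQLAlchemy API: `IdentityCache`, `load`, and `FAKE_DB` are hypothetical names, and a dictionary stands in for the database.

```python
class IdentityCache:
    """Tiny in-memory cache keyed by (model name, primary key)."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable: (model, pk) -> object
        self._store = {}
        self.misses = 0        # counts how often the loader was called

    def get(self, model, pk):
        key = (model, pk)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._loader(model, pk)
        return self._store[key]

    def invalidate(self, model, pk):
        # Drop the cached entry so the next get() reloads it.
        self._store.pop((model, pk), None)


# Stand-in for a database table; purely illustrative.
FAKE_DB = {("User", 1): {"id": 1, "name": "alice"}}


def load(model, pk):
    # Return a copy so cached objects do not alias the "database" row.
    return dict(FAKE_DB[(model, pk)])


cache = IdentityCache(load)
first = cache.get("User", 1)    # miss: loaded from FAKE_DB
second = cache.get("User", 1)   # hit: same cached object

FAKE_DB[("User", 1)]["name"] = "bob"  # the row changes...
cache.invalidate("User", 1)           # ...so the entry is invalidated
third = cache.get("User", 1)          # miss: reloaded with the new value
```

In an ORM setting the `invalidate` call would typically be triggered by an update or delete event rather than by hand.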

@hest
hest / gist:8798884
Created February 4, 2014 06:08
Fast SQLAlchemy counting (avoid query.count() subquery)
from sqlalchemy import func

def get_count(q):
    # Build SELECT count(*) directly from the query's statement, dropping ORDER BY.
    count_q = q.statement.with_only_columns([func.count()]).order_by(None)
    count = q.session.execute(count_q).scalar()
    return count
q = session.query(TestModel).filter(...).order_by(...)
# Slow: SELECT COUNT(*) FROM (SELECT ... FROM TestModel WHERE ...) ...
print(q.count())
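
The two query shapes contrasted above can be compared with the standard-library `sqlite3` module. This is only a sketch of the SQL involved, not what SQLAlchemy emits verbatim; the `test_model` table and its `flag` column are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_model (id INTEGER PRIMARY KEY, flag INTEGER)")
conn.executemany(
    "INSERT INTO test_model (flag) VALUES (?)",
    [(i % 2,) for i in range(10)],
)

# Shape of query.count(): the full SELECT is wrapped in a subquery.
subquery_count = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT id, flag FROM test_model WHERE flag = 1 ORDER BY id)"
).fetchone()[0]

# Shape of get_count(): COUNT(*) applied directly, without ORDER BY.
direct_count = conn.execute(
    "SELECT COUNT(*) FROM test_model WHERE flag = 1"
).fetchone()[0]

print(subquery_count, direct_count)
```

Both statements return the same number; the difference lies only in how much work the database may have to do to produce it.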
@granoeste
granoeste / webkitmediasource-is-type-supported.html
Last active December 26, 2021 10:54
[JavaScript][HTML5][MediaSource] MediaSource.isTypeSupported
<!DOCTYPE html>
<html>
<head>
<script>
window.MediaSource = window.MediaSource || window.WebKitMediaSource;
function testTypes(types) {
  for (var i = 0; i < types.length; ++i)
    console.log("MediaSource.isTypeSupported(" + types[i] + ") : " + MediaSource.isTypeSupported(types[i]));
}