Roberto Maestre rmaestre

@yohokuno
yohokuno / residual_rnn_cell.py
Created January 18, 2017 10:14
Residual RNN cell for TensorFlow 0.10
import tensorflow as tf
from tensorflow.python.ops.rnn_cell import RNNCell
from tensorflow.python.ops import variable_scope as vs
from tensorflow.python.util import nest
class ResidualRNNCell(RNNCell):
    """RNN cell composed sequentially of multiple simple cells with residual connection."""

    def __init__(self, cells, state_is_tuple=False):
@hnykda
hnykda / keras.py
Last active June 15, 2023 04:11
Tada's usage (see discussion)
""" From: http://danielhnyk.cz/predicting-sequences-vectors-keras-using-rnn-lstm/ """
from keras.models import Sequential
from keras.layers.core import TimeDistributedDense, Activation, Dropout
from keras.layers.recurrent import GRU
import numpy as np
def _load_data(data, steps=40):
    docX, docY = [], []
    # Integer division so that range() receives an int under Python 3 as well
    for i in range(0, data.shape[0] // steps - 1):
        docX.append(data[i * steps:(i + 1) * steps, :])
NREP = 30000;
N = 1000;
K = 5;
p_random = nan(NREP, 1);
p_cluster = nan(NREP, 1);
parfor rep = 1:NREP
    data = randn(N, 1);
    random_group = mod(randperm(N), K);
@freeman-lab
freeman-lab / bisecting.scala
Last active December 29, 2015 07:45
Bisecting k-means for hierarchical clustering in Spark
/**
 * bisecting <master> <input> <nNodes> <subIterations>
 *
 * divisive hierarchical clustering using bisecting k-means
 * assumes input is a text file, each row is a data point
 * given as numbers separated by spaces
 *
 */
import org.apache.spark.SparkContext
@bigaidream
bigaidream / spark_ide.py
Last active January 14, 2018 08:30
To enable IDE (PyCharm) syntax support for Apache Spark, adapted from http://www.abisen.com/spark-from-ipython-notebook.html
#!/public/spark-0.9.1/bin/pyspark
import os
import sys
# Set the path for spark installation
# this is the path where you have built spark using sbt/sbt assembly
os.environ['SPARK_HOME'] = "/public/spark-0.9.1"
# os.environ['SPARK_HOME'] = "/home/jie/d2/spark-0.9.1"
# Append to PYTHONPATH so that pyspark could be found
@andreas-h
andreas-h / r_stl.py
Created December 5, 2013 16:30
Python-wrapper for R's STL
# -*- coding: utf-8 -*-
import datetime
from numpy import asarray, ceil
import pandas
import rpy2.robjects as robjects
def stl(data, ns, np=None, nt=None, nl=None, isdeg=0, itdeg=1, ildeg=1,
        nsjump=None, ntjump=None, nljump=None, ni=2, no=0, fulloutput=False):
@jeongho
jeongho / hadoop-benchmark
Last active July 18, 2016 06:47
Hadoop benchmark
http://answers.oreilly.com/topic/460-how-to-benchmark-a-hadoop-cluster/
http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/
## MR pi
https://gist.github.com/jeongho/371aaed47ab462d79851
## Terasort
https://gist.github.com/jeongho/3b8c028f5e8409c3a10a
## TestDFSIO
@wpm
wpm / spark_parallel_boost.py
Last active December 3, 2018 02:56
A simple example of how to integrate the Spark parallel computing framework and the scikit-learn machine learning toolkit. This script randomly generates test and train data sets, trains an ensemble of decision trees using boosting, and applies the ensemble to the test set. The ensemble training is done in parallel.
from pyspark import SparkContext
import numpy as np
from sklearn.cross_validation import train_test_split, Bootstrap
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier
def run(sc):
@mrdwab
mrdwab / stratified.R
Last active November 20, 2025 09:23
Stratified random sampling from a `data.frame` in R
stratified <- function(df, group, size, select = NULL,
                       replace = FALSE, bothSets = FALSE) {
  if (is.null(select)) {
    df <- df
  } else {
    if (is.null(names(select))) stop("'select' must be a named list")
    if (!all(names(select) %in% names(df)))
      stop("Please verify your 'select' argument")
    temp <- sapply(names(select),
                   function(x) df[[x]] %in% select[[x]])
@fperez
fperez / 00-Setup-IPython-PySpark.ipynb
Last active December 21, 2015 23:48
HowTo for starting an IPython Notebook server