rmaestre’s gists

yohokuno / residual_rnn_cell.py

Created January 18, 2017 10:14

Residual RNN cell for TensorFlow 0.10

	import tensorflow as tf
	from tensorflow.python.ops.rnn_cell import RNNCell
	from tensorflow.python.ops import variable_scope as vs
	from tensorflow.python.util import nest


	class ResidualRNNCell(RNNCell):
	    """RNN cell composed sequentially of multiple simple cells with residual connection."""

	    def __init__(self, cells, state_is_tuple=False):

hnykda / keras.py

Last active June 15, 2023 04:11

Tada's usage (see discussion)

	""" From: http://danielhnyk.cz/predicting-sequences-vectors-keras-using-rnn-lstm/ """
	from keras.models import Sequential
	from keras.layers.core import TimeDistributedDense, Activation, Dropout
	from keras.layers.recurrent import GRU
	import numpy as np

	def _load_data(data, steps = 40):
	    docX, docY = [], []
	    # Integer division keeps the range() bound an int on Python 2 and 3
	    for i in range(0, data.shape[0] // steps - 1):
	        docX.append(data[i*steps:(i+1)*steps, :])
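
The preview above stops after the first `append`, so how the targets are built is not visible here. As a rough sketch only, the windowing idea can be shown on plain Python lists. The name `make_windows` and the choice to pair each window with a copy shifted forward by one row are assumptions for illustration, not taken from the gist.

```python
def make_windows(rows, steps=40):
    """Split rows into consecutive, non-overlapping windows of length `steps`.

    Each input window rows[i*steps:(i+1)*steps] is paired with a target
    window shifted forward by one row. The shifted target is an ASSUMPTION:
    the preview is truncated before any target is built.
    """
    inputs, targets = [], []
    # The "- 1" mirrors the loop bound in the preview and guarantees that
    # the shifted target window never runs past the end of `rows`.
    for i in range(len(rows) // steps - 1):
        start, stop = i * steps, (i + 1) * steps
        inputs.append(rows[start:stop])
        targets.append(rows[start + 1:stop + 1])
    return inputs, targets


if __name__ == "__main__":
    demo_rows = [[float(t)] for t in range(10)]
    x, y = make_windows(demo_rows, steps=3)
    print(len(x), x[0], y[0])
```

With 10 rows and `steps=3` this yields two windows; the last partial window is dropped, as the loop bound in the preview implies.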

mrkrause / cluster_anova_sim.m

Created September 23, 2014 21:16

	NREP = 30000;
	N = 1000;
	K = 5;

	p_random = nan(NREP, 1);
	p_cluster = nan(NREP, 1);

	parfor rep=1:NREP
	    data = randn(N, 1);
	    random_group = mod(randperm(N), K);

freeman-lab / bisecting.scala

Last active December 29, 2015 07:45

Bisecting k-means for hierarchical clustering in Spark

	/**
	* bisecting <master> <input> <nNodes> <subIterations>
	*
	* divisive hierarchical clustering using bisecting k-means
	* assumes input is a text file, each row is a data point
	* given as numbers separated by spaces
	*
	*/

	import org.apache.spark.SparkContext
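
The Scala source is cut off after its first import, so the following is only a single-machine Python sketch of the general bisecting k-means idea described in the header comment: repeatedly split one cluster in two with 2-means. It does not use Spark. The names `parse_points`, `two_means`, `bisecting_kmeans`, `n_clusters` and `seed` are hypothetical, and always splitting the largest cluster is an assumed selection rule, not necessarily the one the gist uses.

```python
import random


def parse_points(text):
    """Parse one data point per row, given as numbers separated by spaces."""
    return [[float(v) for v in line.split()]
            for line in text.splitlines() if line.strip()]


def _dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def two_means(points, iterations, rng):
    """Split points into two groups with plain k-means (k = 2)."""
    centers = rng.sample(points, 2)
    groups = [list(points), []]
    for _ in range(iterations):
        new_groups = [[], []]
        for p in points:
            nearer = 0 if _dist2(p, centers[0]) <= _dist2(p, centers[1]) else 1
            new_groups[nearer].append(p)
        if not new_groups[0] or not new_groups[1]:
            break  # degenerate split; keep the previous assignment
        groups = new_groups
        centers = [_mean(g) for g in groups]
    return groups


def bisecting_kmeans(points, n_clusters, sub_iterations=10, seed=0):
    """Divisive clustering: keep bisecting until n_clusters groups exist."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < n_clusters:
        clusters.sort(key=len)
        target = clusters.pop()  # ASSUMPTION: split the largest cluster
        if len(target) < 2:
            clusters.append(target)
            break
        left, right = two_means(target, sub_iterations, rng)
        if not left or not right:
            clusters.append(target)  # could not split any further
            break
        clusters.extend([left, right])
    return clusters


if __name__ == "__main__":
    data = parse_points("0 0\n0 1\n1 0\n10 10\n10 11\n11 10\n")
    for cluster in bisecting_kmeans(data, n_clusters=2):
        print(sorted(cluster))
```

A distributed version would replace the per-point loop in `two_means` with a map over partitions; that part is left out here because the gist's Spark code is not visible in the preview.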

bigaidream / spark_ide.py

Last active January 14, 2018 08:30

To enable IDE (PyCharm) syntax support for Apache Spark, adapted from http://www.abisen.com/spark-from-ipython-notebook.html

	#!/public/spark-0.9.1/bin/pyspark

	import os
	import sys

	# Set the path for spark installation
	# this is the path where you have built spark using sbt/sbt assembly
	os.environ['SPARK_HOME'] = "/public/spark-0.9.1"
	# os.environ['SPARK_HOME'] = "/home/jie/d2/spark-0.9.1"
	# Append to PYTHONPATH so that pyspark could be found

andreas-h / r_stl.py

Created December 5, 2013 16:30

Python-wrapper for R's STL

	# -*- coding: utf-8 -*-

	import datetime

	from numpy import asarray, ceil
	import pandas
	import rpy2.robjects as robjects

	def stl(data, ns, np=None, nt=None, nl=None, isdeg=0, itdeg=1, ildeg=1,
	        nsjump=None, ntjump=None, nljump=None, ni=2, no=0, fulloutput=False):

jeongho / hadoop-benchmark

Last active July 18, 2016 06:47

Hadoop benchmark

	http://answers.oreilly.com/topic/460-how-to-benchmark-a-hadoop-cluster/
	http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/

	## MR pi
	https://gist.github.com/jeongho/371aaed47ab462d79851

	## Terasort
	https://gist.github.com/jeongho/3b8c028f5e8409c3a10a

	## TestDFSIO

wpm / spark_parallel_boost.py

Last active December 3, 2018 02:56

A simple example of how to integrate the Spark parallel computing framework and the scikit-learn machine learning toolkit. This script randomly generates test and train data sets, trains an ensemble of decision trees using boosting, and applies the ensemble to the test set. The ensemble training is done in parallel.

	from pyspark import SparkContext

	import numpy as np

	from sklearn.cross_validation import train_test_split, Bootstrap
	from sklearn.datasets import make_classification
	from sklearn.metrics import accuracy_score
	from sklearn.tree import DecisionTreeClassifier

	def run(sc):

mrdwab / stratified.R

Last active November 20, 2025 09:23

Stratified random sampling from a `data.frame` in R

	stratified <- function(df, group, size, select = NULL,
	                       replace = FALSE, bothSets = FALSE) {
	  if (is.null(select)) {
	    df <- df
	  } else {
	    if (is.null(names(select))) stop("'select' must be a named list")
	    if (!all(names(select) %in% names(df)))
	      stop("Please verify your 'select' argument")
	    temp <- sapply(names(select),
	                   function(x) df[[x]] %in% select[[x]])
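
The R preview ends inside the `select` handling, before any sampling happens. For readers who want the core idea, here is a minimal Python sketch of stratified random sampling over a list of dicts standing in for a `data.frame`. The function name `stratified_sample`, the `seed` parameter, and the rule that a `size` below 1 means a per-stratum fraction are assumptions for illustration; the sketch does not reproduce the `select`, `replace` or `bothSets` arguments.

```python
import random
from collections import defaultdict


def stratified_sample(rows, group, size, seed=None):
    """Draw a random sample, without replacement, from each stratum.

    rows  : list of dicts (a stand-in for a data.frame)
    group : key whose value defines the stratum
    size  : ASSUMPTION: if < 1, the fraction to draw from each stratum
            (rounded); otherwise the number of rows per stratum
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[row[group]].append(row)

    sample = []
    for key in sorted(strata):  # fixed stratum order for reproducibility
        members = strata[key]
        n = round(len(members) * size) if size < 1 else int(size)
        n = min(n, len(members))  # cannot draw more rows than exist
        sample.extend(rng.sample(members, n))
    return sample


if __name__ == "__main__":
    demo = [{"id": i, "g": "a" if i < 4 else "b"} for i in range(10)]
    picked = stratified_sample(demo, group="g", size=2, seed=1)
    print([(r["g"], r["id"]) for r in picked])
```

Sampling per stratum rather than from the whole table keeps small groups represented, which is the usual reason for stratifying.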

fperez / 00-Setup-IPython-PySpark.ipynb

Last active December 21, 2015 23:48

HowTo for starting an IPython Notebook server

Roberto Maestre rmaestre