Wenjun YANG bladereo

Random Sampling Without Replacement

Last time we talked about rolling unbiased virtual dice; now let's turn to opening virtual booster packs in a collectible card game. If you're a nerd like me, you may be interested in reading about the specifics of how cards are distributed in [Magic: The Gathering][2] and [Yu-Gi-Oh!][3], but our main concern here is to randomly select n distinct items from an array of N options. I'm told that it's also useful in scientific computing or whatever.
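To make the problem concrete, here is a minimal sketch of one common way to pick n items out of N without repeats: a partial Fisher-Yates shuffle. This is not necessarily the approach the rest of this post takes, and the function name `sample_without_replacement` and its `rng` parameter are made up for illustration.

```python
import random


def sample_without_replacement(items, n, rng=None):
    """Return n distinct elements of items, chosen uniformly at random.

    Hypothetical helper: runs only the first n steps of a Fisher-Yates
    shuffle on a copy of the input, so the caller's data is untouched.
    """
    rng = rng or random.Random()
    pool = list(items)
    if not 0 <= n <= len(pool):
        raise ValueError("n must be between 0 and len(items)")
    for i in range(n):
        # Pick a random position from the not-yet-chosen tail [i, N)
        # and swap it into slot i.
        j = rng.randrange(i, len(pool))
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:n]


# Usage: draw a 5-card "booster pack" from a 60-card set.
pack = sample_without_replacement(range(60), 5)
```

Each step only swaps within the unchosen tail, so no item can be picked twice, and the loop does O(n) work after the O(N) copy. Python's standard library offers `random.sample(population, k)` for the same job.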

FWIW: I (@rondy) am not the creator of the content shared here, which is an excerpt from Edmond Lau's book. I simply copied and pasted it from another location and saved it as a personal note before it gained popularity on news.ycombinator.com. Unfortunately, I cannot recall where I originally found it, nor was I able to find the name of whoever wrote these notes, so I can't provide the appropriate credits.

Effective Engineer - Notes

By Edmond Lau
Highly Recommended 👍
http://www.theeffectiveengineer.com/

What's an Effective Engineer?

If you were to give recommendations to your "little brother/sister" on things that they need to do to become a data scientist, what would those things be?

I think the "Data Science Venn Diagram" (http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram) is a great place to start. You need three things to be a good data scientist:

Statistical knowledge
Programming/hacking skills
Domain expertise

	import scala.collection.JavaConverters._
	import org.apache.spark.sql.types.{StructType,StructField,StringType}
	import org.apache.spark.sql.Row



	def identityMatrix(n:Int):Array[Array[String]]=Array.tabulate(n,n)((x,y) => if(x==y) "1" else "0")
	def encodeStringOneHot(table:org.apache.spark.sql.DataFrame,column:String) = {
	//Accepts the dataframe and the target column name. Returns a new dataframe in which the target column has been replaced with a one-hot/dummy encoding.
	table.registerTempTable("temp")

	library(jsonlite)

	cp = fromJSON(txt = "Cell Phone Data.txt", simplifyDataFrame = TRUE)

	num.atts = c(4,9,11,12,13,14,15,16,18,22)

	cp[,num.atts] = sapply(cp[,num.atts], function (x) as.numeric(x))
	cp$aspect.ratio = cp$att_pixels_y / cp$att_pixels_x
	cp$isSmartPhone = ifelse(grepl("smart|iphone|blackberry", cp$name, ignore.case=TRUE) == TRUE | cp$att_screen_size >= 4, "Yes", "No")

	#!/usr/bin/env python
	# -*- coding: utf-8 -*-

	# This is a simplified implementation of the LSTM language model (by Graham Neubig)
	#
	# LSTM Neural Networks for Language Modeling
	# Martin Sundermeyer, Ralf Schlüter, Hermann Ney
	# InterSpeech 2012
	#
	# The structure of the model is extremely simple. At every time step we
