Last time we talked about rolling unbiased virtual dice; now let's turn to
opening virtual booster packs in a collectible card game. If you're a nerd like
me, you may be interested in reading about the specifics of how cards are distributed
in [Magic: The Gathering][2] and [Yu-Gi-Oh!][3], but our main concern here is to
randomly select n items from an array of N options. I'm told that it's also
useful in scientific computing or whatever.
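To make "select n items from an array of N options" concrete, here is a minimal sketch of one common approach, a partial Fisher-Yates shuffle. The function name `sample_without_replacement` and the `rng` parameter are my own choices for illustration, not something from the games above, and in practice Python's built-in `random.sample` already does this job.

```python
import random


def sample_without_replacement(items, n, rng=random):
    """Return n distinct elements of items, each subset equally likely.

    Hypothetical helper for illustration; assumes rng provides randrange().
    """
    if not 0 <= n <= len(items):
        raise ValueError("n must be between 0 and len(items)")
    pool = list(items)  # copy so the caller's array is left untouched
    for i in range(n):
        # Pick a random index from the not-yet-chosen tail and swap it into slot i.
        j = rng.randrange(i, len(pool))
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:n]


if __name__ == "__main__":
    card_pool = list(range(100))  # stand-in for N = 100 card ids
    print(sample_without_replacement(card_pool, 15))  # one 15-card "pack"
```

Only the first n slots are shuffled, so the loop does n swaps rather than N. Passing a seeded `random.Random` instance as `rng` makes a pack reproducible, which is handy for testing.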
FWIW: I (@rondy) am not the creator of the content shared here, which is an excerpt from Edmond Lau's book. I simply copied and pasted it from another location and saved it as a personal note before it gained popularity on news.ycombinator.com. Unfortunately, I cannot recall the exact origin of the original source, nor was I able to find the author's name, so I can't provide the appropriate credits.
- By Edmond Lau
- Highly Recommended 👍
- http://www.theeffectiveengineer.com/
    import scala.collection.JavaConverters._
    import org.apache.spark.sql.types.{StructType,StructField,StringType}
    import org.apache.spark.sql.Row
    def identityMatrix(n:Int):Array[Array[String]]=Array.tabulate(n,n)((x,y) => if(x==y) "1" else "0")
    def encodeStringOneHot(table:org.apache.spark.sql.DataFrame,column:String) = {
      //Accepts the dataframe and the target column name. Returns a new dataframe in which the target column has been replaced with a one-hot/dummy encoding.
      table.registerTempTable("temp")
    library(jsonlite)
    cp = fromJSON(txt = "Cell Phone Data.txt", simplifyDataFrame = TRUE)
    num.atts = c(4,9,11,12,13,14,15,16,18,22)
    cp[,num.atts] = sapply(cp[,num.atts], function (x) as.numeric(x))
    cp$aspect.ratio = cp$att_pixels_y / cp$att_pixels_x
    cp$isSmartPhone = ifelse(grepl("smart|iphone|blackberry", cp$name, ignore.case=TRUE) == TRUE | cp$att_screen_size >= 4, "Yes", "No")
If you were to give recommendations to your "little brother/sister" on things that they need to do to become a data scientist, what would those things be?
I think the "Data Science Venn Diagram" (http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram) is a great place to start. You need three things to be a good data scientist:
- Statistical knowledge
- Programming/hacking skills
- Domain expertise
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # This is a simplified implementation of the LSTM language model (by Graham Neubig)
    #
    # LSTM Neural Networks for Language Modeling
    # Martin Sundermeyer, Ralf Schlüter, Hermann Ney
    # InterSpeech 2012
    #
    # The structure of the model is extremely simple. At every time step we