Spark in me

sample negative words from a distribution between uniform and the corpus unigram distribution (word frequency raised to the 3/4 power), so that very frequent words do not dominate
-- update only a small number of binary classifiers per step (e.g. 5-15: 1 positive sample + k negative) instead of the full softmax
-- skip-gram model in a nutshell - http://prntscr.com/iwfwb2
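
A minimal sketch of the 3/4-power sampling described above, using only the standard library. The toy corpus, the helper names `sampling_probs` and `draw_negatives`, and the choice to exclude the positive pair from the negatives are illustrative assumptions, not taken from the Word2Vec code.

```python
import random
from collections import Counter


def sampling_probs(counts, power=0.75):
    """Normalize count**power into a sampling probability per word."""
    weights = {word: count ** power for word, count in counts.items()}
    total = sum(weights.values())
    return {word: weight / total for word, weight in weights.items()}


def draw_negatives(probs, k, exclude, rng):
    """Draw k negative words (with replacement), skipping words in `exclude`."""
    candidates = [word for word in probs if word not in exclude]
    if not candidates:
        raise ValueError("no candidate words left to sample from")
    weights = [probs[word] for word in candidates]
    return rng.choices(candidates, weights=weights, k=k)


# Toy corpus (assumption): "the" is by far the most frequent word.
corpus = "the cat sat on the mat the dog sat on the log".split()
counts = Counter(corpus)

probs = sampling_probs(counts)                # frequency ** 0.75
unigram = sampling_probs(counts, power=1.0)   # raw frequency, for comparison

# One update: 1 positive pair ("cat", "sat") plus k = 5 negatives.
negatives = draw_negatives(probs, k=5, exclude={"cat", "sat"}, rng=random.Random(0))

print("P(the): unigram", round(unigram["the"], 3), "vs 3/4 power", round(probs["the"], 3))
print("negatives:", negatives)
```

With the 3/4 power the probability of "the" lands between its uniform and its raw-frequency value, while rare words such as "cat" get sampled slightly more often than their raw frequency suggests.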

- GloVe - Global Vectors (2014)
-- http://aclweb.org/anthology/D14-1162
-- supposedly GloVe outperforms Word2Vec given the same resources - http://prntscr.com/iwf9bx
-- in practice word vectors with 200 dimensions are enough for applied tasks
-- considered one of the SOTA solutions now (afaik)

(2) BLEU score for translation
- essentially the exp of the averaged logs of the modified n-gram precisions (n = 1..4), multiplied by a brevity penalty for translations that are too short
- http://prntscr.com/iwe3v2
- http://dl.acm.org/citation.cfm?id=1073135
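
A rough sketch of that formula for a single sentence, assuming whitespace-tokenized input and uniform weights over n = 1..4. The original metric is defined at the corpus level, so this sentence-level version and the function names (`ngrams`, `modified_precision`, `bleu`) are simplifications for illustration only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram matches divided by the number of candidate n-grams."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the maximum count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU: brevity penalty * exp(mean log modified precision)."""
    precisions = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # log(0) is undefined; no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (ties go to the shorter one).
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


reference = "the cat is on the mat".split()
print(bleu("the cat is on the mat".split(), [reference]))
print(bleu("the cat sat on the mat".split(), [reference], max_n=2))
print(modified_precision("the the the the the the the".split(), [reference], 1))
```

The clipping step is what makes the precision "modified": a candidate that repeats "the" seven times only gets credit for the two occurrences in the reference. Without smoothing, a single sentence with no matching 4-gram scores 0, which is one reason the metric is usually reported over a whole corpus.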

(3) Attention is all you need
- http://arxiv.org/abs/1706.03762

To be continued.

#data_science
#nlp
#rnns

Alexander, 09:56