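I recently started an internship and, sadly, have no time left to study. I'm parking some resources here that look very good but that I haven't had time to read, so I can study them later.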
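If the highest level of understanding a technology is being able to explain it in the simplest possible way, then Igor's understanding of the CPU cache has certainly reached that level. His blog post, Gallery of Processor Cache Effects http://t.cn/hrXwvb, uses 7 extremely simple code examples to cover key topics such as cache lines, cache size, and false sharing. Hard not to be impressed.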
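One way to see why cache lines matter is to count how many distinct lines a strided loop touches. The sketch below is my own illustration, not code from Igor's post. It assumes 4-byte integers, 64-byte cache lines, and an array that starts on a line boundary; the function name `cache_lines_touched` is hypothetical.

```python
def cache_lines_touched(n_elems, stride, elem_size=4, line_size=64):
    """Count distinct cache lines touched when visiting every `stride`-th element.

    Assumptions (not from the original post): elements are `elem_size` bytes,
    cache lines are `line_size` bytes, and the array starts on a line boundary.
    """
    lines = {(i * elem_size) // line_size for i in range(0, n_elems, stride)}
    return len(lines)


n = 64 * 1024  # 64Ki four-byte integers = 256 KiB of data
for stride in (1, 2, 4, 8, 16, 32, 64):
    accesses = len(range(0, n, stride))
    print(stride, accesses, cache_lines_touched(n, stride))
```

Under these assumptions, every stride up to 16 touches the same number of cache lines even though the number of element accesses drops by 16x, so the memory traffic stays roughly the same. Only beyond stride 16 does the number of lines touched start to fall.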
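Today's NAACL tutorials included a lecture by Stanford's Richard Socher and Christopher Manning on applying deep learning to NLP. I looked through the slides: compared with last year's ACL version, some new material has been added, and it can be considered a fairly comprehensive tutorial on deep learning for language technology. "Deep Learning for NLP (without Magic)" slides: http://t.cn/zHHyKUo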
Tutorials
UBC's Machine Learning 2013 course
Covers MCMC as well as recent deep learning material
http://www.cs.ubc.ca/~nando/540-2013/lectures.html
Text mining techniques
http://www.icst.pku.edu.cn/course/mining/11-12spring/index.html
RBM code in Java; probably the code that suits my taste best
https://github.com/tjake/rbm-dbn-mnist
The Stanford NLP group has set up a dedicated page for Deep Learning in Natural Language Processing
http://nlp.stanford.edu/projects/DeepLearningInNaturalLanguageProcessing.shtml
The homepage of an expert in the field
This is his teaching page, with lots of material
http://alex.smola.org/teaching/
http://www.cs.princeton.edu/courses/archive/spring10/cos424/w/syllabus
The Large Scale Learning class notes
http://cilvr.cs.nyu.edu/doku.php?id=courses:bigdata:slides:start
Algorithm tutorials
Homepage of a University of Cambridge professor; the PDF on Gaussian processes is very detailed and very good
http://mlg.eng.cam.ac.uk/zoubin/
Variational Bayes tutorial, very nice
http://people.inf.ethz.ch/bkay/talks/Brodersen_2013_03_22.pdf
Hadoop implementations of collaborative filtering and graph mining
https://code.google.com/p/hadoop-network/
Processing big data on a single machine: a collection of handy open-source tools
1. LibFM
2. SVDFeature
Project homepage: http://apex.sjtu.edu.cn/apex_wiki/svdfeature
3. Libsvm and Liblinear
libsvm project homepage: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
liblinear project homepage: http://www.csie.ntu.edu.tw/~cjlin/liblinear/
Required reading for first-time users: practical guide
Chih-Jen Lin's notes on developing libsvm: http://www.csie.ntu.edu.tw/~cjlin/talks/kdd.pdf
4. rt-rank
Project homepage: http://research.engineering.wustl.edu/~amohan/
rt-rank implements random forests and gradient boosted decision trees, two methods commonly used in recommender systems, and is very easy to use.
5. Mahout
Project homepage: http://mahout.apache.org/
6. MyMediaLite
Project homepage: http://www.ismll.uni-hildesheim.de/mymedialite/
7. GraphLab and GraphChi
GraphLab project homepage: http://graphlab.org/
GraphChi project homepage: http://graphlab.org/graphchi/
GraphChi download: https://code.google.com/p/graphchi/downloads/detail?name=graphchi_src_v0.1.2_toolkits.tar.gz
CF for GraphChi: http://bickson.blogspot.com/2012/08/collaborative-filtering-with-graphchi.html
pylearn2
https://github.com/lisa-lab/pylearn2
Has many features and is updated frequently
Training algorithms
- A “default training algorithm” that asks the model to train itself
- Stochastic gradient descent, with extensions including
  - Learning rate decay
  - Momentum
  - Polyak averaging
  - Early stopping
  - A simple framework for adding your own extensions
- Batch gradient descent with line searches
- Nonlinear conjugate gradient descent (with line searches)

Model Estimation Criteria
- Score Matching
- Denoising Score Matching
- Noise-Contrastive Estimation
- Cross-entropy
- Log-likelihood

Models
- Autoencoders, including Contractive and Denoising Autoencoders
- RBMs, including gaussian and ssRBM. Varying levels of integration into the full framework.
- k-means
- Local Coordinate Coding
- Maxout networks
- PCA
- Spike-and-Slab Sparse coding
- SVMs (we provide a wrapper around scikit-learn that makes it easy to train a multiclass svm on dense training data in a memory efficient way, which doesn’t always happen using scikit-learn directly)
- Partial implementation of DBMs (contact Ian Goodfellow if you would like to complete it)

Datasets:
- MNIST, MNIST with background and rotations
- STL-10
- CIFAR-10, CIFAR-100
- NIPS Workshops 2011 Transfer Learning Challenge
- UTLC
- NORB
- Toronto Faces Dataset

Dataset pre-processing
- Contrast normalization
- ZCA whitening
- Patch extraction (for implementing convolution-like algorithms)
- The Coates+Lee+Ng CIFAR processing pipeline

Miscellaneous algorithms and utilities:
- AIS
- Weight visualization for single layer networks
- Can plot learning curves showing how user-configured quantities change during learning
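
To remind myself what the SGD extensions in the list above do (learning rate decay, momentum, Polyak averaging, early stopping), here is a minimal pure-Python sketch on a 1-D toy problem. It does not use pylearn2 and is not its API. The function name, the parameters, and the toy objective (w - 3)^2 with Gaussian gradient noise are all my own assumptions for illustration.

```python
import random


def sgd_with_extensions(grad, loss, w0, lr0=0.1, decay=0.01, momentum=0.5,
                        max_steps=1000, patience=20, tol=1e-9):
    """Toy 1-D SGD with common extensions (illustrative sketch, not pylearn2)."""
    w = w0
    velocity = 0.0
    w_avg = w0                 # Polyak average: running mean of the iterates
    best_loss = float("inf")
    bad_steps = 0
    steps = 0
    for t in range(1, max_steps + 1):
        steps = t
        lr = lr0 / (1.0 + decay * t)                    # learning rate decay
        velocity = momentum * velocity - lr * grad(w)   # momentum update
        w += velocity
        w_avg += (w - w_avg) / t                        # mean of w_1 .. w_t
        current = loss(w)                               # monitored loss
        if current < best_loss - tol:
            best_loss = current
            bad_steps = 0
        else:
            bad_steps += 1
            if bad_steps >= patience:                   # early stopping
                break
    return w, w_avg, steps


# Hypothetical toy problem: minimize (w - 3)^2 using noisy gradients.
rng = random.Random(0)
noisy_grad = lambda w: 2.0 * (w - 3.0) + rng.gauss(0.0, 0.1)
true_loss = lambda w: (w - 3.0) ** 2

w, w_avg, steps = sgd_with_extensions(noisy_grad, true_loss, w0=0.0)
print(w, w_avg, steps)
```

Here the Polyak average runs from the very first step, so it still carries the initial transient. In practice the average is often started only after a burn-in period, and early stopping usually watches a held-out validation loss, not the training objective.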