Machine Discovery (3 credits)
Instructor: Prof. Shou-de Lin (sdlin@csie.ntu.edu.tw)
Classroom: CSIE 107
Meeting Time: Friday 9:10am-12:10pm
Office Hours: By appointment
TA: Wei-Chi Lai (r94163@csie.ntu.edu.tw)
Course Description:
This course discusses how machines can automatically perform (or assist
humans in performing) discovery tasks. It covers three main themes: the
instructor will go through several promising modeling techniques for machine
discovery (MD), discuss some useful computational methods for MD, and
introduce several exemplary or classic MD systems. Students are expected not
only to comprehend the theoretical issues behind machine discovery but also
to gain hands-on experience in designing a discovery system.
Please also refer to "Machine-Discovery - the Popular Science View".
Grading:
Homework and Presentation: 30%
Programming Assignments: 35%
Final Project: 35%
Recommended Readings:
B1: "Machine Discovery", Jan Zytkow, 1997
B2: "Knowledge Discovery and Measures of Interest", Robert J. Hilderman, Howard J. Hamilton, 2001
Papers to be presented:
Machine Discovery Principle Group:
MD1: Herbert A. Simon, "Machine Discovery" and comments (B1, pp. 171-224)
MD2: Wei-Min Shen, "The Process of Discovery" (B1, pp. 233-251)
MD3: S. Borrett et al., "A method for representing and developing process models", Ecological Complexity, 2007
MD4: P. Langley et al., "Constructing explanatory process models from biological data and knowledge", AI in Medicine, 2006
Link Discovery and Anomaly Detection Group:
LA1: William Eberle and Lawrence Holder, "Discovering Structural Anomalies in Graph-Based Data", ICDM 2007
LA2: L. Backstrom et al., "Group Formation in Large Social Networks: Membership, Growth, and Evolution", KDD 2006
LA3: Bo Long, Xiaoyun Wu, Zhongfei Zhang, Philip Yu, "Unsupervised Learning on K-partite Graphs", KDD 2006
LA4: J. Neville and D. Jensen, "Relational Dependency Networks", Journal of Machine Learning Research, 2007
Language Model Group:
LM1: R. Iyer, M. Ostendorf, "Modeling Long Distance Dependence in Language: Topic Mixtures vs. Dynamic Cache Models", ICSLP '96
LM2: Ronald Rosenfeld, "Two Decades Of Statistical Language Modeling: Where Do We Go From Here?", 2000
LM3: P. Brown et al., "Class-Based N-Gram Models of Natural Language", Computational Linguistics, vol. 18, no. 4, pp. 467-479, 1992
LM4: Kai-Fu Lee, Mingjing Li, Zheng Chen, "Discriminative training on language model", 2000, MSRA
LM5: Jianfeng Gao, Kai-Fu Lee, Mingjing Li, "N-gram distribution based language model adaptation", 2000, MSRA
Unsupervised Learning Group:
Syllabus:
Introduction
Week 1 | Course introduction: what, why, how, grades, SOP for MD |
Modeling techniques, unsupervised methods and Exemplary Discovery Systems
Week 2 | Probabilistic Graphical Model (FSA, LM, BN) |
Week 3 | Hidden Markov Model, Viterbi Algorithm | Assignment 1 out
Week 4 | Expectation-Maximization Algorithm (1) |
Oct 19 | Expectation-Maximization Algorithm (2) | Assignment 1 due, Assignment 2 out
Oct 26 | Unsupervised Labeling (Semantic role labeling, Word Sense Disambiguation, POS Tagging) + Decipherment |
Nov 2 | Discovery in Multi-relational Networks + Explanation-based Discovery |
Nov 16 | Clustering (by Prof. Chien-Yu Chen) | Assignment 2 due, homework (short essay) out
Nov 23 | Social Networks Analysis + Interestingness measure | Project Proposal Due
Paper and Project Presentations
Nov 30 | Project Proposal Presentation |
Dec 7 | Language Model Group |
Dec 14 | Unsupervised Learning Group |
Dec 21 | Link Discovery and Anomaly Detection Group |
Dec 28 | Machine Discovery Principle Group |
Jan 4 | Final Project Presentation |
Jan 11 | Final Project Presentation | Final Project Report Due, Homework (Short Essay) Due