<html>
<head>
<title>William W. Cohen</title>
<!-- script type="text/javascript" src="http://shots.snap.com/snap_shots.js?ap=1&key=f189a52ff115e29092c9f9bb3678047a&sb=1&th=orange&cl=0&si=0&po=0&df=0&oi=0&lang=en-us&domain=wcohen.com"></script -->
<link rel="stylesheet" type="text/css" href="style.css"/>
</head>
<body bgcolor="white">
<table>
<tr>
<td>
<img align=center src="william-at-whiteboard-small.JPG" height="auto" width="75%" alt="Picture of William Cohen">
</td>
<td>
<h2 class="name">William W. Cohen</h2>
<h3 class="title">Visiting Professor, <a href="http://ml.cmu.edu">Machine Learning Dept, CMU</a></h3>
</td>
</table>
<table><tr><td>
<p>[
<a class="nav" href="#bio">Bio</a> |
<a class="nav" href="#announce">Announcements and FAQs</a> |
<a class="nav" href="#teach">Teaching</a> |
<!-- <a class="nav" href="#proj">Projects</a> | -->
<a class="nav" href="#pubs">Publications</a> (<a class="nav" href="pubs-s.html">recent</a>, <a class="nav" href="pubs.html">all</a>) |
<a class="nav" href="#sw">Software</a> |
<a class="nav" href="#data">Datasets</a> |
<a class="nav" href="#talks">Talks</a> |
<a class="nav" href="#buddies">Students & Colleagues</a> |
<!-- <a class="nav" href="http://wcohen.blogspot.com">Blog</a> | -->
<a class="nav" href="#misc">Other Stuff</a>
]
<tr><td>Prospective visitors/students: see <a href="#announce">announcements</a>
</table>
<h3 class="sec"><a name="bio"></a>Biography</h3 class="sec">
William Cohen is a Visiting Professor at Carnegie Mellon University in
the <a href="http://www.ml.cmu.edu">Machine Learning Department</a>.
He also holds a position as a Principal Scientist at Google, where he
worked full-time between May 2018 and March 2024. He received his
bachelor's degree in Computer Science from
<a href="http://www.duke.edu">Duke University</a> in 1984, and a PhD
in Computer Science from <a href="http://www.rutgers.edu">Rutgers
University</a> in 1990. From 1990 to 2000 Dr. Cohen worked at
AT&T <a href="http://www.bell-labs.com/">Bell Labs</a> and
later <a href="http://www.research.att.com">AT&T Labs-Research</a>,
and from April 2000 to May 2002 Dr. Cohen worked
at <a href="http://www.whizbang.com">Whizbang Labs</a>, a company
specializing in extracting information from the web. From 2002 to
2018, Dr. Cohen worked at Carnegie Mellon University in
the <a href="http://www.ml.cmu.edu">Machine Learning Department</a>,
with a joint appointment in
the <a href="http://www.lti.cs.cmu.edu">Language Technology
Institute</a>.
<p>
Dr. Cohen is a past president of
the <a href="http://www.machinelearning.org/">International Machine
Learning Society</a>. In the past he has also served as an action
editor for
the <a href="http://secure.aidcvt.com/mcp/searchresult.asp?INPUT=AI&Type=Pass&PCS=MCP">AI
and Machine Learning</a> series of books published
by <a href="http://www.morganclaypool.com/">Morgan Claypool</a>, for
the
journal <a href="http://pages.stern.nyu.edu/~fprovost/MLJ/"><i>Machine
Learning</i></a>, the
journal <a href="http://www.elsevier.com/locate/artint"><i>Artificial
Intelligence</i></a>, the <a href="http://www.jmlr.org"><i>Journal of
Machine Learning Research</i></a>, and
the <a href="http://www.jair.org"><i>Journal of Artificial
Intelligence Research</i></a>. He was General Chair for
the <a href="http://icml2008.cs.helsinki.fi/">2008 International
Machine Learning Conference</a>, held July 6-9 at
the <a href="http://www.helsinki.fi/university">University of
Helsinki</a>,
in <a href="http://cc.oulu.fi/~thu/personal/Finland.html">Finland</a>;
Program Co-Chair of
the <a href="http://www.autonlab.org/icml2006/home.html">2006
International Machine Learning Conference</a>; and Co-Chair of
the <a href="http://www.cs.rutgers.edu/pub/learning94/learning94.html">1994
International Machine Learning Conference</a>. Dr. Cohen was also the
co-Chair for the <a href="http://www.icwsm.org/2009/index.shtml">3rd
Int'l AAAI Conference on Weblogs and Social Media</a>, which was held
May 17-20, 2009 in San Jose, and was the co-Program Chair for
the <a href="http://www.icwsm.org/2010/index.shtml">4th Int'l AAAI
Conference on Weblogs and Social Media</a>. He is
a <a href="http://www.aaai.org/Awards/fellows-list.php">AAAI
Fellow</a>, and was a winner of the 2008
<a href="http://www.sigmod.org/sigmod-awards/sigmod-awards#time">SIGMOD
"Test of Time" Award</a> for the most influential SIGMOD paper of
1998, the
2014 <a href="http://sigir.org/sigir-2014-best-paper-awards/"> SIGIR
"Test of Time" Award</a> for the most influential SIGIR paper of
2002-2004, and the 2023 Semantic Web Science
Association's <a href="https://swsa.semanticweb.org/content/swsa-ten-year-award">Ten-Year
Award</a> for the most influential paper of the ISWC-2013 conference.
<p>
Dr. Cohen's research interests include question answering,
machine learning for NLP tasks, and neuro-symbolic reasoning, and he
has a long-standing interest in statistical relational learning. He
holds seven patents related to learning, discovery, information
retrieval, and data integration, and is the author of more than 300
publications.
<!-- <h3 class="sec"><a name="cv">Curriculum vita</cv></h3 class="sec">
<ul>
<li><a href="cv.pdf">My c.v. in PDF.</a>
</ul>
-->
<h3 class="sec"><a name="announce"></a>Announcements and FAQs</h3 class="sec">
<ul>
<li>May 2024: A new edition of <i>A Computer Scientist's Guide To
Biology</i> will be out later this summer! More information and an
excerpt are available from
my <a href="https://charleskcohen.com/science-writing/">co-author's
website</a>. I'm possibly biased but I
think <a href="https://charleskcohen.com">Charles Cohen</a> did a
great job with the update - the book is still quite compact, but
pretty much the whole book has been rewritten and updated. For
example the new version includes several new chapters on topics
like CRISPR which weren't even a thing back in 2007.
<ul><li>On a related note, here's a <a href="https://charleskcohen.com/ai-madness/">nice non-technical
description of LLM hallucinations</a> written by Charlie
(based in part on an interview with Vidhisha Balachandran).
</ul>
<li>March 2024: As you can see from my updated bio above, I have
returned to CMU's ML department full-time (although I still have a
20% involvement at Google, so that email will work!). I'm really
looking forward to re-engaging with my friends and colleagues at CMU.
<li>Nov 2023: I'm honored to report that the paper <a href="https://link.springer.com/chapter/10.1007/978-3-642-41335-3_34">Knowledge
Graph Identification</a>, written by Jay Pujara, Hui Miao, Lise Getoor and myself,
won a <a href="https://iswc2023.semanticweb.org/awards/">10-year best paper award</a> at
the International Semantic Web Conference, 2023.
<li>May 2023: I'm very honored to report that one of
the <a href="https://arxiv.org/abs/2209.12153">papers</a> I
co-authored at EACL 2023 (with Julian Eisenschlos, Jeremy Cole, and
Fangyu Liu) won an Outstanding Paper Award.
</ul>
<!--
<h3 class="sec"><a name="proj">Projects</a></h3 class="sec">
Projects I'm currently involved with include:
<ul>
<li><a href="http://curtis.ml.cmu.edu/gnat/">GNAT is an automatic KB
construction toolkit</a> that has been used to build KBs for several
different domains,
including <a href="http://curtis.ml.cmu.edu/gnat/biomed">consumer
health information</a>
and <a href="http://curtis.ml.cmu.edu/gnat/software">software</a>.
<li><a href="http://rtw.ml.cmu.edu/rtw/">NELL</a> is a web-scale
information extraction system.
</ul>
-->
<!-- <li><a href="http://sites.google.com/site/simstudentprojectweb/">SimStudent</a>, a project that adds learning-by-demonstration to <a href="http://ctat.pact.cs.cmu.edu/">CTAT</a>. -->
<!-- <li><a href="querendipity/">Querendipity</a>, an adaptive personal information management system for biologists. -->
<!--
<li><a href="http://boowa.com">SEAL</a>, a Google-Sets-like bootstrapping tool written by my former student, <a
href="http://rcwang.com">Richard Wang</a>. -->
<!--
<li><a href="http://murphylab.web.cmu.edu/services/SLIF2/">SLIF</a>, a system that analyzes the text and images
in online journal articles to find information about the subcellular localization of proteins. -->
<!--
<li><a href="http://teamcohen.github.com/MinorThird/">Minorthird</a>,
an open-source Java package of information extraction software. (Note: we've
migrated the code now from SourceForge to GitHub.)
-->
<h3 class="sec"><a name="teach"></a>Teaching</h3 class="teach">
For now <a href="http://www.cs.cmu.edu/~wcohen/">my old course
notes and lectures</a> are available through CMU.
<h3 class="sec"><a name="sw">Software and demos</a></h3 class="sec">
<ul>
<li><a href="http://www.cs.cmu.edu/~enron">Enron email dataset</a>
(400Mb, once you get there) contains 800,000+ emails from 150+ users
organized into 4700+ folders.
<li><a href="classify.tar.gz">classify.tar.gz</a> (0.4Mb) contains
nine problems in which the goal is to classify short entity names.
This data was used in <i>Joins that Generalize: Text Classification
Using WHIRL</i> (KDD-98).
<li><a href="match.tar.gz">match.tar.gz</a> (0.7Mb) contains a suite of
<i>labeled</i> entity-name matching and clustering problems
(i.e. problems for which the correct matches/clusters are provided),
in a single consistent format. In most cases WHIRL's performance is
given as a benchmark. (These are also distributed in the <a
href="http://www.cs.utexas.edu/users/ml/riddle/data.html">RIDDLE
Repository</a>. Extraction-oriented versions of some of this data are
available on the <a
href="http://www.isi.edu/info-agents/RISE/repository.html">RISE
Repository</a>. (I.e., represented as a problem of extracting data from
a website, rather than matching two datasets).)
<li><a href="whirl-bench.tgz">whirl-bench.tgz</a> (1.1Mb) contains some
more WHIRL-format entity name matching problems.
</ul>
<h3 class="sec"><a name="talks">Talks and presentations</a></h3 class="sec">
<p>
<ul>
<li><a href="https://www.youtube.com/watch?v=JsB4T35We0w">Video for an invited talk, KR for KBQA and KBC</a> given at AKBC, June 2020.
(This is the whole first day of the conference, my talk starts about 17:42, after the opening remarks.)
<li><a href="kr-2018.pptx">An invited talk </a> given at
the <a href="http://ilp2018.unife.it/">16th International
Conference on Principles of Knowledge Representation</a> on Oct 27th -
Nov 2nd 2018, in Tempe, Arizona.
<li><a href="ilp-2018.pptx">An invited talk </a> given at
the <a href="http://ilp2018.unife.it/">28th International Conference
on Inductive Logic Programming</a> on September 2nd - 4th 2018, in
Ferrara, Italy (there's
also <a href="https://drive.google.com/file/d/1yU1EyIVwBgnW-1RS7uL0GzUjnciVQf6O/view">video
for this talk</a>).
<li><a href="declarative-learning-workshop-2018.pptx">An invited talk</a>
given at the Third International Workshop on Declarative Learning Based
Programming (DeLBP), at AAAI-2018.
<li><a href="snl-2017.pptx">An invited talk given at SNL-2017</a> (the 1st International Workshop on Symbolic-Neural Learning) in July 2017.
<li><a href="wakbc-2016.pptx">An invited talk given at WAKBC-2016</a> in June 2016.
<li>Tutorial on statistical relational learning given at NAACL 2016 with
William Wang (a shorter version of this was also presented at IJCAI 2016):
<ul>
<li><a href="naacl-2016-talk1-final.ppt">Part 1 - overview on logic, probability, MLNs, and probabilistic DDBs</a>
<li><a href="naacl-2016-talk2-final.pptx">Part 2 - ProPPR and applications</a>
<li><a href="naacl-2016-talk3-final.ppt">Part 3 - TensorLog, and other recent and current work</a>
</ul>
<li>Series of three lectures on probabilistic logic programs given at
Singapore Management University in Feb 2016:
<ul>
<li><a href="smu-2016-talk1.pptx">Background on logic and probabilistic models</a></li>
<li><a href="smu-2016-talk2.pptx">Parameter learning and structure learning in ProPPR</a></li>
<li><a href="smu-2016-talk3.pptx">Joint learning in ProPPR and comparing to neural approaches</a></li>
</ul>
<li><a href="aaai-ss-2015.ppt">Can KR Represent Real-World Knowledge?</a>, invited talk given March 2015
at the AAAI Spring Symposium on Knowledge Representation and Reasoning: Integrating Symbolic and Neural Approaches
<li><a href="nlu-2014.ppt">Learning to Reason with Extracted Information</a>, keynote talk given March 2014
at Google's Natural Language Understanding Workshop, Zurich, Switzerland.
<li><a href="ilp-2013.ppt">Learning to Construct and Reason with a
Large KB of Extracted Information</a>, invited talk given August 2013
at the Inductive Logic Programming Conference, in Rio de Janeiro,
Brazil.
<li><a href="aaai-fs-2012.ppt">Reasoning With Data Extracted from The Biomedical Literature</a>,
invited talk at a joint session of the AAAI Fall Symposia on Discovery Informatics, and
Information Retrieval and Knowledge Discovery in Biomedical Text.
<li><a href="cikm-2012.ppt">Learning Similarity Relations Based on Random Walks in Graphs</a>,
invited talk at CIKM 2012, October, 2012.
<ul>
<li>Earlier version of talk: <a href="mlg-aug-2011.ppt">Learning Relationships Defined by
Linear Combinations of Constrained Random Walks</a>, invited talk at
the <a href="http://www.cs.purdue.edu/mlg2011/">9th Workshop on
Machine Learning and Graphs</a>, San Diego, CA, Aug 2011.
</ul>
<li><a href="lti-colloq-2012.ppt">Fast Effective Clustering for Graphs and Documents</a>, given at CMU's LTI Colloquium Feb 10, 2012.
<ul>
<li>Earlier versions given
at <a href="FastEffectiveClustering-v2.ppt">Virginia Tech in April
2010</a> and
<a href="FastEffectiveClustering.ppt">University of Pennsylvania
in Feb 2010.</a>
</ul>
<li><a href="psc-11-cohen.ppt">Learning to Extract a Broad-Coverage
Knowledge Base from the Web</a>, invited talk at the Symposium on
Data-Intensive Analysis, Analytics, and Informatics, Pittsburgh, PA Apr 2011.
<li><a href="nfais-11-cohen.ppt">Open Information Extraction Methods:
Computers that Learn to Read</a>, invited talk at National Federation
of Advanced Information Services (NFAIS), Philadelphia, PA, Feb 2011.
<li><a href="umd-sep-2010.ppt">Learning Proximity Relations Defined by
Linear Combinations of Constrained Random Walks</a>, given at a
seminar at the University of Maryland in Sep 2010.
<li><a href="block-lda-icml-ws-2010.ppt">Modeling Entity-Entity Links
and Entity-Annotated Text</a>, given at the ICML 2010 Workshop on
Topic Modeling.
<li><a href="MSM-2009.ppt">Predictively Modeling Social Media</a>,
invited talk given at
<a href="http://www.socialgamingplatform.com/msm09/">the 1st International Workshop on Mining Social Media</a>, co-located with 13th Conference of the Spanish Association for Artificial Intelligence (CAEPIA-TTIA 2009).
<li><a href="IIWeb.ppt">Matching and clustering product descriptions
using learned similarity metrics</a>, invited talk given at
<a href="http://research.ihost.com/iiweb09/index.html">the IJCAI 2009 Workshop on Information Integration on the Web</a>, July 2009. (Powerpoint; 6.7M)
<li>Open information extraction talks:
<ul>
<li><a href="openIE-spain-2009.ppt">Graph-Based Methods for Open Information Extraction</a>, talk given in Nov 2009 at MAVIR in Madrid, Spain.
<li><a href="openIE-2009.ppt">Graph-Based Methods for Open Information Extraction</a>, earlier version of talk given at Stanford and Google March 2009.
<li><a href="nips-graph-ws-2008.ppt">Graph-Based Methods for Open Information Extraction</a>,
still earlier version of the same talk given at a 2008 NIPS workshop.
<li>A <a href="nipsgraphs2008_workshop_skit.mov">QT video of highlights</a> from the workshop talks, including an incisive technical question addressed to me from my colleague <a href="http://www.stat.cmu.edu/~fienberg/">Steve Fienberg</a>.</li>
</ul>
<li><a href="sigmod-08.ppt">Embodied Cognition and Knowledge:
Integration of Heterogeneous Databases without Common Domains Using
Queries Based on Textual Similarity</a>, talk given for my 10-year
"Test of Time" Award at <a
href="http://www.sigmod08.org/">SIGMOD-2008</a>(Powerpoint; 11Mb)</li>
<li><a href="linkedData-2008.ppt">Using Machine Learning to Discover
and Understand Structured Data</a>, invited talk given at <a
href="http://www.linkeddataplanet.com">LinkedData
2008</a>. (Powerpoint; 6Mb)</li>
<li><a href="icmla-2007.ppt">Machine Learning for Personal Information
Management</a>, invited talk given at <a
href="http://www.icmla-conference.org/icmla07/icmla07.html">ICMLA-2007</a>. (Powerpoint; 8Mb)</li>
<li><a href="iqis.ppt">A Framework for Learning to Query Heterogeneous Data</a>,
invited talk given at <a href="http://queens.db.toronto.edu/iqis2006/">IQIS 2006</a>. (Powerpoint; 8Mb)</li>
<li><a href="dbirday-06.ppt">On Beyond Hypertext: Searching in Graphs
Containing Documents, Words, and Actual Data</a>, invited talk given
at <a href="http://dbirday2006.rutgers.edu/">DB/IR Day 2006.</a> (Powerpoint; 6Mb)</li>
<li><a href="webdb-talk.ppt">A Century Of Progress On Information
Integration: A Mid-Term Report</a>, an overview of information
integration, focusing modestly on my own work, given as an invited
talk at <a
href="http://webdb2005.uhasselt.be/">WebDB-2005</a>. (Powerpoint;
12Mb)</li>
<p>
<li>Tutorials:
<ul>
<li><a href="ie-survey.ppt">Information extraction</a> (PowerPoint;
4.8Mb), aimed at folks somewhat familiar with statistical NLP
methods. And thanks to Thierry Poibeau, there's also a version <a
href="http://www-lipn.univ-paris13.fr/~poibeau/cours/fr_cohen_ie_tutorial.ppt"><i>en francais</i></a> (did I get that right, Thierry?)
Two earlier versions of this are also still around, both
given with Andrew McCallum at recent conferences, <a
href="kdd2003-tutorial.ppt">KDD-2003</a>(PowerPoint; 6.8Mb) and <a
href="nips-ie-tutorial.ppt">NIPS-2002</a>.
<li><a href="text-cat-tutorial.ppt">Text classification</a>
(PowerPoint; 3Mb), given at a CALD Summer Course.
<li><a href="collab-filtering-tutorial.ppt">Collaborative
filtering</a> (PowerPoint; 9.1Mb), given at a DIMACS workshop.
</ul>
<p>
<li>A mini-course on record linkage and matching:
<ul>
<li><a href="Matching-1.ppt">Overview of record linkage methods</a> (PowerPoint; 250kb).
<li><a href="Matching-2.ppt">Overview of distance metrics for strings</a> (PowerPoint; 530kb).
<li><a href="Matching-3.ppt">Overview of using HMMs for normalizing
text in record linkage tasks</a>(PowerPoint; 640kb). <br>
It's not a presentation, but I have also put together a <a
href="matching/">short annotated bibliography of record linkage and
matching papers</a>.
<li>William Hayes has a nice summary of <a href="http://blog.williamhayes.org/2012/07/string-similarity.html">an extended discussion
of string-matching tools</a> on the BioNLP mailing list (July 2012).
</ul>
<p>
<li>Other technical talks:
<ul>
<li><a href="ijcai-2005.ppt">A presentation of my IJCAI-2005 results</a>
on "stacked sequential learning", presented in Edinburgh in August, 2005.
<li><a href="nips-2002.ppt">A presentation of my NIPS-2002 results</a>
on using bootstrapping techniques to improve web page classification,
given at CMU in October 2002. (PowerPoint; 3.2mb).
<li><a href="www-2002.pdf">A presentation of my WWW-2002 results</a>
on wrapper learning,
presented in April 2002. (PDF; 170kb).
<li><a href="whirl-talk.pdf">An overview of experiments with WHIRL.</a> (PDF; 800kb).
</ul>
</ul>
<h3 class="sec"><a name="pubs">Publications</a></h3 class="sec">
<ul>
Here are pointers to <a href="http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/c/Cohen:William_W=.html">my DBLP page</a>,
<a href="https://scholar.google.com/citations?user=8ys-38kAAAAJ&hl=en&oi=sra">my Scholar page</a>,
and <a href="https://arxiv.org/search/cs?searchtype=author&query=Cohen%2C+W+W">my Arxiv page</a>.
<li><i>A Computer Scientist's Guide To Biology</i> is no longer
available from this web page, but is now <a
href="http://www.springer.com/west/home/generic/search/results?SGWID=4-40109-22-173702304-0">available from Springer</a>. Here are <a
href="GuideToBiology-sampleChapter-release1.4.pdf">the TOC,
introduction, index, and a sample chapter</a>, from a late draft of
the book; and also <a
href="GuideToBiology-pictures-color-release1.5.ppt">all the figures
from the book in PowerPoint</a> and <a
href="GuideToBiology-pictures-color-release1.5.pdf">all the figures in
PDF</a>. (The figures are a little prettier than the ones in the
final book, which is black and white, not color).
<li><a
href="http://shop.omnipress.com/index.asp?PageAction=VIEWPROD&ProdID=33">ICML
2006 Proceedings</a> are available in print, for the true aficionado
of fine learning-related research. It's well worth the money for the
cover art alone (of course, all the papers are also available <a
href="http://www.autonlab.org/icml2006/technical/accepted.html">on-line
for free</a>.)
<li><a href="pubs-s.html">Recent and selected publications</a>. These
are some representative publications for which on-line copies can be
distributed.
<li><a href="pubs.html">All publications</a>. Here is a more-or-less
complete chronological list of my publications. The bibliography
includes pointers to on-line versions when I can provide them, but
unfortunately copyright restrictions don't allow me to make all of my
publications available on-line. Of course, reprints are always
available from me on request.
<li>Publications by topic:<img
src="cover.png" height=200 width=150 align="right"/><img height=200 src="icml-cover.png" align="right"/>
<ul>
<li><a href="pubs-m.html">Matching/Data Integration</a>
<li><a href="pubs-t.html">Text categorization</a>
<li><a href="pubs-x.html">Information Extraction</a>
<li><a href="pubs-r.html">Rule Learning</a>
<li><a href="pubs-c.html">Collaborative Filtering</a>
<li><a href="pubs-a.html">Applications</a>
<li><a href="pubs-f.html">Formal Results</a>
<li><a href="pubs-i.html">Inductive Logic Programming</a>
<li><a href="pubs-e.html">Explanation-Based Learning</a>
</ul>
</ul>
Recent papers I'm keeping in HTML or PDF (which requires <a
href="http://www.adobe.com/prodindex/acrobat/readstep.html">Adobe
Acrobat Reader</a> to view). Older papers are mostly in Postscript.
For Windows, I use the <a
href="http://www.cs.wisc.edu/~ghost/gsview/">GSView</a> reader for
postscript. Most of these papers are viewable in several formats in
<a href="http://www.researchindex.com">ResearchIndex</a>.
<h3 class="sec"><a name="buddies">Students and other colleagues</a></h3 class="sec">
<!-- Other: -->
Current students:
<p>
<!-- Students: -->
<ul>
<li>Daniel Spokoyny, LTI PhD student, co-supervised with Taylor Berg-Kirkpatrick.
</ul>
<p>
Former students/colleagues:
<p>
<ul>
<li>
<a href="http://www.cs.cmu.edu/~krivard/">Katie Rivard Mazaitis</a>, research programmer/analyst
<p>
<li><a href="https://leejayyoon.github.io/">Jay Yoon Lee</a>, CSD PhD student, co-advised with Jaime Carbonell, now a postdoc at UMass/Amherst.
<li><a href="http://www.cs.cmu.edu/~bdhingra/">Bhuwan Dhingra</a>, LTI PhD student, co-advised with Ruslan Salakhutdinov, now an Assistant Professor at Duke University.
<li><a href="http://kimiyoung.github.io/">Zhilin Yang</a>, former LTI PhD student, co-advised with Ruslan Salakhutdinov, now an Assistant Professor at Tsinghua University.
<li><a href="https://sites.google.com/site/rosecatherinek/home">Rose Catherine Kanjirathinkal</a>, former LTI PhD student, now at Yahoo.
<li><a href="http://www.cs.cmu.edu/~fanyang1/">Fan Yang</a> (former MLD PhD student)
<li><a href="http://www.cs.cmu.edu/~yww/">William Yang Wang</a>, former LTI PhD student, now an Assistant Professor at UCSB.
<li><a href="http://www.cs.cmu.edu/afs/cs/Web/People/dmovshov/">Dana Movshovitz-Attias</a>, former CSD PhD student,
now at Google.
<li><a href="http://www.cs.cmu.edu/afs/cs/Web/People/bbd/">Bhavana Dalvi Mishra</a>, former LTI PhD student
(co-advised with <a href="http://www.cs.cmu.edu/~callan/">Jamie Callan</a>), now at AI2.
<li><a href="http://www.cs.cmu.edu/~taey/">Tae Yano</a>, former LTI
PhD student, co-advised
with <a href="http://www.cs.cmu.edu/~nasmith/">Noah Smith</a>.
<li><a href="http://www.cs.cmu.edu/~nli1">Nan Li</a>, former CSD PhD
student, co-advised
with <a href="http://pact.cs.cmu.edu/koedinger.html">Ken
Koedinger</a>, now at D. E. Shaw.
<li><a href="http://www.cs.cmu.edu/~rbalasub/">Ramnath Balasubramanyan</a>, former LTI PhD student.
<li><a href="http://www.cs.cmu.edu/~maheshj/">Mahesh Joshi</a>, former LTI PhD student,
co-advised with <a href="http://www.cs.cmu.edu/~cprose/">Carolyn Rosé</a>
<li><a href="http://www.cs.cmu.edu/~frank/">Frank Lin</a>, former LTI PhD student
<li><a href="http://www.cs.cmu.edu/~nlao/">Ni Lao</a>, former LTI PhD student
<li><a href="http://www.cs.cmu.edu/~rcwang">Richard C. Wang</a>, former LTI PhD student co-advised with <a
href="http://www.cs.cmu.edu/~ref/">Bob Frederking</a>.
<li><a href="http://www.cs.cmu.edu/~aarnold/">Andrew Arnold</a>, former MLD PhD student, now at Amazon.
<li><a href="http://www.cs.cmu.edu/~einat">Einat Minkov</a>, former LTI PhD student, now at Haifa University.
<li><a href="http://www.cs.cmu.edu/~vitor">Vitor Rocha de Carvalho</a>, former LTI PhD student, now at Snapchat.
<li><a href="http://www.cs.cmu.edu/~woomy/">Zhenzhen Kou</a>, former MLD PhD student, now at Google.
<p>
<li><a href="http://www.cs.cmu.edu/~yifengt/">Yifeng Tao</a>, CMU Comp Bio PhD student, now supervised by Russell Schwartz.
<li><a href="http://www.cs.cmu.edu/~eairoldi">Edoardo Airoldi</a>
former MLD/Stats PhD student, co-advised with <a href="http://www.stat.cmu.edu/~fienberg/">Steve Fienberg</a>, now at Harvard.
<li><a href="http://www.cs.cmu.edu/~pradeepr">Pradeep Ravikumar</a>
former MLD PhD student, co-advised with <a href="http://www.stat.cmu.edu/~fienberg/">Steve Fienberg</a>, now at CMU.
<p>
<li>Haitian Sun, former MLD MS student, now a PhD student at CMU.
<li><a href="http://www.cs.cmu.edu/~fanyang1/">Fan Yang</a>, MLD MS student.
<li><a href="https://andy-jqa.github.io/">Qiao Jin</a>, School of Medicine, Tsinghua University,
now supervised by Xinghua Lu.
<li>Ezra Winston, former MLD Master's student, now a PhD student in MLD.
<li>Lanxio (Karen) Xu, former MLD Master's student.
<li>Yuxing Zhang, former MLD Master's student.
<li>Jakob Bauer, former MLD 5th-year Master's student, now at Google.
<li>Kavya Srinet, former MCDS Master's student.
<li>Bhawna Juneja, former MCDS Master's student.
<li>Tom Shen, former CMU CSD undergrad.
<li>Yu-Hsin Allen Kuo, former LTI MLT student, formerly co-advised with <a href="http://www.cs.cmu.edu/~nmiskov/Natasas_website/Home.html">Natasa Miskov-Zivanov</a>
<li>Rahul Goutam, former LTI MLT student, co-advised with <a href="http://www.cs.cmu.edu/~nmiskov/Natasas_website/Home.html">Natasa Miskov-Zivanov</a>
<li><a href="https://plus.google.com/102262489142071513958/posts">Malcolm Greaves</a>, former CSD master's student.
<li>Wen Haw Chong, PhD student at Singapore Management University,
visited CMU in 2015-2016.
<li><a href="http://www2.sis.smu.edu.sg/students/phd/class10/10_hoang_tuananh.asp">Tuan
Anh Hoang</a>, PhD student at Singapore Management University,
visited CMU for 2012-2013 academic year.
<li><a href="http://freddychua.com/">Freddy
Chong Tat Chua</a> PhD student at Singapore Management University,
visited CMU for the academic year 2011-2012.
<p>
<li><a href="http://www.optimizelife.com/">Gustavo Lacerda</a>
former research assistant, co-supervised with Noboru Matsuda and Ken Koedinger.
<li><a href="http://www.cs.cmu.edu/~lbing/">Lidong Bing</a>, former
postdoc.
<li><a href="https://sites.google.com/site/rameshnallapati/">Ramesh Nallapati</a>
former postdoc, co-supervised with <a
href="http://www.cs.cmu.edu/~lafferty/">John Lafferty</a>.
<li><a href="http://www.cs.cmu.edu/~mazda">Noboru Matsuda</a>
former postdoc, co-supervised with <a href="http://pact.cs.cmu.edu/koedinger.html">Ken Koedinger</a>.
<p>
<li><a href="http://www.csie.ncu.edu.tw/~chia/">Ja-Hui Chang</a>
visiting faculty from National Central University, Taiwan, 2007-2008.
<!-- External members -->
<p>
<li>I have been an external committee member for the PhD theses of
<ul>
<li><a href="http://mcsp.wartburg.edu/zelle/">John Zelle</a> (degree
from U Texas)
<li><a href="http://research.microsoft.com/en-us/um/people/mbilenko/">Misha
Bilenko</a> (from U Texas)
<li><a href="http://www-users.cs.york.ac.uk/~kudenko/">Daniel Kudenko</a>
(Rutgers)
<li>Chumki Basu (Rutgers)
<li>Ananlada Chotimongkol (CMU)
<li>Wei-Hao Lin (CMU)
<li>Cenk Gazen (CMU)
<li>David Nadeau (U Ottawa)
<li><a href="http://cs.cmu.edu/~htong">Hanghang Tong</a> (CMU)
<li>Ben van Durme (Rochester)
<li><a href="http://www.cis.upenn.edu/~partha/">Partha Talukdar</a> (U Penn)
<li><a href="http://www.cs.cmu.edu/~acarlson/">Andy Carlson</a> (CMU)
<li><a href="http://www.cs.cmu.edu/~hyifen/">Yifen Huang</a> (CMU)
<li><a href="http://www.cs.pitt.edu/~swapna/Main.html">Swapna Sundaran</a> (U
Pitt)
<li><a
href="http://www.cs.cmu.edu/~mheilman/">Michael Heilman</a> (CMU)
<li><a
href="http://www.cs.cmu.edu/~jelsas/">Jon Elsas</a> (CMU)
<li><a href="http://www.cs.cmu.edu/~dipanjan/Home.html">Dipanjan Das</a> (CMU)
<li><a href="http://www.cs.cmu.edu/~fanguo/">Fan Guo</a> (CMU)
<li><a href="http://www.andrew.cmu.edu/user/jdiesner/">Jana Diesner</a> (CMU)
<li><a href="http://freddychua.com/">Freddy Chong Tat Chua</a> (Singapore Management University).
<li><a href="https://sites.google.com/site/hoqirong/">Qirong Ho</a> (CMU)
<li>Danai Koutra (CMU)
<li>Reyyan Yeniterzi (CMU)
<li>YiChi Wang (CMU)
<li>Steven Gardiner (CMU)
<li>Jay Pujara (Univ Maryland)
<li>Derry Wijaya (CMU)
<li>Lingjia Deng (Univ of Pittsburgh)
<li>Chenyan Xiong (CMU)
<li>Tiancheng Zhao (CMU)
<li>Pradeep Dasigi (CMU)
<li>Shashank Srivastava (CMU)
<li>Abulhair Saparov (CMU)
<li>Danish Pruthi (CMU)
</ul>
I have also been an external committee member for the Master's theses of
<a href="http://www.cs.cmu.edu/~mehrbod/">Mehrbod Sharifi</a> (CMU) and
Weam Abu-Zaki (CMU).
<p>
I am currently a PhD committee member for Vidhisha Balachandran,
Zhengbao Jiang, Luyu Gao, and Sanket Vaibhav Mehta.
<!-- Other: -->
<h3 class="sec"><a name="misc">Other Stuff</a></h3 class="sec">
<p>For those many friends whose research I have built on, be warned.
My full name, "William Weston Cohen", is an anagram of the phrase "I
now cite shallow men". (From <a
href="http://iew3.technion.ac.il/~sarac/">Sara Cohen</a> - no
relation! - comes this warning: "Women's rights activists would
probably request you to use the following anagram instead: 'I shall
now cite women'".)
<p>Through my advisor, Alex Borgida, I can trace
my <a href="lineage.html">"academic lineage"</a> back to luminaries
like Leibniz, Newton and Alfred Whitehead. <b>Update:</b> My former
student <a href="https://andrewoarnold.com">Andrew Arnold</a> has gone back
even <a href="https://andrewoarnold.com/genealogy.png">further</a>, to
Galileo!
<p>In 2014 I unearthed a strange relic from the past, a sort of
game/website I wrote for my
son <a href="https://charleskcohen.com">Charlie</a> back in...I'm
gonna say, 1994, 1995, something like that, and I sort of made it work
again, although JavaScript has changed a bit in the last couple of
decades. (The main bugs have to do with sound-file presentation - in
1994 these were played by mime-file configured helper programs, not
natively by the browser, so now you need to hit 'back' about 1/2 the
time after a sound plays.) <a href="dict/stuff">Historically
interesting? You decide!</a>
<p><a href="hp.html">Poetry anyone?</a>
<hr>
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-2090677-1";
urchinTracker();
</script>
</BODY>
</HTML>