jasonbaldridge edited this page Apr 26, 2013 · 2 revisions

Course project, Phase 5

DUE: May 6, 2013, 1pm CST

Preamble

This project phase ideally involves building on the work you did for phase four. This is your chance to extend that work and possibly improve the way you present it. I have shared all the reports from phase four with the class, which should give you a sense of how you can (a) improve your own write-ups and (b) take advantage of work that others have done. As before, I encourage you to collaborate as a class to do something better than you can in your individual groups.

Submitting your solutions

Write up what you did as a file named <lastname>_<firstname>_p5.pdf. Submit this on Blackboard.

Your code submission will be contained in your project code repository (which will not necessarily be a fork of tshrdlu). You should tag the submission version of your repository as "PP5".

What to do

Basically, develop your work further, incorporating feedback from phase four. I'm of course happy to discuss any of these ideas or your own further, either in person or over email.

Things I'd like to see if you are doing a bot:

  • It would be very interesting to have actions/responses that are based on the previous discussion and not just the current utterance.
  • Interactive classification in which training data is supplied as tweets to a classifier (or classifiers) in the bot. E.g. the bot retweets a message about the wrong "scala", and you respond with a message like "bad", which helps it retrain a model of which scala is of interest to you.
  • Do actions/responses based on the context of the social network of the user who is communicating with the bot.
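
To make the interactive classification idea concrete, here is a minimal sketch, assuming a two-way distinction between "scala" the programming language and "scala" the opera house. Everything in it is hypothetical: FeedbackClassifier, FeedbackDemo, handleReply and the labels "language" and "opera" are illustrative names, not part of tshrdlu or any Twitter library, and the model is a plain add-one smoothed Naive Bayes chosen only because it can be updated one tweet at a time.

```scala
import scala.collection.mutable

/** A tiny Naive Bayes text classifier that is updated one example at a time.
  * Hypothetical sketch; not part of tshrdlu.
  */
class FeedbackClassifier(labels: Seq[String]) {
  // Word counts per label, document counts per label, and the shared vocabulary.
  private val wordCounts =
    labels.map(l => l -> mutable.Map.empty[String, Int].withDefaultValue(0)).toMap
  private val docCounts = mutable.Map.empty[String, Int].withDefaultValue(0)
  private val vocab = mutable.Set.empty[String]

  private def tokenize(text: String): Seq[String] =
    text.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  /** Adds one labeled tweet to the counts; the label must be one of `labels`. */
  def update(text: String, label: String): Unit = {
    docCounts(label) += 1
    for (w <- tokenize(text)) {
      wordCounts(label)(w) += 1
      vocab += w
    }
  }

  /** Returns the label with the highest add-one smoothed log probability. */
  def classify(text: String): String = {
    val totalDocs = docCounts.values.sum
    val tokens = tokenize(text)
    labels.maxBy { l =>
      val total = wordCounts(l).values.sum
      val prior = math.log((docCounts(l) + 1.0) / (totalDocs + labels.size))
      val likelihood = tokens.map { w =>
        math.log((wordCounts(l)(w) + 1.0) / (total + vocab.size + 1.0))
      }.sum
      prior + likelihood
    }
  }
}

object FeedbackDemo {
  /** Turns a user's reply into a training example for the tweet the bot acted on.
    * Assumption: a reply of "bad" means the predicted label was wrong; any
    * other reply is treated as confirmation of the prediction.
    */
  def handleReply(model: FeedbackClassifier, tweet: String,
                  predicted: String, reply: String): Unit = {
    val corrected = reply.trim.toLowerCase match {
      case "bad" => if (predicted == "language") "opera" else "language"
      case _     => predicted
    }
    model.update(tweet, corrected)
  }
}
```

A bot built this way would call classify before deciding whether to retweet, and call handleReply whenever the user answers one of its retweets. With more than two senses, "bad" alone no longer identifies the right label, so the reply format would need to carry more information.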

Other things I'd be excited to see:

  • Network analysis and visualization, possibly doing influencer analysis (though this should all have some text analysis component to it and not just be based on follower relations).
  • Semi-supervised and/or active learning of classifiers.
  • Anything multilingual, especially if it requires dealing with languages that are quite different from English (e.g. much more complex morphology, freer word order, etc).
  • Construction and use of complex text processing pipelines, e.g. using UIMA or some other solution.

I'm even more excited if you do something cool that I haven't mentioned.

Recall: this is still a project phase, so try to do something cool---and if you fall a bit short of making it all come together, it's okay because it is a project phase, not the final project. Basically, this will set you up to get your final project in great shape while being feasible.

Requirements

  • Length and formatting. Your report must be written using the ACL System Demonstrations style files (you are free to use LaTeX or Word). It should be 3-4 pages long. (Your final submission will be in the same style, but conforming to the full submission instructions described at the above link.)
  • References. You should have at least five (more is fine), at least three of which do not come from UT Austin. These should be academic papers of relevance to what you have done. If you are citing blogs, manuals, news articles, etc, those should just be footnotes and don't count as references.
  • Documentation in code. All of the major methods, classes, traits, etc. should be documented using Scaladoc style. Minimally, they should describe the relevant piece of code, and it would be nice if the inputs and outputs are also described using flags like @param.
  • Evaluation plan. You should discuss how you can evaluate your work empirically. E.g. you could train a classifier via messages to your bot, and then evaluate its performance on some held-out test set. Or, you could do small user studies in which you ask people to talk to your bot for at least 10-15 turns, and then rate the interaction. For the final submission, you'll need to actually do the evaluation.
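
As an illustration of the documentation and evaluation requirements together, the sketch below shows Scaladoc comments with @param and @return tags on two small helpers for held-out evaluation. The Evaluation object and its accuracy and split methods are hypothetical names made up for this example; your own code will look different.

```scala
/** Utilities for evaluating a text classifier on held-out data.
  *
  * Hypothetical example; the names are illustrative only.
  */
object Evaluation {

  /** Computes the accuracy of a classifier on a held-out test set.
    *
    * @param classify function mapping a text to a predicted label
    * @param heldOut  pairs of (text, gold label) that were not used for training
    * @return the fraction of held-out items labeled correctly,
    *         or 0.0 if `heldOut` is empty
    */
  def accuracy(classify: String => String, heldOut: Seq[(String, String)]): Double =
    if (heldOut.isEmpty) 0.0
    else heldOut.count { case (text, gold) => classify(text) == gold }.toDouble / heldOut.size

  /** Shuffles labeled data and splits it into training and held-out portions.
    *
    * @param data          the labeled examples to split
    * @param trainFraction proportion of the data used for training, between 0 and 1
    * @param seed          random seed, so that the split is reproducible
    * @return a pair of (training set, held-out set)
    */
  def split[A](data: Seq[A], trainFraction: Double, seed: Long): (Seq[A], Seq[A]) = {
    val shuffled = new scala.util.Random(seed).shuffle(data)
    shuffled.splitAt((data.size * trainFraction).toInt)
  }
}
```

Fixing the seed keeps the held-out set the same across runs, which makes numbers reported in the write-up comparable as the system changes.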

Rubric

Your submission will be scored using the following rubric. Qualities of full-point submissions are given below each area. Notice the emphasis on code for this project phase.

It is fine to use text from project phase four for this submission -- at this point, you are basically iteratively refining that document.

  • Coding: 40
    • The code involves non-trivial implementation, including appropriate use of data and algorithms.
    • The code demonstrates thought about program dependencies and flow.
    • The code is organized and documented.
    • IMPORTANT: It is possible for the instructor to clone the repository and run a simple command to interact with your program and/or obtain output from it.
  • Writing: 20
    • The write-up clearly explains what was done.
    • The write-up has examples, including relevant output.
    • The write-up provides analysis of output, as appropriate.
    • The write-up discusses some of the implementation challenges and design choices.
    • The write-up has references to any papers, blog posts, or other resources that were used to complete the work.
    • The write-up is professionally done (organized, free of spelling and grammar errors).
    • IMPORTANT: The write-up ends with 1-2 paragraphs discussing what you will complete by the time you hand in the final project report. (No conclusion necessary.)
  • Creativity: 20
    • The work shows original thought in selection of task, choosing algorithms for solving it, solutions to coding challenges, and analyzing their output.
    • The work combines different ideas from the class in new ways.
  • Overall quality: 20
    • The work as a whole is high quality.

Please look at the write-ups by the others for project phase four to get ideas for how you should structure and present your own.
