JavaNLP meeting notes for 10/16/2002

Thanks everyone for coming to the first (since the previous) JavaNLP meeting yesterday. Sorry it was a bit rocky, but the first one is always the roughest. Here's a summary of what we agreed to:

- Meetings will now be on Thursdays at 11:15 (before NLP lunch) and we'll keep it to 45 minutes to make sure we don't get bogged down in specifics (unless that's on the agenda!). So the next meeting will be 10/24.

- Before each meeting, I'll send out an agenda so everyone knows what to prepare for, and to what extent their input will be required. Each week I hope for a brief round-table update (quicker and higher-level than yesterday) and then we'll focus on one or two areas that we're trying to get consensus on or push forward.

- After each meeting, I'll send out a notes email summarizing what we talked about and what people are supposed to be doing.

- I'll put up a JavaNLP web page that links to the current javadocs, cvs guide, and archived meeting notes. I'll also put up some links to javadoc guides, etc., and please let me know what else you'd like to see there.

- The JavaNLP mailing list is now open, so everyone should sign up if they haven't done so already. We'll monitor all signups so we don't risk having random people on the list (but I doubt that will happen since the list is private).

So now for what everyone said they'd work on this week:

EVERYONE:
- review what code you have in the repository and remove stuff that's not necessary
- javadoc whatever's in there, at least package.html docs but preferably at least a one-liner at the top of each class.

Dan:
- graph package
  - plan to talk to others about API decisions?
- parser package
  - general purpose API and Processor wrapper
  - make sure Tim's code works with it (whenever that appears)

Sep:
- finish converting Processors to use Documents in and out
- wrap up Tim's PTB as a tokenizer
- look at using Chris's io stuff for DataCollection (to traverse files)

Kristina:
- make sure tagger works and can be used as a processor
- make sure some standard trained version is in /obj to be loaded
- make test program that creates a document and tags it
- move maxent.tagger package to just tagger?
  - since ideally there would be multiple implementations and a single interface

Joris:
- work on Tim's ie parsing stuff once it gets put in the repository

Huy:
- ie HMM stuff
  - finish structure overhaul
  - front end for application

Joseph:
- finish overhauling Document class
  - make constructors into init methods in BasicDocument
  - convert other Document subclasses to use new system
  - move miler's tokenizer out of annotation package?
  - make DiscDocument for streaming data instead of holding it in memory
- overhaul URLParser stuff
  - really it's just an HTML tag stripper with some extra stuff
- JavaNLP web page (see above)

OK, I think that's about it for this week. Please try to get started javadocing and looking at your code ASAP so you can get on to the fun stuff and we can make progress next week. Let me know if you have any questions or suggestions about anything.

Thanks,
js