JavaNLP Quick Guide for Using CVS

JavaNLP now uses Subversion, but these instructions remain for certain other CVS repositories on the NLP machines.

Setting up your computer to talk to CVS

NLP machines

Add the following line to your .cshrc, .tcshrc, or .login (wherever you tend to put this stuff--look at where similar commands go if you're not sure):

setenv CVSROOT /u/nlp/cvsroot

Now log out and log back in again, or type source filename where filename is the file you just edited.

Remote UNIX machines

Add the following lines to your .cshrc, .tcshrc, or .login (wherever you tend to put this stuff--look at where similar commands go if you're not sure):

setenv CVSROOT username@jamie.stanford.edu:/u/nlp/cvsroot
setenv CVS_RSH ssh

Now log out and log back in again, or type source filename where filename is the file you just edited.
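
If the login shell on your remote machine is sh or bash rather than csh/tcsh, the same two settings can go in a file such as ~/.profile or ~/.bashrc. This is a minimal sketch under that assumption; NLP_USER is a hypothetical helper variable that stands in for your username on the NLP machines.

```shell
# Assumption: replace "username" with your login on the NLP machines.
NLP_USER="username"

# Same values as the setenv lines above, in Bourne-shell syntax.
export CVSROOT="${NLP_USER}@jamie.stanford.edu:/u/nlp/cvsroot"
export CVS_RSH="ssh"
```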

If you don't want to type your password in every time you execute a cvs command, you should generate a private/public key pair and put the public key on the nlp machines. Note that the Solaris machines want an RSA1 public key, whereas the Linux machines want an RSA2 public key. You can generate both of them by doing as follows:

On your remote machine, type the following commands:

mkdir -p ~/.ssh
chmod 700 ~/.ssh
cd ~/.ssh
ssh-keygen -C "username@nlp" -f identity -t rsa1
ssh-keygen -C "username@nlp" -f id_rsa -t rsa

After each ssh-keygen command, you'll be prompted for a passphrase; enter one of your choosing or just hit return. It's convenient to use the same passphrase for both the RSA1 and RSA2 keys, because you can then activate both keys at once for a session with ssh-agent; see below for details. (A blank passphrase is officially not recommended, and below we describe how to set things up so that you only need to enter the passphrase once per session. Nevertheless, some people find it much easier to go with an empty passphrase, and the result isn't that insecure, particularly on a personal computer that you physically control.)

These commands generate two key pairs: the RSA1 private key identity with its public key identity.pub, and the RSA2 private key id_rsa with its public key id_rsa.pub. Now you have to put the public keys on the NLP machines, so type the following commands:

scp identity.pub id_rsa.pub username@jamie.stanford.edu:
ssh -l username jamie.stanford.edu
mkdir -p ~/.ssh
touch ~/.ssh/authorized_keys
cat identity.pub id_rsa.pub >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh ~/.ssh/*
rm identity.pub id_rsa.pub
logout

Now try connecting to nlp again by typing ssh -l username jamie.stanford.edu. If it worked, you shouldn't have to type in a password.

If you entered a non-empty passphrase and don't want to type it in every time you use CVS, you can use ssh-agent and ssh-add to type it only once per session. Invoke ssh-agent [command] to start your session (an X session or an interactive shell, for example). Then invoke ssh-add, which prompts for the passphrase of each of your default keys (~/.ssh/identity and ~/.ssh/id_rsa); if the keys share a passphrase, you only have to type it once. After this, everything that runs in the session under ssh-agent has access to those keys. For example:

ssh-agent /usr/local/bin/tcsh
ssh-add

Presto! You can now use CVS (and ssh in general) for this session to your heart's content, without the bother of typing in a passphrase or password at every command.
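
If you'd rather not start a whole new shell under ssh-agent, sh and bash users can load the agent into the current shell instead. The sketch below assumes a Bourne-style shell (ssh-agent -s prints Bourne-shell commands; csh and tcsh users would need ssh-agent -c). The function name load_keys is hypothetical.

```shell
# Start an agent for this shell if none is available, then load the default keys.
load_keys() {
    # SSH_AUTH_SOCK is set by ssh-agent; if it is empty, no agent is known to this shell.
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        # ssh-agent -s prints export commands; eval applies them to the current shell.
        eval "$(ssh-agent -s)" >/dev/null
    fi
    # With no arguments, ssh-add prompts for the passphrases of the default key files.
    ssh-add
}
```

Calling load_keys once at the start of a session should then let cvs and ssh commands in that shell run without further passphrase prompts.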

Windows machines

You should be able to get things to work with any Windows client that speaks standard CVS, provided it can use ssh as the transport protocol. The most common way to use CVS on Windows is to download WinCVS, a CVS GUI, and to have it use Cygwin SSH with a private/public key pair so you don't have to type in your password. Step-by-step directions for WinCVS appear below.

Other Windows clients should work too; people have used various tools, including TortoiseCVS and the CVS support inside IntelliJ IDEA. TortoiseCVS is another good CVS client for Windows: it comes with ssh support out of the box and integrates into Windows Explorer, so you don't need to open a separate application to use CVS. The download contains puttygen.exe for generating a private/public key pair, and TortoisePLink.exe for setting up an SSH connection on the command line. (You will still need to download Pageant to manage your keys.)

Instructions for WinCVS

These instructions are adapted from a page once at the address http://ghettobox.dhs.org/~andross/cvs-ssh-win32.html.

  1. Download SSH1 Binaries for Windows (115 KB).
  2. Extract ssh-1_2_14-win32bin.zip to C:\SSH.
  3. Set your HOME and PATH environment variables:
    In Windows NT/2000/XP, right-click My Computer and select Properties. Click the Advanced tab, then click Environment Variables. Under System variables, change Path by adding C:\SSH; at the beginning. Next, under User variables, add the variable HOME and set its value to C:\SSH.
    In Windows 95/98/ME, add the following lines to C:\autoexec.bat:

    SET HOME=C:\SSH
    SET PATH=C:\SSH;%PATH%

    You may have to log out or restart for these changes to take effect.
  4. Test out the ssh install by opening a command line prompt (e.g., Start Menu, Run..., enter cmd) and typing ssh -l username jamie.stanford.edu. If it works, type logout to return to the command line prompt. If it doesn't work, make sure you set the environment variables correctly and the files did indeed get extracted into C:\SSH. If all else fails, restart. :)
  5. Generate a private/public keypair so you won't have to type in your password all the time. From the windows command line prompt, type the following commands:

    cd C:\SSH\.ssh
    ssh-keygen -C "username@nlp" -f identity -t rsa1
    ssh-keygen -C "username@nlp" -f id_rsa -t rsa

    When prompted for a passphrase, just hit enter (twice) to keep it blank. These commands generate two key pairs: the private key identity with its public key identity.pub, and the private key id_rsa with its public key id_rsa.pub. Now you have to put the public keys on the NLP machines, so type the following commands:

    scp identity.pub id_rsa.pub username@jamie.stanford.edu:
    ssh -l username jamie.stanford.edu
    mkdir -p ~/.ssh
    touch ~/.ssh/authorized_keys
    cat identity.pub id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 700 ~/.ssh ~/.ssh/*
    rm identity.pub id_rsa.pub
    logout

    Now try connecting to nlp again by typing ssh -l username jamie.stanford.edu. If it worked, you shouldn't have to type in a password.

  6. Download WinCVS. It is simplest to use this local copy of WinCVS v1.2.0 (3,618 KB) or you can go to the page to download the current version.
  7. Extract WinCvs120.zip to a temporary directory, and then run Setup.exe.
  8. Configure WinCVS to use SSH. Start WinCVS, open Admin|Preferences, and under the General tab set CVSROOT to username@jamie.stanford.edu:/u/nlp/cvsroot. Set authentication to SSH server. Under the WinCVS tab, set the HOME folder to C:\SSH.
  9. Test the configuration by trying to check out the javanlp module. Go to Create|Checkout Module..., enter javanlp for the module name, and specify the folder where you want the javanlp files to live on your computer. If it works, you should see a bunch of checkout lines in the bottom frame of the window, and all the files should appear on your local computer.

Using CVS (UNIX)

Checking out JavaNLP files for the first time on a machine

Once you've set up your environment variables, you need to get an initial copy of the javanlp files on your computer. To do this, go to the directory where you want the files to live and type cvs checkout javanlp. You should see a lot of lines being printed for the files you're getting, and you should end up with a javanlp folder. You only have to do this the first time you get the files; from then on you'll just update and commit (see below).

Updating your files to get the latest copy of everything

Once you've checked out the files, all you need to do to refresh them to the latest version is to type cvs update -Pd.  This will update all the files in your current directory and below. NOTE: The -Pd option is very important--it makes sure that new directories get added and old directories get removed when you update. You will see a U in front of files that other people have modified since your last update, an M in front of files that you've modified but not yet committed, and a ? by files that you've created locally but never checked in. It's important to update regularly so that you're always using the latest official version of files, and so that you don't end up modifying a file that someone else has also modified.
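
The status letters lend themselves to a quick summary after a large update. The following is a rough sketch, not part of the JavaNLP tools: summarize_update is a hypothetical helper that reads cvs update output on standard input and counts the letters described above. It also counts P lines, which CVS prints when it brings a file up to date by sending a patch, together with U.

```shell
# Count status letters in "cvs update" output read from standard input.
summarize_update() {
    awk '
        $1 == "U" || $1 == "P" { updated++ }    # changed by others since your last update
        $1 == "M"              { modified++ }   # changed locally, not yet committed
        $1 == "?"              { unknown++ }    # present locally, never added to CVS
        END { printf "updated=%d modified=%d unknown=%d\n", updated, modified, unknown }
    '
}
```

A typical call would be cvs -q update -Pd | summarize_update, where the global -q option suppresses the per-directory progress messages.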

Committing your changes so other people receive them

When you've made some changes and tested that they work correctly, commit the changes to the repository by typing cvs commit. CVS will go through the current directory on down and commit any files you've modified. It will prompt you to enter a little log message describing the change. Once you've committed, when other people do an update, they'll receive your changes. You should commit fairly regularly, especially when fixing minor issues, but you should never commit until your code compiles and is stable and usable.

Adding new files to the repository

If you create a new file, you have to explicitly add it to the repository by typing cvs add filename. When you do an update, you should see an A in front of the file (before the add it would show a ?). When you next commit, the file will be added permanently.

Adding new packages to the repository

You should put in your .cshrc or equivalent:

setenv CVSUMASK 002

Otherwise, new directories will not be created group-writeable. But even if your umask and CVSUMASK are both correctly set to 002, we have still sometimes had problems where permissions aren't set correctly for new packages added to the repository from remote machines. Consequently, when you add a new package to the repository, it's always a good idea to make sure that the directory is group-writeable in the repository. Just track down the directory under /u/nlp/cvsroot on any of the NLP machines and, if necessary, type chmod 775 directory_name.
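
To find directories that need the chmod without inspecting them one by one, something like the sketch below may help. It assumes a POSIX find; list_not_group_writable is a hypothetical function name, and the directory you pass it would be your package's directory under /u/nlp/cvsroot.

```shell
# Print every directory at or below $1 that lacks the group-write bit.
list_not_group_writable() {
    # -perm -020 matches entries with group write set; "!" inverts the match.
    find "$1" -type d ! -perm -020
}
```

Each directory it prints is a candidate for chmod 775.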

Adding new libraries to the repository

If you add code that requires any new external libraries, then you must add those libraries to CVS in the lib directory. IMPORTANT: You must tell CVS when you add binary files such as jar files, or they will not work across platforms such as Windows and Unix, or for anyone at all if they happen to contain certain keyword substitution strings. You do this by specifying the -kb flag to cvs add. For more information, and for instructions on how to fix things if you forgot to specify -kb at first, see the CVS Manual.
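
One way to avoid forgetting -kb is to choose the flag from the file extension. The sketch below only prints the cvs add command it would use rather than running it; the function name cvs_add_command and the list of extensions treated as binary are assumptions, not a project convention.

```shell
# Print the "cvs add" command appropriate for a file, using -kb for binary types.
cvs_add_command() {
    case "$1" in
        # Assumed list of binary extensions; extend it to suit your files.
        *.jar|*.zip|*.gz|*.class) echo "cvs add -kb $1" ;;
        *)                        echo "cvs add $1" ;;
    esac
}
```

For example, cvs_add_command lib/example.jar (a hypothetical file) prints a command with -kb, while a .java source file gets a plain cvs add.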

Removing old files from the repository

If you have old files you want to remove, it's not enough to just rm them, because cvs will replace them the next time you update. To permanently remove files, first rm them, then type cvs remove filename(s). Next time you commit, the files will be deleted from the repository. Note that if you want to remove all the files in a directory, you can also just delete them all and then type cvs remove dir. You never rm the directory yourself; CVS does it for you.

Tracking changes to a file

To determine who was responsible for changes in a particular file (e.g. in case of broken builds), type cvs annotate file. CVS will list the user and revision number for each line in the file.

Merging different versions of files

The merge command is cvs update -j. To merge in the differences between version x and your currently checked-out file, do:

cvs update -j x filename

To merge the differences between version x and version y into your currently checked-out file, do:

cvs update -j x -j y filename

If you give the version numbers in reverse order, the changes are applied in reverse. This provides the right way to undo changes in CVS. To back out the changes between version x and version y, do:

cvs update -j y -j x filename

Creating and maintaining branches

Presentation on Branching and Merging

In the normal course of development, changes are checked in serially. The current state of a file represents all the changes that have been made up to this point. Branching creates an alternate timeline of changes, starting from some common point in the past. Branching can shield you from other people's checkins when you're getting close to a deadline.

To branch the JavaNLP tree, change to the top-level directory of your checked-out tree (the directory containing src, bin, etc.). Execute the command:

This command creates the branch in the repository, but to use that branch you have to check it out. To do this, specify the branch name with the -r option to cvs checkout or cvs update:

Checking out on the branch places a sticky tag on your local copy of the file: it remembers that it is checked out on that branch. As long as the sticky tag is in place, all checkins go onto the branch. You can see what sticky tags you have on a file with:

To check out on the trunk of the tree, you need to clear the branch sticky tag with:

The merge command can be used to merge changes between the branch and the trunk. HEAD is an alias for the latest version (the tip) of the file on the trunk. So, to pull in all the changes between your current version and the tip of the trunk, do:
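
The branch steps described above can be sketched as a few small shell functions. Treat this as an illustration under stated assumptions rather than the project's own recipe: the function names are hypothetical, cvs tag -b is assumed to be the branch-creation command, and by default the functions only print the cvs commands they would run (set CVS=cvs to execute them).

```shell
# Dry run by default: print each cvs command instead of executing it.
# Assumption: set CVS=cvs in the environment to run the commands for real.
CVS="${CVS:-echo cvs}"

# Create a branch from the checked-out tree, then move the working copy onto it.
start_branch() {
    $CVS tag -b "$1"       # create the branch in the repository
    $CVS update -r "$1"    # check out on the branch (places the sticky tag)
}

# Show the status of a file, including any sticky tag.
show_sticky() {
    $CVS status "$1"
}

# Merge the changes up to the tip of the trunk into the working copy.
merge_from_trunk() {
    $CVS update -j HEAD
}

# Clear sticky tags so the working copy follows the trunk again.
back_to_trunk() {
    $CVS update -A
}
```

Run from the top-level directory of the checked-out tree, start_branch my-branch (a hypothetical branch name) would print the two commands for creating and switching to the branch.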

A few more hints for using branches:

For more information

CVS man page

Type man cvs to get more details on the commands, or consult the online version at http://www.cvshome.org/docs/ref.html.

CVS manual online

The official cvs home page (http://www.cvshome.org) contains extensive documentation, accessible at http://www.cvshome.org/docs/.

Supplemental Notes on Getting Started with JavaNLP

After getting set up with CVS, you may have to follow these additional steps before you can effectively develop for JavaNLP:

  1. Make sure version 1.5 of javac is in your path. Since it's in different places on different machines, the Makefile just refers to javac without a path, so you need it in your path to compile the JavaNLP files. If you're on one of the Solaris NLP machines, it should be in /u/nlp/packages/java/. It's also now in /usr/pubsw/bin for AFS machines. If you're not sure whether javac is in your path, try typing javac at the command prompt. Either you'll get javac: Command not found. (bad) or you'll get a printout of the available options (good). If you don't have javac in your path, ask your sysadmin where it's located (and make sure it's version 1.5). Then edit your path variable (probably in your .cshrc or some similar file) to include it.
  2. Make sure you have the JavaNLP external libs. JavaNLP relies on several third-party libraries, and you need to make sure you have a copy on your machine and in your classpath before you can build. If you're on the NLP machines (or the db machines, I believe), the libs are in /u/nlp/java/lib, and if you just type source bin/setup.csh in your main checked-out javanlp/ dir, it should take care of putting them all in your classpath. If you're on a remote machine, you need your own copy. I've placed a tarred, gzipped version at /u/nlp/java/javanlp-lib.tar.gz, which you can copy and extract to your machine. You must then add the extracted dir and all the .jar files inside it to your CLASSPATH (this is an environment variable you can set in your .cshrc or similar file).
  3. Make sure you do cvs update -Pd when you update. The -P option gets rid of old (removed) directories, and the -d creates new directories that have been added to the repository since you last updated. These flags are vital to having an updated and working repository so make sure you always use them. You'd think they'd be standard options, but for whatever reason, they're not.
  4. Use ant to compile. On the NLP (and db) machines, we've installed ant, a great Java build tool that's smart about only compiling the classes that have changed since the last build. You should use ant to compile unless you have some special circumstance. To compile things normally, cd into your checked out javanlp dir, type source bin/setup.csh if you haven't already done so, then type ant to compile. (It's in /u/nlp/bin.) If ant runs out of memory during compilation, you can increase the amount of memory that ant gives to the compiler by setting the environment variable ANT_OPTS to "-Xmx200m".
  5. Desperately need an old version of something. The following recipe may help. Decide what package you need to downdate, and to what date, and do the following:
    cd package
    cvs update -D 'date'
    Note that that package will thenceforth be 'sticky' -- i.e., locked at that old date and never updated -- until you release it with:
    cd package
    cvs update -A

    However, we often find it easier to just look at the old version using cvsweb, and then use copy-and-paste from a browser window.
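
For a remote machine where bin/setup.csh does not apply, steps 1, 2, and 4 above can be sketched in sh/bash roughly as follows. The function names are hypothetical, and the directory passed to build_classpath is whatever location you extracted javanlp-lib.tar.gz into.

```shell
# Step 1: succeed only if javac can be found on the PATH.
have_javac() {
    command -v javac >/dev/null 2>&1
}

# Step 2: build a classpath from the extracted lib dir and every .jar inside it.
build_classpath() {
    libdir="$1"
    classpath="$libdir"
    for jar in "$libdir"/*.jar; do
        [ -e "$jar" ] || continue   # pattern did not match: no jars present
        classpath="$classpath:$jar"
    done
    echo "$classpath"
}

# Step 4: give the compiler more memory when run through ant.
export ANT_OPTS="-Xmx200m"
```

You might then run have_javac || echo "javac is not in your path" and set CLASSPATH from the output of build_classpath before invoking ant.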

Send e-mail to Huy Nguyen (htnguyen@cs.stanford.edu) if this is confusing or you need help.

[ JavaNLP Project Home Page ]


Contact: Huy Nguyen (htnguyen@cs.stanford.edu)

Contact us!

If you have problems with anything above, or have general questions or suggestions, and you can't find the answer in the man page or online, please email me at htnguyen@cs.stanford.edu.

Last Modified: 10/16/2005