The Stanford Natural Language Processing Group

About

SUTime is a library for recognizing and normalizing time expressions. That is, it will convert "next wednesday at 3pm" to something like 2016-02-17T15:00 (depending on the assumed current reference time). SUTime is available as part of the Stanford CoreNLP pipeline and can be used to annotate documents with temporal information. It is a deterministic rule-based system designed for extensibility. The rule set that we distribute supports only English, but other people have developed rule sets for other languages, such as Swedish.
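
The dependence on the reference time can be illustrated without SUTime itself. The following is a minimal sketch that uses only java.time to resolve "next wednesday at 3pm" against an assumed reference time. The class and method names are hypothetical, and SUTime's own rules decide what "next" means, which may differ from TemporalAdjusters.next.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.temporal.TemporalAdjusters;

final class ReferenceTimeSketch {

  /** Resolves "next Wednesday at 3pm" relative to the given reference time. */
  static LocalDateTime nextWednesdayAt3pm(LocalDateTime reference) {
    return reference
        // First Wednesday strictly after the reference date.
        .with(TemporalAdjusters.next(DayOfWeek.WEDNESDAY))
        .withHour(15).withMinute(0).withSecond(0).withNano(0);
  }

  public static void main(String[] args) {
    // Assumed reference time: Friday, 12 February 2016, 09:30.
    LocalDateTime reference = LocalDateTime.of(2016, 2, 12, 9, 30);
    System.out.println(nextWednesdayAt3pm(reference)); // prints 2016-02-17T15:00
  }
}
```

With a different reference time, the same expression resolves to a different value, which is why the document date matters when annotating text.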

SUTime was developed using TokensRegex, a generic framework for defining patterns over text and mapping them to semantic objects. An included set of PowerPoint slides and the javadoc for SUTime provide an overview of this package.

SUTime was written by Angel Chang. These programs also rely on classes developed by others as part of the Stanford JavaNLP project.

There is a paper describing SUTime. You're encouraged to cite it if you use SUTime.

Angel X. Chang and Christopher D. Manning. 2012. SUTIME: A Library for Recognizing and Normalizing Time Expressions. 8th International Conference on Language Resources and Evaluation (LREC 2012).

Usage

SUTime annotations are provided automatically with the StanfordCoreNLP pipeline by including the ner annotator. When a time expression is identified, the NamedEntityTagAnnotation is set with one of four temporal types (DATE, TIME, DURATION, and SET) and the NormalizedNamedEntityTagAnnotation is set to the value of the normalized temporal expression. The temporal type and value correspond to the TIMEX3 standard for type and value. (Note the slightly odd and non-specific entity name 'SET', which refers to a set of times, such as a recurring event.) For more details on the annotations, see also the TimeML Annotation Guidelines Version 1.2.1, Guidelines for Temporal Expression Annotation for English for TempEval 2010, and the TIDES 2003 Standard for the Annotation of Temporal Expressions (TIMEX2 v1.3), which is still useful for its detailed discussion, even though it is partially superseded by TIMEX3. TIMEX3 is an extension of ISO 8601, and for the core cases of definite times, you are probably best off starting by reading about ISO 8601.
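
Because fully specified values follow ISO 8601, they can be handled with standard date libraries. The sketch below uses only java.time and is not part of SUTime; the class and method names are hypothetical. It covers only fully specified values, since TIMEX3 extends ISO 8601 and not every TIMEX3 value will parse this way.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.Period;

final class Iso8601ValueSketch {

  /** Adds an ISO 8601 duration value such as "P4D" to an ISO 8601 date value. */
  static String shift(String dateValue, String durationValue) {
    LocalDate date = LocalDate.parse(dateValue);    // DATE-style value, e.g. 2013-07-14
    Period period = Period.parse(durationValue);    // DURATION-style value, e.g. P4D
    return date.plus(period).toString();
  }

  public static void main(String[] args) {
    // TIME-style value: a date plus a time of day.
    LocalDateTime time = LocalDateTime.parse("2016-02-17T15:00");
    System.out.println(time.getHour());             // prints 15

    // "4 days from today" with an assumed document date of 2013-07-14.
    System.out.println(shift("2013-07-14", "P4D")); // prints 2013-07-18
  }
}
```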

SUTime also sets the TimexAnnotation key to an edu.stanford.nlp.time.Timex object, which contains the complete list of TIMEX3 fields for the corresponding expressions, such as "value", "tid", "type", "periodicity", "alt_value". This might be useful to developers interested in recovering complete TIMEX3 expressions. The field "alt_value" is our extension of TIMEX3. It is used when we can't give back a standard TIMEX value, typically for unresolved dates, either because no reference date was given or because the expression was too complicated to resolve. For instance, "today" would give "THIS P1D" if there was no document date to resolve it against. It's like the "logical form" for the time if there is no denotation.
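
The relationship between "value" and "alt_value" can be pictured with a toy example. This is only an illustration of the idea described above, not SUTime's implementation: the class, the method, and the two-element array result are all hypothetical.

```java
import java.time.LocalDate;
import java.util.Optional;

final class AltValueSketch {

  /**
   * Toy resolution of the expression "today".
   * Returns {value, altValue}; in this sketch exactly one entry is non-null.
   */
  static String[] resolveToday(Optional<LocalDate> docDate) {
    if (docDate.isPresent()) {
      // A document date is available, so a standard value can be given.
      return new String[] {docDate.get().toString(), null};
    }
    // No document date: keep the unresolved "logical form" instead.
    return new String[] {null, "THIS P1D"};
  }

  public static void main(String[] args) {
    String[] resolved = resolveToday(Optional.of(LocalDate.of(2013, 7, 14)));
    System.out.println("value=" + resolved[0]);        // prints value=2013-07-14

    String[] unresolved = resolveToday(Optional.empty());
    System.out.println("alt_value=" + unresolved[1]);  // prints alt_value=THIS P1D
  }
}
```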

There is also a stand-alone SUTimeMain class for invoking SUTime. It can read certain temporal text data sets and can annotate text files. It is mainly intended for validating the performance of SUTime.

When writing your own Java code, one way to use SUTime is just to use a CoreNLP pipeline. But you can also quite easily make your own custom annotation pipeline, if you only need the output of SUTime. Below is a complete example.

import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.time.*;
import edu.stanford.nlp.util.CoreMap;

public class SUTimeDemo {

  /** Example usage:
   *  java SUTimeDemo "Three interesting dates are 18 Feb 1997, the 20th of july and 4 days from today."
   *
   *  @param args Strings to interpret
   */
  public static void main(String[] args) {
    Properties props = new Properties();
    // Minimal pipeline: tokenizer, sentence splitter and POS tagger, followed by SUTime.
    AnnotationPipeline pipeline = new AnnotationPipeline();
    pipeline.addAnnotator(new TokenizerAnnotator(false));
    pipeline.addAnnotator(new WordsToSentencesAnnotator(false));
    pipeline.addAnnotator(new POSTaggerAnnotator(false));
    pipeline.addAnnotator(new TimeAnnotator("sutime", props));

    for (String text : args) {
      Annotation annotation = new Annotation(text);
      // Reference date against which relative expressions such as "today" are resolved.
      annotation.set(CoreAnnotations.DocDateAnnotation.class, "2013-07-14");
      pipeline.annotate(annotation);
      System.out.println(annotation.get(CoreAnnotations.TextAnnotation.class));
      // One CoreMap per recognized time expression.
      List<CoreMap> timexAnnsAll = annotation.get(TimeAnnotations.TimexAnnotations.class);
      for (CoreMap cm : timexAnnsAll) {
        List<CoreLabel> tokens = cm.get(CoreAnnotations.TokensAnnotation.class);
        System.out.println(cm + " [from char offset " +
            tokens.get(0).get(CoreAnnotations.CharacterOffsetBeginAnnotation.class) +
            " to " + tokens.get(tokens.size() - 1).get(CoreAnnotations.CharacterOffsetEndAnnotation.class) + ']' +
            " --> " + cm.get(TimeExpression.Annotation.class).getTemporal());
      }
      System.out.println("--");
    }
  }

}

SUTime Rules

To extend SUTime rules, you can configure SUTime to use rules specified in files:

Create a rules file (see SequenceMatchRules for the format of the rules file).
Example: Sample English rules for SUTime are included in the distribution (sutime/defs.sutime.txt, sutime/english.sutime.txt)

Configure the rules to be used by SUTime:

sutime.rules = [path to rules file]

Example:

sutime.rules = sutime/defs.sutime.txt, sutime/english.sutime.txt
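
When SUTime is created from Java code rather than from a properties file, the same setting can be placed in a java.util.Properties object. The sketch below assumes the annotator is created under the name "sutime", so that it reads keys prefixed with "sutime."; the class and helper names are hypothetical.

```java
import java.util.Properties;

final class SUTimeRulesConfigSketch {

  /** Builds properties that point SUTime at the given rule files. */
  static Properties rulesProperties(String... ruleFiles) {
    Properties props = new Properties();
    // Several rule files are given as one comma-separated value.
    props.setProperty("sutime.rules", String.join(", ", ruleFiles));
    return props;
  }

  public static void main(String[] args) {
    Properties props =
        rulesProperties("sutime/defs.sutime.txt", "sutime/english.sutime.txt");
    // prints sutime/defs.sutime.txt, sutime/english.sutime.txt
    System.out.println(props.getProperty("sutime.rules"));
    // The properties could then be passed on, for example to
    // new TimeAnnotator("sutime", props) as in the SUTimeDemo class.
  }
}
```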

SUTime Annotator

To get annotations on a phrase level instead of on the token level, a separate TimeAnnotator is provided. To add a TimeAnnotator that uses rules to the pipeline:

Create a rules file (see SequenceMatchRules for the format of the rules file).
Example: Sample English rules for SUTime are included in the distribution (sutime/defs.sutime.txt, sutime/english.sutime.txt)

Configure the TimeAnnotator

customAnnotatorClass.[name]=edu.stanford.nlp.time.TimeAnnotator
[name].rules = [path to rules file]

Example:

customAnnotatorClass.sutime=edu.stanford.nlp.time.TimeAnnotator
sutime.rules = sutime/defs.sutime.txt, sutime/english.sutime.txt

Add the annotator to the pipeline
Example: java -cp stanford-corenlp-2012-05-22.jar:stanford-corenlp-2012-05-22-models.jar:xom.jar:joda-time.jar -Xmx3g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,sutime -properties sutime.properties -file input.txt
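
Before running the command, it can help to check that the properties file parses into the keys you expect. The following sketch uses only java.util.Properties; the file content is copied from the configuration example above, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class SUTimePropertiesFileSketch {

  // Content of a sutime.properties file, as in the configuration example.
  static final String FILE_CONTENT =
      "customAnnotatorClass.sutime=edu.stanford.nlp.time.TimeAnnotator\n"
      + "sutime.rules = sutime/defs.sutime.txt, sutime/english.sutime.txt\n";

  /** Parses properties-file text into a Properties object. */
  static Properties load(String content) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(content));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props;
  }

  public static void main(String[] args) {
    Properties props = load(FILE_CONTENT);
    // Whitespace around '=' is not part of the key or the value.
    System.out.println(props.getProperty("customAnnotatorClass.sutime"));
    System.out.println(props.getProperty("sutime.rules"));
  }
}
```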

Using SUTime to annotate a file with TIMEX3 tags

To annotate a text file with TIMEX3 tags:

Example:


java -Dpos.model=edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger -cp stanford-corenlp-2012-07-09.jar:stanford-corenlp-2012-07-09-models.jar:xom.jar:joda-time.jar -Xmx3g edu.stanford.nlp.time.SUTimeMain -in.type TEXTFILE -date <YYYY-MM-dd> -i <input.txt> -o <output file>
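
For orientation, a TIMEX3 tag wraps the matched phrase in an element carrying fields such as "tid", "type" and "value". The sketch below builds such a tag by hand. It is an illustration only: the class and method names are hypothetical, and the attribute order and exact output of SUTimeMain are assumptions.

```java
final class Timex3TagSketch {

  /** Wraps a matched phrase in a TIMEX3 element with tid, type and value attributes. */
  static String tag(String tid, String type, String value, String text) {
    return "<TIMEX3 tid=\"" + escape(tid) + "\" type=\"" + escape(type)
        + "\" value=\"" + escape(value) + "\">" + escape(text) + "</TIMEX3>";
  }

  /** Minimal XML escaping for attribute values and element text. */
  static String escape(String s) {
    return s.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace("\"", "&quot;");
  }

  public static void main(String[] args) {
    // prints <TIMEX3 tid="t1" type="DATE" value="1997-02-18">18 Feb 1997</TIMEX3>
    System.out.println(tag("t1", "DATE", "1997-02-18", "18 Feb 1997"));
  }
}
```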

Extensions

Python: sutime, a Python wrapper by Frank Blechschmidt (PyPI, GitHub).
Swedish: sutime-swedish, a SUTime configuration file for Swedish by Andreas Klintberg.

Mailing Lists

We have three mailing lists for SUTime, all of which are shared with other JavaNLP tools (with the exception of the parser). Each address is at @lists.stanford.edu:

java-nlp-user: This is the best list to post to in order to ask questions, make announcements, or for discussion among JavaNLP users. You have to subscribe to be able to use it. Join the list via this webpage or by emailing java-nlp-user-join@lists.stanford.edu. (Leave the subject and message body empty.) You can also look at the list archives.
java-nlp-announce: This list will be used only to announce new versions of Stanford JavaNLP tools. So it will be very low volume (expect 1-3 messages a year). Join the list via this webpage or by emailing java-nlp-announce-join@lists.stanford.edu. (Leave the subject and message body empty.)
java-nlp-support: This list goes only to the software maintainers. It's a good address for licensing questions, etc. For general use and support questions, you're better off joining and using java-nlp-user. You cannot join java-nlp-support, but you can mail questions to java-nlp-support@lists.stanford.edu.

Online Demo

We have an online demo of SUTime.

Release History

Version 1.3.3	2012-07-09	SUTimeMain supports annotation of text files
Version 1.3.2	2012-05-22	SUTime can be configured using rules
Version 1.2.0	2011-09-14	Initial version of SUTime time phrase recognizer added to NER annotator
