We report on ongoing research into the building blocks of spoken language. Strictly decompositional theories of the architecture of the language faculty, in which a computational system (grammar) combines elements from a list of words (lexicon), can hardly account for the fact that actual language use is full of recurrent word combinations of various degrees of idiomaticity and abstractness. Alternative theories, such as Construction Grammar and Construction-based HPSG, are claimed to do better in this respect.
In our talk, we will discuss our methods for isolating interesting recurrent word combinations from the Spoken Dutch Corpus (CGN). We will present first results and reflect on grammatical frameworks for describing them.
Time permitting, we will also touch upon issues of implementation.