
CQP Search

The CQP Search provides support for queries in the Corpus Query Processor query language (CQP) of the IMS Open Corpus Workbench. The search box serves as a command line for entering search commands.

As in the Simple and Advanced Search modes, results can be restricted to "easy sentences", and dependency diagrams can be displayed for the results. Easy sentences are selected based on predefined readability criteria. Each option is activated by selecting its checkbox. If dependency diagrams are displayed for all hits, the context defaults to one sentence.

Examples of queries in CQP syntax

To search for an unspecified token (i.e. "any word"), use the regular expression ".*". To learn more about how to use regular expressions, see here.
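CQP matches a regular expression against the whole token, which is why ".*" stands for "any word". The following Python sketch imitates that whole-token matching with re.fullmatch. The sample tokens are invented for illustration, and Python's regular expression dialect only approximates the one used by CQP.

```python
import re

def matches_token(pattern: str, token: str) -> bool:
    """Return True if the pattern matches the entire token, as CQP does."""
    return re.fullmatch(pattern, token) is not None

tokens = ["casa", "case", "incasso", "la"]  # invented sample tokens

# ".*" matches every token, whatever its form.
any_word = [t for t in tokens if matches_token(".*", t)]

# "cas.*" matches only tokens that begin with "cas"; "incasso" is excluded
# because the pattern has to cover the token from its first character.
cas_prefix = [t for t in tokens if matches_token("cas.*", t)]
```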

Display settings

The CQP commands "set" and "show" (see CQP tutorial section 2.3) let the user customize the display of search results.
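As a rough illustration, the Python sketch below assembles display commands of the kind described in the CQP tutorial: "show" with +attribute or -attribute switches, and "set Context" with a size and a unit. The helper names show_command and set_context are hypothetical, and whether this interface accepts every such command is an assumption.

```python
def show_command(add=(), remove=()):
    """Build a CQP 'show' command that switches attributes on (+) or off (-)."""
    parts = [f"+{name}" for name in add] + [f"-{name}" for name in remove]
    if not parts:
        return "show;"
    return "show " + " ".join(parts) + ";"

def set_context(size: int, unit: str) -> str:
    """Build a CQP 'set Context' command, e.g. one sentence ('s') of context."""
    return f"set Context {size} {unit};"

display_attributes = show_command(add=["lemma", "pos"])
one_sentence = set_context(1, "s")
```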

Examples:

Search results show the words followed by their attributes, separated by slashes.
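A result token in this slash-separated form can be split back into its parts. The Python sketch below assumes that the attributes appear in the order in which they were switched on. The function name parse_token and the sample values are hypothetical.

```python
def parse_token(rendered: str, attributes: list[str]) -> dict[str, str]:
    """Split 'word/attr1/attr2' into a dict keyed by attribute name."""
    # Split from the right so that a slash inside the word form stays with
    # the word. Attribute values that contain slashes are not handled.
    fields = rendered.rsplit("/", len(attributes))
    if len(fields) != len(attributes) + 1:
        raise ValueError(f"expected {len(attributes)} attributes in {rendered!r}")
    word, *values = fields
    return {"word": word, **dict(zip(attributes, values))}

# Invented sample token with the attributes lemma and pos displayed.
parsed = parse_token("case/casa/S", ["lemma", "pos"])
```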

[Image: example attributes]

The current settings are displayed in an area called "Settings", below the search options.

[Image: example settings]

Word-level attributes

The following word-level attributes are supported:

id: unique identifier of a word within its text
lemma: base form of the word
coapos: coarse part of speech (or also here)
pos: part of speech (or also here)
head: identifier of the head word of the dependency relation (unique within the text)
feats: feature values of the word (e.g. case, gender, number)
deprel: dependency relation of the word
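In CQP, constraints on word-level attributes are written inside square brackets, joined with "&", and an empty pair of brackets matches any token. The Python sketch below builds such token patterns from the attribute names listed above. The helper token_pattern is hypothetical, "word" is assumed to be available as the default CQP attribute for the word form, and the tag values "V" and "S" are invented placeholders rather than values taken from the corpus tagset.

```python
SUPPORTED = {"word", "id", "lemma", "coapos", "pos", "head", "feats", "deprel"}

def token_pattern(**constraints: str) -> str:
    """Build one CQP token pattern such as [lemma="andare" & pos="V"]."""
    unknown = set(constraints) - SUPPORTED
    if unknown:
        raise ValueError(f"unsupported attributes: {sorted(unknown)}")
    if not constraints:
        return "[]"  # matches any single token
    # Values are inserted verbatim; quotes inside values are not escaped.
    conditions = " & ".join(f'{attr}="{value}"' for attr, value in constraints.items())
    return f"[{conditions}]"

# A three-token sequence: a form of "andare", any token, then a token tagged "S".
query = " ".join([
    token_pattern(lemma="andare", pos="V"),
    token_pattern(),
    token_pattern(pos="S"),
])
```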

Structural attributes

Furthermore, searches can be restricted by structural attributes of the text. The following structural attributes are supported:

text_id: unique identifier of the text
text_url: source URL that the text was taken from
text_tok: number of words (tokens) in the text
text_ttr: type-token ratio of the text
text_advvoc: number of words (tokens) in the text that do not belong to the basic vocabulary
text_sent: number of sentences in the text
text_gulpidx: Gulpease readability index ("indice Gulpease") of the text
s_advvoc: number of words (tokens) in the sentence that do not belong to the basic vocabulary
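To give a feel for two of these measures, the Python sketch below computes a type-token ratio and the Gulpease index using the commonly published formula, 89 + (300 × sentences − 10 × letters) / words. How the corpus itself counts tokens, types and letters (for example with respect to case or punctuation) is not stated here, so the counting conventions in this sketch are assumptions.

```python
def type_token_ratio(tokens: list[str]) -> float:
    """Number of distinct word forms divided by the number of tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)

def gulpease(letters: int, words: int, sentences: int) -> float:
    """Gulpease index; higher values indicate text that is easier to read."""
    if words <= 0:
        raise ValueError("words must be positive")
    return 89 + (300 * sentences - 10 * letters) / words

ttr = type_token_ratio(["la", "casa", "la", "strada"])    # 3 types / 4 tokens
index = gulpease(letters=500, words=100, sentences=5)     # 89 + (1500 - 5000) / 100
```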

Example: Searching for...

For details on how to use structural attributes in CQP queries, see CQP tutorial section 4.2.

Named subcorpora

Search results can be stored in named subcorpora by typing "NAME = " in front of the query. Names must start with a capital letter and may contain letters, digits and underscores.
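The naming rule can be checked before a query is submitted. The Python sketch below is one possible reading of the rule, assuming that only unaccented ASCII letters count as letters. The helper names are hypothetical.

```python
import re

# A capital letter followed by any number of letters, digits or underscores.
NAME_RE = re.compile(r"[A-Z][A-Za-z0-9_]*")

def is_valid_subcorpus_name(name: str) -> bool:
    """Check a subcorpus name against the naming rule described above."""
    return NAME_RE.fullmatch(name) is not None

def named_query(name: str, query: str) -> str:
    """Prefix a query with 'NAME = ' so that its results are stored under NAME."""
    if not is_valid_subcorpus_name(name):
        raise ValueError(f"invalid subcorpus name: {name!r}")
    return f"{name} = {query}"

stored = named_query("Casa_1", '[lemma="casa"]')
```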

Example:

Named subcorpora created by the user show up in the corpus dropdown menu and can be used for subsequent queries. The list of subcorpora is available in the Simple, Advanced and CQP Search as well as in the Filter interface. The subcorpus called "Last" always holds the results of the most recent query or filtering operation carried out by the user. Note that only the search match itself, not its surrounding context, is stored in the subcorpus.

Examples:

Subcorpora can also be defined in the Filter interface. For details see here.

All subcorpora expire after 24 hours or once the browser is closed.

Examples of complex queries

We provide a list of precompiled examples of linguistically-motivated queries. These examples include queries for:

Clicking on a query example submits the corresponding search. The CQP search box then displays the query, which can be modified according to the user's needs.

Limitations on CQP search

For security reasons, some general CQP functionality is not available: the count, sort, group, tabulate, dump and reduce commands cannot be used.
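A command line can be checked for these commands before it is submitted. The Python sketch below is a simple client-side check, not a description of how the server enforces the restriction. It assumes that commands are separated by semicolons and does not handle semicolons inside quoted strings. The subcorpus name "Casa" is a hypothetical example.

```python
DISABLED_COMMANDS = {"count", "sort", "group", "tabulate", "dump", "reduce"}

def disabled_commands_in(line: str) -> list[str]:
    """Return the disabled commands that start any command on the line."""
    found = []
    for command in line.split(";"):
        words = command.split()
        if words and words[0] in DISABLED_COMMANDS:
            found.append(words[0])
    return found

allowed = disabled_commands_in('Casa = [lemma="casa"];')
blocked = disabled_commands_in('Casa = [lemma="casa"]; sort Casa by word;')
```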

Need more help? See here for an overview of our help pages.