Sentences with high readability
The search option "easy sentences only" applies a filter to the sentences retrieved from the corpus. Sentences are filtered according to three criteria:
- Sentence length - only sentences that contain between 5 and 25 words are included.
- Readability score according to the Gulpease Index - only sentences from texts with Gulpease Index between 60 and 120 are included.
- Number of words from non-basic vocabulary - only sentences that contain at most eight words of "advanced" vocabulary are included.
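The three criteria can be read as a single check that a sentence must pass in full. The following is only a minimal sketch of that reading, not the tool's actual implementation: the function name and parameters are hypothetical, and it assumes that the bounds are inclusive and that the Gulpease Index is computed once for the whole source text.

```python
def is_easy_sentence(num_words, text_gulpease, num_advanced_words):
    """Return True if a sentence passes all three "easy sentences" criteria.

    Assumptions (not confirmed by the help page): bounds are inclusive,
    and text_gulpease is the score of the text the sentence comes from.
    """
    length_ok = 5 <= num_words <= 25          # sentence length in words
    readability_ok = 60 <= text_gulpease <= 120  # Gulpease Index of the source text
    vocabulary_ok = num_advanced_words <= 8   # words outside the basic vocabulary
    return length_ok and readability_ok and vocabulary_ok


# A 12-word sentence from a text scoring 72, with 2 advanced words, passes.
print(is_easy_sentence(12, 72.0, 2))
# A 30-word sentence fails on length alone.
print(is_easy_sentence(30, 72.0, 2))
```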
Advanced vocabulary
The non-basic ("advanced") vocabulary includes all words that are not contained in the "basic vocabulary" (about 7000 words) as defined by Tullio De Mauro.
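Counting advanced words then amounts to a set lookup. The sketch below is illustrative only: the function name is hypothetical, the tiny word set stands in for the real list of about 7000 words, and matching on lowercased word forms (rather than, say, lemmas) is an assumption.

```python
def count_advanced_words(tokens, basic_vocabulary):
    """Count tokens that do not appear in the basic vocabulary.

    Assumption: tokens are compared as lowercased word forms; the real
    system may compare lemmas instead.
    """
    return sum(1 for token in tokens if token.lower() not in basic_vocabulary)


# Hypothetical stand-in for the basic vocabulary (the real list has ~7000 words).
basic_vocabulary = {"il", "gatto", "dorme"}

tokens = ["Il", "gatto", "dorme", "placidamente"]
print(count_advanced_words(tokens, basic_vocabulary))  # only "placidamente" is counted
```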
Type-token ratio
The type-token ratio (TTR) is a percentage value that indicates the variety of different words used in a text. The higher the value, the more varied the vocabulary used in the text. The TTR is calculated by dividing the number of different words used in a text (types) by the total number of words in the text (tokens) and multiplying the result by 100.
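As a worked illustration of this calculation, here is a minimal sketch. The function name is hypothetical, and treating word forms case-insensitively is an assumption; the help page does not say how types are normalised.

```python
def type_token_ratio(tokens):
    """Return the type-token ratio of a token list as a percentage."""
    if not tokens:
        return 0.0  # avoid division by zero for an empty text
    types = {token.lower() for token in tokens}  # assumption: case-insensitive types
    return 100 * len(types) / len(tokens)


tokens = "il gatto e il cane".split()
# 4 types (il, gatto, e, cane) over 5 tokens
print(type_token_ratio(tokens))  # 80.0
```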
Readability score "Gulpease Index"
The "Gulpease Index" measures the readability of a text based on the length of its words (measured in number of letters), the number of words, and the length of its sentences.
A higher Gulpease score indicates higher readability of a text, with the following benchmarks defined:
- texts with a score below 80 are taken to be difficult for a 5th grade reading level (primary school level: 6 to 10 age range)
- texts with a score below 60 are taken to be difficult for an 8th grade reading level (junior secondary school level: 11 to 13 age range)
- texts with a score below 40 are taken to be difficult for a 13th grade reading level (secondary school level: 14 to 18 age range)
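The score and the benchmarks above can be sketched as follows. The formula used is the commonly published Gulpease formula, 89 + (300 × sentences − 10 × letters) / words; whether this tool computes it in exactly this way (for example, how it counts letters and sentences) is an assumption, and the function names are hypothetical.

```python
def gulpease_index(num_letters, num_words, num_sentences):
    """Compute the Gulpease Index from raw counts of a text.

    Uses the commonly published formula; the tool's own counting rules
    for letters and sentences are not documented here.
    """
    if num_words == 0:
        raise ValueError("text must contain at least one word")
    return 89 + (300 * num_sentences - 10 * num_letters) / num_words


def difficult_for(score):
    """List the reading levels for which a text with this score is taken to be difficult."""
    levels = []
    if score < 80:
        levels.append("5th grade")
    if score < 60:
        levels.append("8th grade")
    if score < 40:
        levels.append("13th grade")
    return levels


# 100 words, 450 letters, 5 sentences: 89 + (1500 - 4500) / 100
score = gulpease_index(num_letters=450, num_words=100, num_sentences=5)
print(score)                 # 59.0
print(difficult_for(score))  # ['5th grade', '8th grade']
```

Shorter words and shorter sentences both raise the score, which matches the reading that higher values indicate easier texts.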