Each entry is entered in the dictionary's Headword field in citation form, using the Shiraz romanization method. For nonverbal elements the citation form is the uninflected form or stem; for verbal elements it is the infinitive. If an entry has other orthographic variants, they are listed in a separate field called Variants.
Vowels generally known as short vowels (a, e, o) are usually not
written in Persian; only the long vowels (y, u, A) are represented in text.
Therefore, words with different short vowels are input as one entry in
the dictionary. This, of course, creates ambiguity: because the short
vowels are not written, the word krm, for instance, can be pronounced
with different vowel combinations, yielding five possible lexical
elements. A reader uses the context to determine which word is meant
in a sentence.
A great many Persian words are compounds, such as light verbs, compound nouns, and a number of prepositions. They are entered in the dictionary with a space between their constituent elements.
The POS field holds the Part of Speech for the entry. The main Open Class parts of speech in the Shiraz dictionary are Noun, Adjective, Proper Name, Verb, and Light Verb. Light verbs consist of one or more preverbal elements, which may be a noun, adjective, or preposition, followed by a verb that has lost its original meaning; they are categorized as LightVerb in our dictionary. An example is asrar krdn[esrAr kardan], meaning "insist", which consists of the noun asrar "insistence" and the verb krdn "do".
Closed Class items in the dictionary are: Prepositions, Postposition (object marker ra[rA]), Conjunctions, Relativizers, Numerals (numbers and digits), Determiners, Interrogatives, Interjections, Titles, Phrases, Numeratives (classifiers used to form numeral expressions), and Number Units (which refer to numbers such as hzar[hezAr] "thousand", mylyvn[milion] "million"). Pronouns are also among the Closed Class items. There are two kinds: Personal Pronouns, such as mn[man] "I" or av[au] "he/she", and Quantifier Pronouns, such as hmh[hame] "everyone".
The current version of the dictionary also contains certain placeholder POS categories, such as POSNotAvailable, for entries whose part of speech was not clear to the lexicographer; these entries still need to be edited.
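The fields described so far can be pictured as a small record. The following is a minimal sketch only, not the Shiraz dictionary's actual storage format: the class and attribute names (`Entry`, `constituents`, `is_compound`, `needs_editing`) are hypothetical, and only the Open Class categories and POSNotAvailable are listed, with the Closed Class categories omitted for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class POS(Enum):
    """Open Class categories named in the text (Closed Class ones omitted)."""
    NOUN = "Noun"
    ADJECTIVE = "Adjective"
    PROPER_NAME = "ProperName"
    VERB = "Verb"
    LIGHT_VERB = "LightVerb"
    # Placeholder for entries the lexicographer could not classify.
    NOT_AVAILABLE = "POSNotAvailable"


@dataclass
class Entry:
    """Hypothetical record for one dictionary entry."""
    headword: str                      # citation form, Shiraz romanization
    pos: POS
    variants: List[str] = field(default_factory=list)  # other spellings

    @property
    def constituents(self) -> List[str]:
        # Compounds are stored with a space between their elements.
        return self.headword.split()

    @property
    def is_compound(self) -> bool:
        return len(self.constituents) > 1

    @property
    def needs_editing(self) -> bool:
        # Entries without a clear POS are flagged for later revision.
        return self.pos is POS.NOT_AVAILABLE


# The light verb from the text: noun asrar "insistence" + verb krdn "do".
insist = Entry(headword="asrar krdn", pos=POS.LIGHT_VERB)
print(insist.constituents)
```

Splitting on whitespace is enough here because the text states that the space is the only separator between the elements of a compound.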
Persian contains a large number of Arabic loan words. The main area in which the Arabic borrowings are noticeable is in the formation of plural nouns. These plural forms follow the "broken" plural formation in Arabic, based on a consonantal root. The rules for forming these plurals are not used productively in Persian, however; instead, the forms derived from the Arabic morphological paradigm have been lexicalized into the language. These plural nouns are input in the Shiraz dictionary as lexical elements, with the feature Number set to plural. These entries are treated as irregular and are not analyzed for number by the morphological analyzer.
There are also a number of irregular ordinal numbers that cannot be
derived from their cardinal forms. These ordinal numbers, which do not
follow the morphological rules for ordinal formation, are entered in the dictionary as irregular,
with the feature Ordinal indicating the number type.
There exist three types of ordinal values in Persian depending on
their morphological structure and syntactic behavior. When an
irregular ordinal is entered in the dictionary, its corresponding
ordinal type should also be set. For example, one of the ordinal forms
of the cardinal number yk[yek] meaning "one" is
avl[aval] meaning "first". In this case, the irregular ordinal
number avl[aval] is input in the dictionary and its number type
is set to Ordinal Third. (The form avly is of type Second and
avlyn is of type First.)
Regular Feature
If an entry is treated as irregular (i.e., if its Number
feature is set to Plural, or if it is marked as an
Ordinal), the value of the Regular feature is
automatically set to False. This information is used by the
morphological and syntactic components in analysis.
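The rule for the Regular feature can be sketched as a derived property. This is an illustrative sketch under stated assumptions, not the dictionary's implementation: the class `NumberFeatures` and its attribute names are hypothetical, and the feature values are represented as plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NumberFeatures:
    """Hypothetical holder for the features discussed above."""
    headword: str
    number: Optional[str] = None   # "Plural" for lexicalized broken plurals
    ordinal: Optional[str] = None  # "First", "Second", or "Third"

    @property
    def regular(self) -> bool:
        # Regular is False as soon as the entry is a lexicalized plural
        # or carries an Ordinal type; otherwise it stays True.
        return not (self.number == "Plural" or self.ordinal is not None)


# Ordinal forms of yk "one", with the types given in the text.
entries = [
    NumberFeatures("yk"),                     # cardinal, regular
    NumberFeatures("avl", ordinal="Third"),
    NumberFeatures("avly", ordinal="Second"),
    NumberFeatures("avlyn", ordinal="First"),
]

for e in entries:
    print(e.headword, e.regular)
```

Deriving the value instead of storing it keeps Regular from drifting out of step with the Number and Ordinal features, which matches the statement that it is set automatically.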