WAYS OF TRANSMITTING SEMANTIC GROUPS OF VERBS IN THE INTERNAL CORPUS
DOI: https://doi.org/10.48371/PHILS.2023.69.2.014
Keywords: national corpus, meta-markup, sign-code, verb, lexico-semantic markup, lexico-grammatical specificity, digitalization, subcorpus
Abstract
This article examines how linguistic categories, in particular linguistic markup, can be used in the modern digital space to support various areas of linguistics and scholarly research on the Kazakh language. The linguistic markup system and all work on automating language resources were carried out by scholars of the A. Baitursynov Institute of Linguistics. The platform "National Corpus of the Kazakh language" is now available on the institute's website. A text database of 21 million word uses has been automated, together with the full linguistic system, which comprises morphology, word formation, vocabulary, and phonetics.
The article describes the classification of verbs into macro- and micro-semantic groups and the definition of lexico-semantic and lexico-grammatical markup codes. As an example, it shows how the lexico-semantic markup of borrowed verbs is automated. Preparing the lexico-semantic subcorpus of the verb begins with dividing the collected verbs into macro- and micro-groups according to their semantic specificity. The next step is to develop the lexico-grammatical, morphological, word-formation, and connotational markup.
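The division into macro- and micro-groups described above can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the lemmas, the group names, the dotted code format, and the helpers `annotate` and `group_by_macro` are hypothetical placeholders, not the codes or software actually used in the National Corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticTag:
    """Lexico-semantic label for one verb (labels here are hypothetical)."""
    macro: str  # macro-group, e.g. a broad semantic field
    micro: str  # micro-group inside that macro-group

    @property
    def code(self) -> str:
        # Assumed code format: "<macro>.<micro>"
        return f"{self.macro}.{self.micro}"


# Hypothetical lexicon: English placeholder lemmas, invented group names.
LEXICON = {
    "run": SemanticTag("motion", "self_motion"),
    "carry": SemanticTag("motion", "caused_motion"),
    "say": SemanticTag("speech", "statement"),
}


def annotate(lemmas):
    """Pair each lemma with its markup code; unknown lemmas get None."""
    return [
        (lemma, LEXICON[lemma].code if lemma in LEXICON else None)
        for lemma in lemmas
    ]


def group_by_macro(lexicon):
    """Arrange lemmas as macro-group -> micro-group -> list of lemmas."""
    groups = {}
    for lemma, tag in lexicon.items():
        groups.setdefault(tag.macro, {}).setdefault(tag.micro, []).append(lemma)
    return groups
```

Calling `annotate(["run", "fly"])` pairs the known lemma with its code and leaves the unlisted one unmarked, which is one simple way to flag verbs that still need manual classification.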
First, the models that major grammarians have proposed for classifying verbs into lexico-semantic groups are compared. The most recent classification model is then supplemented, with reference to the views of other scholars, so that markup can be introduced. The resulting classification of semantic groups of verbs is needed to build a lexico-semantic markup program; on this basis, the article discusses how the Institute can introduce lexico-semantic markup of verbs into the subcorpus.
The study relies on computational methods for developing the linguistic labels of verbs included in the lexico-semantic subcorpus, on descriptive methods that draw on the new theoretical and practical knowledge of corpus linguistics, and on morphological, word-formation, and lexical analysis of the verbs entered into the subcorpus. In the future, any language learner, student, or linguist can use the proposed work to determine the lexico-grammatical and word-formation activity of a given verb and to identify its lexico-semantic group. The article describes how the markup is assigned to serve this goal, which adds to the practical value of the work.