Types of language errors in texts generated by artificial intelligence
DOI: https://doi.org/10.58423/2786-6726/2026-2-114-130
Keywords: artificial intelligence, generated text, language errors, Ukrainian language, language norm, text quality, editing
Abstract
The article addresses the quality of texts generated by artificial intelligence. It emphasizes that although modern systems can produce grammatically correct and stylistically coherent texts, they cannot ensure consistent adherence to language norms. Generated texts often contain various types of linguistic deviations, which affect the accuracy of meaning transmission, reduce trust in information, and may destabilize the standard language norm. The article also notes that Ukrainian-language texts generated by AI remain insufficiently studied.
The aim of the study was to identify the main types of linguistic errors in Ukrainian texts created by generative language models and to analyse the patterns of their occurrence, taking into account the specific features of how language models operate. The research material consisted of texts generated by different versions of ChatGPT in academic and popular science styles on philological topics.
It was established that the most frequent errors are lexical-semantic and stylistic ones, including calques, tautologies, excessive verbalization, clichéd wording, and formulaic expressions. Syntactic deviations also constitute a significant share; these are manifested in overly complex constructions, template-like structures, and a tendency to use passive forms. Semantic and logical errors related to the phenomenon of “hallucination” were also identified, leading to the emergence of inaccurate or unreliable information. At the same time, morphological and spelling errors occur relatively rarely, which indicates a high level of formal literacy in such texts.
The main causes of linguistic deviations are identified as the probabilistic nature of generation, the influence of heterogeneous and partially incorrect training data, cross-linguistic interference, and the uneven representation of the Ukrainian language in training corpora. The need for a systematic study of linguistic errors and the development of their typology is emphasized, as this is a prerequisite for the effective diagnosis and editing of generated texts.
The results obtained are of practical significance for the development of automated text quality control tools, the improvement of editorial practices, and the formulation of recommendations for the responsible use of generative artificial intelligence in the Ukrainian-language communicative space.
License
Copyright (c) 2026 Larysa Kravets, Natálka Libák

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal the right of first publication. The work is simultaneously licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits others to share the work with appropriate credit given to the author(s) and the initial publication in this journal.















