Types of language errors in texts generated by artificial intelligence

Authors

Larysa Kravets, Natálka Libák

DOI:

https://doi.org/10.58423/2786-6726/2026-2-114-130

Keywords:

artificial intelligence, generated text, language errors, Ukrainian language, language norm, text quality, editing

Abstract

The article addresses the quality of texts generated by artificial intelligence. It emphasizes that, although modern systems can produce grammatically correct and stylistically coherent texts, they cannot ensure consistent adherence to language norms. Generated texts often contain various types of linguistic deviations, which impair the accuracy of meaning transmission, reduce trust in information, and may destabilize the standard language norm. The article also notes that Ukrainian-language texts generated by AI remain insufficiently studied.

The aim of the study was to identify the main types of linguistic errors in Ukrainian texts created by generative language models and to analyse the patterns of their occurrence, taking into account the specific features of how language models operate. The research material consisted of texts generated by different versions of ChatGPT in academic and popular science styles on philological topics.

It was established that the most frequent errors are lexical-semantic and stylistic, including calques, tautologies, excessive verbalization, clichéd wording, and formulaic expressions. Syntactic deviations also account for a significant share; they manifest as overly complex constructions, template-like structures, and a tendency to use passive forms. Semantic and logical errors related to the phenomenon of “hallucination” were also identified; these introduce inaccurate or unreliable information. By contrast, morphological and spelling errors are relatively rare, which indicates a high level of formal literacy in such texts.

The study identifies the main causes of linguistic deviations as the probabilistic nature of generation, the influence of heterogeneous and partially incorrect training data, cross-linguistic interference, and the uneven representation of the Ukrainian language in training corpora. It emphasizes the need for a systematic study of linguistic errors and for the development of their typology, as this is a prerequisite for the effective diagnosis and editing of generated texts.

The results obtained are of practical significance for the development of automated text quality control tools, the improvement of editorial practices, and the formulation of recommendations for the responsible use of generative artificial intelligence in the Ukrainian-language communicative space.

Author Biographies

Larysa Kravets, Ferenc Rákóczi II Transcarpathian Hungarian University

Doctor of Philological Sciences, Professor. Ferenc Rákóczi II Transcarpathian Hungarian University, Department of Philology, Professor

Natálka Libák, Ferenc Rákóczi II Transcarpathian Hungarian University

PhD. Ferenc Rákóczi II Transcarpathian Hungarian University, Department of Philology, Associate Professor

Published

2026-05-30

How to Cite

Kravets, L., & Libák, N. (2026). Types of language errors in texts generated by artificial intelligence. Acta Academiae Beregsasiensis, Philologica, 5(2), 114–130. https://doi.org/10.58423/2786-6726/2026-2-114-130