Article content

Title Analyse du corpus ORTHOTEL : apport du traitement automatique à la classification des déviations orthographiques
Authors Véronique Auberge, Nadia Ghneim, Rahia Belrhali
Mir@bel Journal Langue française
Issue no. 124, December 1999, L'orthographe et ses scripteurs, edited by Jean-Pierre Chevrot
Pages 90-103
English abstract The aim of this study is to present organized statistical data extracted from a large corpus of 15 000 forms showing spelling errors. This corpus, ORTHOTEL, comes from Minitel users' queries about word spelling. An automatic treatment has been applied to the corpus to separate and analyse the errors. Half of the forms in the corpus are correctly spelled, which indicates the users' degree of linguistic insecurity. An automatic text-to-phone system applied to the misspelled words shows that a large share of them are homophonous with a correct word taken from a reference lexicon of 80 000 canonical forms. An alignment algorithm has classified the orthographic transformations that account for deviations from the reference lexicon.
Source: Publisher (via Persée)
Online article http://www.persee.fr/doc/lfr_0023-8368_1999_num_124_1_6308