    Intelligibility is essential to communication, and its impairment diminishes the capacity to interact [2]. It represents the degree to which the speaker’s message can be understood by the listener: the “proportion of speech understood” or, in the assessment context, the “proportion of words correctly transcribed” [3–5]. It is thus a matter of perception by others, which makes it difficult to quantify, because listeners use compensation strategies that enhance signal decoding through cues based on language structure and meaning [6].
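As an illustration of the "proportion of words correctly transcribed" definition, the following minimal sketch scores a listener's transcription of a word list. It assumes one transcription per target word and a case-insensitive exact match; the function name and the example words are hypothetical and are not taken from the study protocol.

```python
def intelligibility_score(targets, transcriptions):
    """Proportion of target words that the listener transcribed correctly.

    Assumption (not from the source): one transcription per target word,
    compared item by item, ignoring case and surrounding whitespace.
    """
    if len(targets) != len(transcriptions):
        raise ValueError("one transcription is expected per target word")
    if not targets:
        raise ValueError("at least one target word is required")
    correct = sum(
        target.strip().lower() == heard.strip().lower()
        for target, heard in zip(targets, transcriptions)
    )
    return correct / len(targets)


# Hypothetical example: the listener mishears one word out of four.
score = intelligibility_score(
    ["bateau", "maison", "chat", "pomme"],
    ["bateau", "raison", "chat", "pomme"],
)
```

In this hypothetical example, three of the four words match, so the score is 0.75.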
    Speech disorder severity is therefore an alternative index in clinical contexts. It is based on the degree of intelligibility impairment associated with other speech signal variables such as acoustic-phonetic code emission quality, speech speed and other temporal and/or prosodic parameters relevant to the perceived difficulty [7–9]. Disorder severity is a broader concept, including intelligibility plus compensation strategies.
    Speech disorder assessment by speech therapists is based on perceptual evaluation. One of the most widely used instruments in France is the BECD clinical dysarthria assessment battery (Batterie d’Évaluation Clinique de la Dysarthrie) [10]. Perceptual assessment may be made at two levels [11]:
    • phoneme emission: is the phoneme perceived by the listener the intended one? This concerns acoustic-phonetic coding, or intelligibility;
    • oral communication, or “running speech intelligibility”, which involves top-down mechanisms.
    Speech assessment needs to be two-fold: qualitative, to determine the mechanisms underlying the impairment, and quantitative, to score the degree of impairment. Thus two strategies may be used: a grading scale such as a visual analogue scale (from perceived severe disorder to no disorder), or measurements (e.g., rate of error between intended and perceived phonemes).
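The measurement strategy can be illustrated with a minimal sketch of a phoneme error rate. It assumes that the error count is the edit (Levenshtein) distance between the intended and perceived phoneme sequences, normalized by the number of intended phonemes; this is one common convention, not necessarily the one used by the tests discussed here, and the function name is hypothetical.

```python
def phoneme_error_rate(intended, perceived):
    """Edit distance between two phoneme sequences, divided by the number
    of intended phonemes.

    Assumption (not from the source): substitutions, deletions and
    insertions each count as one error.
    """
    if not intended:
        raise ValueError("the intended sequence must not be empty")
    # prev[j] = distance between the intended prefix processed so far
    # and the first j perceived phonemes.
    prev = list(range(len(perceived) + 1))
    for i, p_intended in enumerate(intended, start=1):
        curr = [i]
        for j, p_perceived in enumerate(perceived, start=1):
            cost = 0 if p_intended == p_perceived else 1
            curr.append(min(
                prev[j] + 1,         # intended phoneme not perceived (deletion)
                curr[j - 1] + 1,     # extra perceived phoneme (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(intended)


# Hypothetical example: /b a t o/ perceived as /p a t o/ (one substitution).
rate = phoneme_error_rate(["b", "a", "t", "o"], ["p", "a", "t", "o"])
```

With one substituted phoneme out of four, the rate in this hypothetical example is 0.25. Note that, under this convention, the rate can exceed 1 when the listener perceives many extra phonemes.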
    All such perceptual tests, however, have a drawback: reliability and reproducibility are listener-dependent [11]. The listener’s familiarity with the patient or with the task increases predictability, and results may differ between panels of experts and of naïve observers. Reproducibility is likewise subject to variation. Even so, these are the most widely used clinical tests, mainly for reasons of ease of use.
    We therefore sought to identify the perceptual speech tests that are most useful and feasible in everyday clinical practice, in order to provide reference values for assessing progression.
    The aim was to compare results on perceptual tests of disorder severity and intelligibility impairment between two situations: reading, and semi-spontaneous speech.
    The study hypothesis was that severity assessed on semi-spontaneous picture description is more clinically relevant than intelligibility assessed on reading.
    2. Material and methods
    The study protocol was designed as part of the Carcinologic Speech Severity Index (C2SI) project [12], the aim of which is to measure the impact of head and neck (oral and pharyngeal) cancer treatment on speech production by automated speech processing compared to perceptual methods.
    The corpus was built up from patients seen in follow-up after oral cavity or oropharyngeal cancer treatment in 2015 and 2016 at the Institut Universitaire du Cancer Oncopole cancer center in Toulouse, France.
    Inclusion criteria comprised 6 months post-treatment and clinical remission, so that the speech disorder would be as stable as possible, whether or not it was perceptible by ear (so as to include the mildest deficits). The study was thus conducted in a context of chronic speech disorder.
    Exclusion criteria comprised speech disorder potentially associated with some other pathology, such as stroke or a fluency disorder (stammering).
    Patients’ speech was recorded on a reading task (READ) and a picture description task (DESC).
    All recordings were made in the oncorehabilitation unit of the Institut Universitaire du Cancer Toulouse – Oncopole. During the follow-up consultation, patients entered a soundproof recording booth for the speech assessment tasks. Audio files were recorded in WAVE format on a digital recorder with a microphone and pop shield, to optimize recording quality and minimize measurement bias (misassessment due to recording issues).