Last month we explored the application and interpretation of various measures for the signs of ocular disease: the scales we use to rate redness, lid swelling or corneal staining. An analogous methodology is also applied to quantify symptoms of ocular disease such as pain, itch or other forms of discomfort. Because we rely on our patients to report the intensity, character and progression of disease symptoms, it’s crucial that we take great care in how we pose that deceptively simple question, “How are you feeling?”

This month, we delve into the nuances of symptomatic assessment. We’ll discuss both the importance of establishing an approach that allows us to best employ the information our patients provide for diagnosis and treatment assessment, and the role this information plays in developing new therapies. Issues surrounding the optimization of symptom evaluation are critical in developing dry-eye treatments, and they’re also of interest in therapies for allergy, ocular inflammation and postoperative ocular pain.

Ask Better Questions
Dry-eye disease is an ocular condition defined by patients’ symptoms: The name of the condition is itself a symptomatic description. We’re aware that the central confounder in DED is that symptoms and signs are often discordant, and patients often report significant symptomatic disease with little or no ocular surface staining or tear-film dysfunction.1-2 This has led to a focus on clinical refinement of the tools we use to quantify symptomatic DED. There are dozens of variations of the questionnaires used to provide an objective measure of dry-eye symptoms, and the particular value of each is a function of its specific use; some are designed for epidemiologic studies3-7 while others are constructed for use as clinical diagnostic tools. Often the diagnostic questionnaires are also employed in therapeutic trials, but these can leave room for improvement in that specific role.

Dry-eye questionnaires such as the National Eye Institute Visual Function Questionnaire, the Women’s Health Study questionnaire and the Dry Eye Epidemiology Project’s questionnaire were designed to assess the scope of ocular disease and were only later used to evaluate symptoms. In general, these were limited by having either too many questions or too few dry-eye-related queries. Another group of these symptomatic measures, including the Ocular Surface Disease Index, the McMonnies Dry Eye Index and the Dry Eye Questionnaire, focused more on the key issues of DED symptom characteristics, symptom severity and disease impact. The McMonnies and the OSDI were characterized in greater detail and validated for use in clinical trials.

What makes a good questionnaire? The goal is to allow the patient to provide accurate information about his or her symptoms in a way that precisely and reproducibly reflects the disease state. Most questions are formatted so that the patient responds with a number on a scale; for example, a question from the OSDI asks:

“Have you experienced the following in the last week: painful or sore eyes (4) all of the time (3) most of the time (2) half of the time (1) some of the time (0) none of the time.”

Structuring questions in this way, where higher numerical scores are associated with increased disease severity, allows questionnaires with 10 to 15 questions like the OSDI to generate a summated score in which ranges of total scores are designated as mild, moderate or severe symptomatic disease.
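
As a rough illustration of the summated scoring described above, the following Python sketch adds up item responses on a 0-to-4 scale and maps the total onto severity bands. The function names and the band cutoffs are hypothetical, chosen only for this example; they are not the published thresholds or scoring rules of the OSDI or any other instrument.

```python
from typing import Sequence

# Hypothetical lower bounds on the summed score, for illustration only;
# these are NOT the published cutoffs of the OSDI or any other questionnaire.
SEVERITY_BANDS = [(0, "normal"), (10, "mild"), (20, "moderate"), (30, "severe")]


def summated_score(responses: Sequence[int], max_per_item: int = 4) -> int:
    """Sum item responses, each expected on a 0..max_per_item scale."""
    for r in responses:
        if not 0 <= r <= max_per_item:
            raise ValueError(f"response {r} is outside 0..{max_per_item}")
    return sum(responses)


def severity_band(total: int) -> str:
    """Return the label of the highest band whose lower bound is <= total."""
    label = SEVERITY_BANDS[0][1]
    for lower_bound, name in SEVERITY_BANDS:
        if total >= lower_bound:
            label = name
    return label


# Twelve made-up answers on the 0-4 frequency scale quoted above.
answers = [2, 3, 1, 0, 2, 4, 1, 2, 3, 0, 1, 2]
total = summated_score(answers)
print(total, severity_band(total))
```

Because every item is scored in the same direction, a higher total always means more severe symptoms, which is what makes simple banding of the total possible.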

One of the confounding aspects of treating dry-eye disease is that a patient with an objective sign, such as staining, may not have corresponding symptoms, and vice versa.

Several other factors are critical to a reliable, reproducible questionnaire that can provide an accurate DED evaluation. First, the number of questions should be kept to a minimum, allowing patients to maintain a consistent degree of focus in answering each question. Our experience tells us that questionnaires should be no longer than the 12 to 15 questions of the McMonnies or the OSDI; and it may be that four or five well-structured questions are sufficient, or even superior, in terms of the goals stated above. This is particularly true in a clinical trial setting, where a greater number of questions provides more—but not necessarily better—data, and the resulting greater variability in the dataset often confounds the outcome.

Simplifying Symptom Scoring
Greater numbers of questions in the setting of a clinical trial also provide a greater potential for the Hawthorne Effect, where subjects interject their perceptions regarding right or wrong/better or worse answers in response to a perceived need on the part of those administering the test. This subjective influence can ultimately blur distinctions between treatment and control groups.

A second factor that can strongly affect symptom questionnaires is the degree to which questions rely on recollection. Questions that ask, “How often in the last week …” or “How many days in the past month …” depend on a subject’s recall and therefore have an inherent potential for inaccuracy. As with longer surveys, questions that take a historical perspective invite an interpretive response and so increase the subjective component of the answer. Recent studies suggest that recall bias is a particularly significant issue in older patient populations, and that recall issues may introduce systematic biases into the responses and, therefore, the symptomatic assessments.8 In the context of a clinical trial, however, it’s also important to remember that if the survey is used to assess primary or secondary endpoints, it needs to specify the time frame and to optimize that time frame based on the predicted temporal characteristics of the intervention being tested.9 Questions ideally need to strike a balance between contemporaneous and retrospective reporting of a patient’s experience.

In an ideal world, our symptom questionnaire would have a single question and a correlation with disease severity of 1. As we live in the real world, we strive for a few simple questions and a correlation of 0.7 or better. Further complicating the refinement of symptomatic assessments are the contrasting aspirations of questionnaires designed for the clinic and those optimized for a clinical trial: While an inclusion error is good in the practice of medicine, an exclusion error is better in a drug-development setting.
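
To make the 0.7 target concrete, here is a minimal Python sketch that computes a Pearson correlation between questionnaire totals and an independent severity grade. The use of Pearson's r, the function name and the data are all assumptions made for illustration; a rank-based correlation may be a better fit for ordinal grades, and the text above does not specify which statistic is meant.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Made-up data: questionnaire totals and a clinician severity grade (1-4)
# for six hypothetical patients.
symptom_scores = [3, 8, 12, 15, 21, 26]
clinician_severity = [1, 1, 2, 2, 3, 4]

r = pearson_r(symptom_scores, clinician_severity)
print(f"r = {r:.2f}; meets the 0.7 target: {r >= 0.7}")
```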

Several approaches have been used to address the pitfalls of current questionnaires by combining old and new questionnaires or focusing on specific questions with a symptom survey. One study employed the OSDI questionnaire in combination with a simpler, four-question survey of subjects’ current dry-eye symptomatology (Ora Calibra Four-Symptom Scale).10 Another trial used the OSDI, but then defined a secondary symptomatic endpoint based upon the mean change from baseline in the visual-related function subscale score of the OSDI.11 In this case, several OSDI questions that focus on visual function were plucked from the survey, predefined as a unique endpoint and scored separately. This approach may be the best of both worlds: inclusion of more extensive, validated methods such as OSDI allows for an assessment comparable to other trials, while simplified scales such as the Four-Symptom Scale are key to a more objective and tighter dataset able to rigorously describe the efficacy of an intervention.
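
The subscale endpoint described above can be sketched in a few lines of Python: pick a predefined subset of items, score it at baseline and at follow-up for each subject, and average the per-subject changes. The item positions in `VISUAL_ITEMS`, the use of a simple item mean as the subscale score and the data are hypothetical; they do not reproduce the actual OSDI item numbering or the scoring used in the cited trial.

```python
from statistics import mean
from typing import List, Sequence

# Hypothetical positions of the items treated as the visual-function subscale;
# this is NOT the real item numbering of the OSDI.
VISUAL_ITEMS = (3, 4, 5)


def subscale_score(responses: Sequence[int],
                   items: Sequence[int] = VISUAL_ITEMS) -> float:
    """Mean response over the predefined subscale items for one subject."""
    return mean(responses[i] for i in items)


def mean_change_from_baseline(baseline: List[Sequence[int]],
                              followup: List[Sequence[int]]) -> float:
    """Average of per-subject (follow-up minus baseline) subscale changes."""
    if len(baseline) != len(followup):
        raise ValueError("baseline and follow-up must cover the same subjects")
    changes = [subscale_score(f) - subscale_score(b)
               for b, f in zip(baseline, followup)]
    return mean(changes)


# Made-up responses (six items, 0-4 scale) for three subjects at two visits.
baseline = [[2, 3, 1, 3, 4, 2], [1, 2, 2, 2, 2, 3], [3, 3, 2, 4, 3, 3]]
followup = [[2, 2, 1, 2, 2, 2], [1, 2, 1, 1, 2, 1], [3, 2, 2, 3, 3, 2]]

change = mean_change_from_baseline(baseline, followup)
print(f"mean change in subscale score: {change:.2f}")  # negative = improvement
```

Defining the item subset before the data are collected, as the trial did, is what keeps this kind of subscale from becoming a post hoc selection.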

Beyond Dry Eye
When we talk about disease symptoms, we refer in almost every case to some type of pain or discomfort: dryness; grittiness; burning; itching; and even photophobia. Many of these are associated with conditions other than dry eye, and so should be part of the diagnosis for those conditions as well. Eye pain and photophobia, for example, are two hallmark symptoms of migraine headache.12 There are many different pain-assessment instruments available that range from extensive questionnaires to simple cartoon graphics, but, as with the dry-eye surveys, the goal here is to provide a reliable assessment of patient symptoms as a means to guide treatment or therapy development.

Pain is often a key sequela of procedures such as cataract or refractive surgery, and in these cases the most reliable measure is obtained with a visual analog scale. Comparisons of numerical scales without verbal descriptors and those with graded descriptors such as mild, moderate or severe show that it is primarily the anchoring descriptors that affect the reliability of the scale.13

Symptomatic data are always more difficult to use in clinical trials than endpoints with clearly defined, objective metrics. Despite this, symptoms are often the most clinically meaningful measures, and so their inclusion is often essential from a regulatory perspective. An approach seen with increasing frequency to address this and other trial-design issues is the composite endpoint, in which specific treatment goals are combined into one amalgamated efficacy measure.14,15 For example, use of a composite endpoint including a structural endpoint (such as retinal optical coherence tomography) combined with some measure of visual function has been suggested for glaucoma trials, where limitations of each measure could be mitigated by their combination.15 In this case, where disease progression is slow and often not tightly correlated with anatomical changes, the combination allows for an amplification of potential therapeutic effects.

Often, composite endpoints function to increase the statistical power of a study without increasing the number of subjects required; an example of this is the total nasal symptom score used in trials of allergy therapies. The TNSS combines symptomatic scores for rhinorrhea, nasal congestion, nasal itching and sneezing, and has become the metric for therapeutic assessment of anti-allergics.16 Of course, combining measures of multiple symptoms implies that all are equal contributors to patient symptomatology. Because this may not always be the case, there is a risk that treatments with specific effects are approved for more extensive indications. In the case of the TNSS, for example, a drug with strong anticholinergic action may potently abolish rhinorrhea but have little effect on the other components of the composite score. With an overwhelming effect on one component, such a drug could generate statistically significant reductions in TNSS and thus potentially receive an indication for relief of all the symptoms included in the composite. In this way, use of composite endpoints can lead to artifacts in the regulatory process and an artificial broadening of the therapeutic indication. This reminds us to look carefully at each component of a composite score with the goal of selecting those that equally reflect the symptomatology of the target condition.
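
The single-component problem described above is easy to see in a small Python sketch. It sums the four TNSS components and then breaks the change down by component. The 0-to-3 scale per component, the function names and the scores are assumptions made for illustration and are not taken from any trial.

```python
from typing import Dict

COMPONENTS = ("rhinorrhea", "nasal_congestion", "nasal_itching", "sneezing")


def tnss(scores: Dict[str, int]) -> int:
    """Composite score: the plain sum of the four component scores.

    Each component is assumed here to be rated 0 (absent) to 3 (severe).
    """
    return sum(scores[c] for c in COMPONENTS)


def component_changes(baseline: Dict[str, int],
                      treated: Dict[str, int]) -> Dict[str, int]:
    """Per-component change (treated minus baseline)."""
    return {c: treated[c] - baseline[c] for c in COMPONENTS}


# Made-up scores for a drug that abolishes rhinorrhea and nothing else.
baseline = {"rhinorrhea": 3, "nasal_congestion": 2,
            "nasal_itching": 2, "sneezing": 2}
treated = {"rhinorrhea": 0, "nasal_congestion": 2,
           "nasal_itching": 2, "sneezing": 2}

print("composite:", tnss(baseline), "->", tnss(treated))
print("by component:", component_changes(baseline, treated))
```

The composite falls by a third even though three of the four symptoms are unchanged, which is why reporting the per-component changes alongside the total is a useful safeguard.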

Despite these caveats, composite endpoints do have a potential utility, whether for enhancement of our ability to address symptomatic diseases or as a tool in the broader context of clinical therapeutic development. Key to the process of symptom assessment, regardless of the specific measure or metric selected, is the fact that it must be based on a clear and honest conversation between patient and physician.

Dr. Abelson is a clinical professor of ophthalmology at Harvard Medical School. Dr. Hollander is chief medical officer at Ora, and assistant clinical professor of ophthalmology at the Jules Stein Eye Institute at the University of California, Los Angeles. Dr. McLaughlin is a medical writer at Ora Inc.

1. Abelson MB, Ousler GW 3rd, Nally LA, Emory TB. Dry eye syndromes: Diagnosis, clinical trials and pharmaceutical treatment—‘improving clinical trials’. Adv Exp Med Biol. 2002;506(Pt B):1079-86.
2. Abelson MB, Ousler GW 3rd, Maffei C. Dry eye in 2008. Curr Opin Ophthalmol 2009;20:4:282-6.
3. Methodologies to diagnose and monitor dry eye disease: Report of the Diagnostic Methodology Subcommittee of the International Dry Eye WorkShop. Ocular Surface 2007;5:2:108-152.
4. Kanellopoulos AJ, Asimellis G. In pursuit of objective dry eye screening clinical techniques. Eye Vis (Lond) 2016;18:3:1.
5. Grubbs JR, Tolleson-Rinehart S, Huynh K, Davis RM. A review of quality of life measures in dry eye questionnaires. Cornea 2014;33:2:215-8.
6. Amparo F, Schaumberg DA, Dana R. Comparison of two questionnaires for dry eye symptom assessment. Ophthalmology 2015;122:1498-1503.
7. Barabino S, Labetoulle M, Rolando M, Messmer EM. Understanding symptoms and quality of life in patients with dry eye syndrome. Ocul Surf 2016;14:3:365-76.
8. Brusco NK, Watts JJ. Empirical evidence of recall bias for primary health care visits. BMC Health Serv Res 2015;15:15:381.
9. FDA guidance document of PRO Accessed 24 August 2016.
10. Meerovitch K, Torkildsen G, Lonsdale J, Goldfarb H, Lama T, Cumberlidge G, Ousler GW 3rd. Safety and efficacy of MIM-D3 ophthalmic solutions in a randomized, placebo-controlled Phase 2 clinical trial in patients with dry eye. Clin Ophthalmol 2013;7:1275-85.
11. Sheppard JD, Torkildsen GL, Lonsdale JD, D’Ambrosio Jr. JA, et al. Lifitegrast ophthalmic solution 5.0% for treatment of dry eye disease: Results of the OPUS-1 phase 3 study. Ophthalmology 2014;121:475-483.
12. Walters AB, Smitherman TA. Development and validation of a four-item migraine screening algorithm among a nonclinical sample: The Migraine-4. Headache 2016;56:1:86-94.
13. Hjermstad MJ, Fayers PM, Haugen DF et al. Studies comparing numerical rating scales, verbal rating scales, and visual analogue scales for assessment of pain intensity in adults: A systematic literature review. J Pain Symptom Manage 2011;41:6:1073-93.
14. Landy S, White J, Lener SE, McDonald SA. Fixed-dose sumatriptan/naproxen sodium compared with each monotherapy utilizing the novel composite endpoint of sustained pain-free/no adverse events. Ther Adv Neurol Disord 2009;2:3:135-41.
15. Medeiros FA. Biomarkers and surrogate endpoints in glaucoma clinical trials. Br J Ophthalmol 2015;99:5:599-603.
16. FDA TNSS guidance Accessed 16 August 2016.