Evidence-based medicine (EBM) is a judicious way of practicing medicine. It proposes that, in order to apply the best evidence to patient care, clinical research articles be assessed comprehensively while also taking into account the physician's experience and the patient's preferences. EBM is carried out in five fundamental steps: formulating the clinical question, searching for evidence, critical reading, applying the results and evaluating the impact on the patient.
Its application and teaching in our country are still limited, and the state of critical reading, an essential element of EBM, is of great concern. Its current level is unknown, particularly among professionals in training who make decisions about patient care. In addition, a huge number of scientific papers are published each month in high- and low-impact journals, which makes it almost impossible for a physician to read all the information available on a given subject of interest(2,3). To cope with this volume, the physician must read critically in order to select the articles relevant to their clinical practice, determine which article is appropriate for applying results to patient care, and decide whether it is worthwhile to read further(2).
However, among physicians in training as medical specialists (residents), the level of critical reading is low. A study by González-Ávila et al.(4) in oncology residents found an insufficient level of critical reading of clinical research articles. Galli et al.(5), in a study of 169 cardiology residents, found similar results and concluded that the critical reading ability of young professionals is insufficient. Similarly, a study of family medicine residents in Aguascalientes found a low level of critical reading of clinical research papers(6).
As can be seen, there is a lack of knowledge of critical reading among resident physicians and there is still a long way to go in its teaching. For this reason, we conducted the present review to provide practical bases to help the resident physician in medical specialties to apply critical reading of clinical studies in a simpler way.
WHAT IS CRITICAL READING OF CLINICAL STUDIES?
From the point of view of EBM, critical reading is a structured and systematic reading that allows us to assess the validity and relevance of the results and their applicability to the management of our patient(7). In other words, critical reading answers three fundamental questions: Are the results valid? What were the results? How do I apply them to my patient?(7)
HOW TO APPROACH A CLINICAL STUDY FOR CRITICAL APPRAISAL?
To approach the critical reading of a clinical study, a series of guidelines should be considered. We recommend that clinicians follow the steps of the Red CASPe (Critical Appraisal Skills Programme en Español), as its recommendations are focused on the clinical field(7,8). These three basic questions must be answered:
- Are the results valid?
- What were the results?
- How do I apply them to my patient?
In the following, we provide a practical description of each stage of critical reading that we will develop during the review:
- Validity assessment: We will review the most practical way to critically appraise the validity of the evidence, based on the study design and the methodological tools we can apply.
- Evaluation of findings: We will review the findings and evaluate their clinical impact, as well as their clinical relevance as opposed to their statistical significance.
- Application of the results: We will describe the steps to follow to apply the results of the evidence to our patient.
VALIDITY ASSESSMENT: TYPE OF STUDIES AND CLINICAL TOOLS
First, it is important to determine what type of clinical question the study under evaluation is trying to answer and whether its design is appropriate for that question. Table 1 describes the types of clinical questions and the most appropriate research designs to answer them. We must not forget that if we move to a secondary design, as described in Table 1, there is a higher risk of bias and a lower quality of evidence, so we may not make the best decisions for the health of our patients. The clinical research question can be found in the stated purpose of the article.
Table 1. Types of studies and suitable designs for the type of question.
- Question: Among overweight patients, what are the factors that increase the risk of developing an acute myocardial infarction within five years? Statistical test of relevance: Odds ratio, logistic regression.
- Question: In patients with acute respiratory symptoms, is the stool antigen test valid in relation to the nasopharyngeal swab for the diagnosis of COVID-19? Statistical test of relevance: Operational characteristics of diagnostic tests (sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios, area under the curve).
- Therapy and/or prevention. Question: In patients diagnosed with urosepsis, is the application of vitamin C safe and effective in relation to placebo to reduce days in hospital? Main study design: Clinical trial, systematic review of clinical trials. Secondary study design: Cohort studies, case-control studies. Statistical test of relevance: Relative risk, absolute risk reduction and number needed to treat.
- Question: In patients diagnosed with pneumonia caused by COVID-19, does a neutrophil-lymphocyte ratio above 3 predict severity at 7 days of hospitalization? Main study design: Prospective cohort study.
- Question: In critically ill patients diagnosed with COVID-19-related pneumonia, is the use of 6 mg IV dexamethasone every 24 hours cost-effective compared to placebo for mortality reduction? Main study design: Cost-effectiveness studies, systematic reviews of cost-effectiveness studies. Statistical test of relevance: Cost-effectiveness analysis, cost-utility, cost minimization.
Source: Elaborated by the authors. The clinical questions described are fictional questions, not actual cases. They are only an example of formulation.
It is important to briefly define what a clinical question is. A clinical question is a patient-centred question: once answered, it allows decisions to be made about patient management(7). Depending on their syntax, clinical questions are of two types, structured and unstructured(7). Table 2 sets out the differences between a structured and an unstructured clinical question. The clinician should ask patient-centred, structured questions, as this helps to systematize the problem to be solved more effectively and, with a little more experience, to recognize in the question itself the strategy needed to answer it.
Table 2. Structured and unstructured clinical questions.
- Structured clinical question. Example: In patients with multiple myeloma who start chemotherapy with Lenalidomide, is the prophylactic use of enoxaparin at 60 mg SC* every 24 hours, compared to placebo, safe and effective in reducing the number of venous thromboembolic events at one month of treatment? Characteristics: They are based on the syntax of the PICO-type question.
- Unstructured clinical question. Example: Which is the best prophylactic for preventing venous thromboembolism in patients with multiple myeloma receiving Lenalidomide in chemotherapy programs? Characteristics: They are not based on PICO syntax, but they are the basis for formulating structured questions: they are the first questions that should be asked and then rephrased with PICO syntax.
Source: Elaborated by the authors. *SC: subcutaneous.
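As an illustration only, the structured question in Table 2 can be broken down into its PICO parts (population, intervention, comparison, outcome). The sketch below is not part of any published tool; the class name `PicoQuestion` and the method `as_text` are hypothetical names chosen for this example, and the sentence template is an assumption that happens to fit this particular question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicoQuestion:
    """A structured clinical question split into its four PICO parts."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def as_text(self) -> str:
        # Assemble the parts in the usual order: P, I, C, O.
        # The wording of this template is illustrative, not a standard.
        return (
            f"In {self.population}, is {self.intervention}, "
            f"compared to {self.comparison}, safe and effective in {self.outcome}?"
        )


# The fictional example question from Table 2, split into its parts.
question = PicoQuestion(
    population="patients with multiple myeloma who start chemotherapy with Lenalidomide",
    intervention="the prophylactic use of enoxaparin at 60 mg SC every 24 hours",
    comparison="placebo",
    outcome="reducing the number of venous thromboembolic events at one month of treatment",
)
print(question.as_text())
```

Writing the four parts separately makes it easier to see which element is missing when an unstructured question is being rephrased.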
Once the research design has been identified and evaluated, it is important to assess the risk of bias. This is one of the most important parts of critical reading, because a bias systematically deviates the observed effect from its true value and leads to conclusions far from reality, depriving the patient of clear benefits. The risk of bias assessment is therefore very important in clinical trials, since a bias could lead us to give a patient the wrong treatment, or one the patient does not need, and so increase the risk of adverse effects(9). Different methodological tools are available to assess the risk of bias according to the research design. Table 3 shows the research designs and the methodological tools that help in their critical evaluation. These tools are checklists of the relevant items that an article should contain given its study design; on this basis, studies can be classified as having a low, moderate or high risk of bias. Another way of evaluating articles is to use the critical reading instruments provided by the Red CASPe(8) (https://www.redcaspe.org/herramientas/instrumentos), which are easy to download, very practical and targeted to each type of design. We recommend using the tool that best suits you, keeping in mind that new tools continue to be developed, such as PROBAST, a tool published in 2019 for assessing the risk of bias in prediction model studies(10). For observational studies such as case-control, cohort and cross-sectional studies, we can use the Newcastle-Ottawa scale(11). For systematic reviews, we can use the AMSTAR 2 tool, which allows us to assess the risk of bias for this type of study(12). We invite readers who wish to learn more about the tools for assessing the risk of bias to review the following references:(10-13).
Table 3. Methodological tools for critical reading according to research designs.
Cochrane risk of bias tool
Source: Elaborated by the authors.
Once the risk of bias has been assessed, the validity of the article can be determined and, with it, whether to continue reading it or move on to another article.
We share some key points for the general assessment, which will help to provide a more exhaustive review:
- The registration of protocols, mainly of systematic reviews and clinical trials, is highly relevant(14) for two reasons. First, the study is assessed by the evaluation team of the registry to which it is submitted, which provides evidence of its quality. Second, the registered protocol can be compared with the final article to check whether all the outcomes defined in the protocol were reported, and whether they are congruent with the objectives and hypotheses. If this congruence is not found, we could affirm that results were hidden or omitted, and we are probably facing a problem of scientific integrity. Protocols of observational studies are now also being registered; in Peru this is done in the PRISA database managed by the National Institute of Health (https://prisa.ins.gob.pe/).
- Outcome assessment. Outcomes that are not clinically relevant for patients are frequently used. According to the GRADE (Grading of Recommendations, Assessment, Development and Evaluation) system, outcomes can be critical for decision making or important but not critical(15). What matters is that the outcomes, whether primary (main objective of the study) or secondary (secondary objectives of the study), are clinical outcomes. In clinical epidemiology there are five clinical outcomes: death, disease, discomfort, disability and dissatisfaction(16). Poverty has been proposed as an additional outcome, since disease has economic consequences for the patient(16). Intermediate or surrogate outcomes, such as blood pressure, glucose levels, tumor diameter or hemoglobin level, are frequently used, although they are not necessarily related to clinically important outcomes such as death and the others listed above. Efforts should be made to evaluate clinically relevant outcomes, especially mortality, since improving them brings notable benefits to patients.
- When planning a clinical study, there must always be consistency and coherence between the clinical question, the objective and the hypothesis(17). Likewise, when the study is published, this consistency should be maintained, in particular between methods, statistical tests and conclusions, and this axis must not be diverted from the objective of the study. For instance, if we want to evaluate the effectiveness of folic acid plus iron compared to folic acid alone for reducing the symptoms of anemia after immunotherapy in patients with primary autoimmune hemolytic anemia, the conclusion should tell us whether, and why, the combination is the more effective option, and nothing else. If an inconsistency is detected, the study is unlikely to yield clear results, the risk of bias is high, and we are probably facing a reporting bias.
- Evaluate sample size and selection. A robust study, with inferences close to the true effect, is based on probability samples and adequate sample sizes(18). Including an excessive number of subjects makes the study more expensive, whereas a study with an insufficient sample size will estimate a parameter with low precision or will be unable to detect differences between groups, leading to erroneous conclusions(18). If the article does not describe the sample size calculation and the sampling method, this suggests that the patients were chosen by convenience, which limits the applicability of the results to the population in which the study was performed.
HOW WE ASSESS OUTCOMES: CLINICAL RELEVANCE AND IMPACT
The new findings and the contribution to knowledge are found in the results section of the article(19). Once we have determined the validity of the study, it is necessary to assess its outcomes. We recommend doing this in three steps:
- What were the outcomes?
- What is the clinical impact of the outcomes?
- What is the clinical relevance of the outcomes?
First, it is important to look at what the outcomes of the study were. Table 1 summarizes the most important statistical tests based on the research design for each type of question. For intervention studies, which are the most frequently consulted on the web, the measure most often used to assess the magnitude of the effect is the relative risk (RR). It is easily interpreted: if the RR is greater than 1, the intervention is associated with an increased risk compared to the control group; if it is equal to 1, there is no difference compared to the control group; and if it is less than 1, the intervention is protective compared to the control group. Next, we must evaluate the confidence interval, which is generally reported at a confidence level of 95%. If the interval includes the value 1, the RR is not statistically significant(20). We must also determine the strength of the association: when the RR is greater than 5 or less than 0.2, the association is very strong; when it is greater than 2 or less than 0.5, it is strong(15). The problem arises when the RR lies between 0.5 and 2; here the physician's clinical judgment and expertise are needed to determine whether these small changes are important for the patient's health.
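To make the reading of a relative risk concrete, the following minimal sketch computes an RR with a 95% confidence interval and classifies the strength of association with the cut-offs quoted above. It assumes the usual log-scale (Wald-type) approximation for the interval, which is one common choice and not necessarily the method used in any given article; the function names and the trial counts are hypothetical.

```python
import math


def relative_risk(events_tx, n_tx, events_ctrl, n_ctrl, z=1.96):
    """Relative risk with a Wald-type confidence interval on the log scale."""
    risk_tx = events_tx / n_tx
    risk_ctrl = events_ctrl / n_ctrl
    rr = risk_tx / risk_ctrl
    # Standard error of ln(RR); needs at least one event in each group.
    se_log_rr = math.sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctrl - 1 / n_ctrl)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


def strength_of_association(rr):
    """Cut-offs quoted in the text: >5 or <0.2 very strong, >2 or <0.5 strong."""
    if rr > 5 or rr < 0.2:
        return "very strong"
    if rr > 2 or rr < 0.5:
        return "strong"
    # Label invented for this sketch: the text leaves this range to clinical judgment.
    return "between 0.5 and 2: clinical judgment needed"


# Hypothetical trial: 12/100 events with the intervention, 30/100 with control.
rr, lower, upper = relative_risk(12, 100, 30, 100)
# The RR is statistically significant only if the interval excludes 1.
significant = not (lower <= 1 <= upper)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print("statistically significant:", significant)
print("strength of association:", strength_of_association(rr))
```

In this invented example the whole interval lies below 1, so the intervention would be read as protective and statistically significant.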
The second step is relevant because we must evaluate not only the presence of an association but also its magnitude. Among the most important measures is Cohen's d, which is used to evaluate the effect size in studies comparing independent groups with quantitative outcomes (means). Consider, for example, a clinical trial evaluating the effect of a new drug compared to no treatment on raising the hemoglobin level in patients with sepsis. The main outcome is the hemoglobin level (a continuous quantitative variable), and the effect is detected with the mean difference between the two groups and its 95% confidence interval (if the confidence interval includes zero, the difference is not statistically significant). However, it is also important to quantify the magnitude of the effect. This is where Cohen's d comes into play: a value of 0.8 (0.2: small; 0.5: medium; 0.8: large) indicates that the effect of the drug on hemoglobin is large and can be considered for decision making(21). Cohen's d can be calculated with the following web application (https://www.socscistatistics.com/effectsize/default3.aspx). For correlation studies, Pearson's r is used, which indicates whether the correlation is strong, moderate or low. To go deeper into the evaluation of the effect size for the most important measures of association, we recommend the following link (https://www.academia.edu/42011025/5._Estad%C3%ADstica_-_Tama%C3%B1o_de_efecto).
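For readers who prefer to see the calculation rather than use a web calculator, the sketch below computes Cohen's d (the standardized mean difference discussed above) for two independent groups, assuming the pooled standard deviation as the denominator. The hemoglobin values are invented for illustration, the function names are hypothetical, and the "negligible" label for values below 0.2 is our own addition to the three cut-offs quoted in the text.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def effect_size_label(d):
    """Cut-offs quoted in the text: 0.2 small, 0.5 medium, 0.8 large."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"  # assumption: label for values below the smallest cut-off


# Hypothetical hemoglobin values (g/dL), invented for illustration only.
treated = [9.8, 10.4, 10.1, 11.0, 10.6, 9.9]
untreated = [9.1, 9.6, 9.4, 10.2, 9.7, 9.0]
d = cohens_d(treated, untreated)
print(f"Cohen's d = {d:.2f} ({effect_size_label(d)})")
```

With these invented numbers the mean difference is 0.8 g/dL and the effect size falls in the large range, which is the situation described in the hemoglobin example.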
The third and final step is to evaluate clinical relevance over statistical significance. Relevance depends on the magnitude of the difference, the seriousness of the problem under investigation, the vulnerability, the morbidity and mortality it generates, its cost and its frequency, among other elements(22). For an intervention to be considered useful in clinical practice, it is recommended that the minimum difference between groups be a 10% direct superiority of one over the other(23). However, relative risk reductions of 50% are almost always, and reductions of 25% frequently, considered clinically relevant regardless of statistical significance(22). Clinical expertise and the physician's knowledge of the disease come into play here. We also recommend evaluating the evidence on the basis of the differences in the number of cases between the groups. As an example, take a new drug intended to increase the platelet count in patients with treatment-refractory thrombocytopenic purpura, compared to Rituximab. After administration, platelets increased by 50,000 in the group that received the new drug and by only 20,000 in the Rituximab group. However, the groups are small: 30 patients in the new treatment group and 28 in the Rituximab group. Because of the sample size, the difference is most likely not statistically significant. Yet it is clinically relevant, because a platelet increase of the size achieved by the new treatment is important for a patient with chronic thrombocytopenic disease and brings remarkable improvement, whereas Rituximab did not reach that level. The results can therefore be used for decision making, on the basis of clinical expertise.
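The relative risk reduction thresholds mentioned above can be checked with simple arithmetic. The following sketch, under the assumption of a two-arm trial with a binary outcome, derives the absolute risk reduction (ARR), the relative risk reduction (RRR) and the number needed to treat (NNT, rounded up by convention). The trial counts and function names are hypothetical, and the wording of the returned labels is ours.

```python
import math


def risk_reduction_summary(events_ctrl, n_ctrl, events_tx, n_tx):
    """Absolute and relative risk reduction plus number needed to treat."""
    risk_ctrl = events_ctrl / n_ctrl
    risk_tx = events_tx / n_tx
    arr = risk_ctrl - risk_tx  # absolute risk reduction
    rrr = arr / risk_ctrl      # relative risk reduction
    # NNT is only meaningful when the intervention lowers the risk.
    nnt = math.ceil(1 / arr) if arr > 0 else None
    return arr, rrr, nnt


def relevance_by_rrr(rrr):
    """Rule of thumb quoted in the text: 50% almost always, 25% frequently relevant."""
    if rrr >= 0.50:
        return "almost always clinically relevant"
    if rrr >= 0.25:
        return "frequently clinically relevant"
    return "relevance depends on clinical judgment"


# Hypothetical trial: 20/100 events with control, 12/100 with the intervention.
arr, rrr, nnt = risk_reduction_summary(20, 100, 12, 100)
print(f"ARR = {arr:.2%}, RRR = {rrr:.0%}, NNT = {nnt}")
print(relevance_by_rrr(rrr))
```

These figures describe only the size of the effect; whether it matters for a given patient still depends on the seriousness of the outcome and on clinical expertise, as discussed above.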
HOW WE APPLY THE OUTCOMES: APPLICATION TO THE PATIENT
There is no general rule that allows us to decide whether the outcomes of an investigation are directly applicable to a specific patient. There is a degree of variability that limits us and makes us take the evidence for decision-making very carefully.
Here we present some steps, based on our experience and on the literature cited in this article, for applying the results to our patient as judiciously as possible.
- Compare the selection criteria (inclusion and exclusion criteria) of the study we have critically read with our patient, and confirm that our patient meets them.
- Sex and age: evaluate whether the study population matches our patient in these two variables. For example, we cannot apply the results of studies performed in women over 60 years of age to a 30-year-old male patient. If the study includes both populations, it is best to look at the effect in the sex and age subgroup of our patient.
- Risk factors: evaluate whether the risk factors of the population studied are the same as those of our patient.
- Severity of the disease: this is one of the most important criteria, because severity has a great impact on prognosis, and interventions performed in patients with more severe disease will not necessarily have similar results in patients with less severe disease.
In summary, ideally, the study population should be as similar as possible to our patients. However, in daily practice, this is very difficult to achieve. Because of this, the application of the outcomes should be as cautious and responsible as possible and should be based on populations similar to our patients, with evaluation of clinically important outcomes.
Critical reading of clinical studies is essential for the resident physician. Its status in this country is still unknown, although it is probably deficient, as has been found in other countries. Structured reading of clinical evidence should be instilled during practical training, since it is largely neglected during hospital training. We hope that these practical bases for the critical reading of clinical studies will be useful for residents of the different clinical specialties in the country.
Authorship contributions: All authors contributed to the conception of the present idea, to the drafting and to the final version of the manuscript.
Conflicts of interests: The authors declare that there is no conflict of interest.
Received: July 16, 2020.
Approved: May 20, 2021.
Correspondence: Rafael Pichardo-Rodriguez.
Address: Av. Brasil cuadra 9. Residencial esmeralda.
Telephone: (+51)986 332 210
3. Sackett DL, Rosenberg WMC, Gray JAM, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn’t. BMJ. 1996;312(7023):71-2. DOI: 10.1136/bmj.312.7023.71.
4. González-Ávila G, González FAL-. Lectura crítica de artículos de investigación clínica en médicos residentes de oncología. Rev Médica Inst Mex Seguro Soc. 2009;47(6):689-95.
5. Galli A, Pizarro R, Blanco P, Kevorkian R, Grancelli H, Lapresa S, et al. Evaluación de la capacidad de los residentes para hacer una lectura crítica de las publicaciones científicas. Investig En Educ Médica. 2017;6(22):127. DOI: https://dx.doi.org/10.7775/rac.v85.i2.10533
9. Calvache JA, Barajas-Nava L, Sánchez C, Giraldo A, Alarcón JD, Delgado-Noguera M. Evaluación del «riesgo de sesgo» de los ensayos clínicos publicados en la Revista Colombiana de Anestesiología. Rev Colomb Anestesiol. 2012;40(3):183-91. DOI: https://doi.org/10.1016/j.rca.2012.05.013
10. Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies. Ann Intern Med. 2019;170(1):51-8. DOI: 10.7326/M18-1376.
11. Stang A. Critical evaluation of the Newcastle-Ottawa scale for the assessment of the quality of nonrandomized studies in meta-analyses. Eur J Epidemiol. 2010;25(9):603-5. DOI: 10.1007/s10654-010-9491-z
12. Shea BJ, Reeves BC, Wells G, Thuku M, Hamel C, Moran J, et al. AMSTAR 2: a critical appraisal tool for systematic reviews that include randomised or non-randomised studies of healthcare interventions, or both. BMJ. 2017;358:j4008. DOI: https://doi.org/10.1136/bmj.j4008
13. Higgins JPT, Altman DG, Gøtzsche PC, Jüni P, Moher D, Oxman AD, et al. The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ. 2011;343:d5928. DOI: https://doi.org/10.1136/bmj.d5928
14. Dal-Ré R, Delgado M, Bolumar F. El registro de los estudios observacionales: es el momento de cumplir el requerimiento de la Declaración de Helsinki. Gac Sanit. 2015;29(3):228-31. DOI: http://dx.doi.org/10.1016/j.gaceta.2014.10.006
15. Aguayo-Albasini JL, Flores-Pastor B, Soria-Aledo V. Sistema GRADE: clasificación de la calidad de la evidencia y graduación de la fuerza de la recomendación. Cir Esp. 2014;92(2):82-8. DOI: 10.1016/j.ciresp
17. Tapia LI, Palomino MA, Lucero Y, Valenzuela R. Pregunta, hipótesis y objetivos de una investigación clínica. Rev Médica Clínica Las Condes. 2019;30(1):29-35. DOI: 10.1016/j.rmclc