Pages 33-40 (January 2006)

Agreement Between Centers on the Interpretation of Exercise Echocardiography

Concordancia intercentros en la interpretación de la ecocardiografía de ejercicio

Jesús Peteiro (a), Ángel M Alonso (b), Rafael Florenciano (c), Carlos González Juanatey (d), Gonzalo de la Morena (c), Ignacio Iglesias (e), Mar Moreno (f), Miguel A Rodríguez (e)

a Unidad de Ecocardiografía, Hospital Juan Canalejo, A Coruña, Spain.

b Unidad de Ecocardiografía, Hospital Txagorritxu, Vitoria, Álava, Spain.

c Unidad de Ecocardiografía, Virgen de la Arrixaca, Murcia, Spain.

d Unidad de Ecocardiografía, Hospital Xeral, Lugo, Spain.

e Unidad de Ecocardiografía, Hospital de León, León, Spain.

f Unidad de Ecocardiografía, Hospital Gregorio Marañón, Madrid, Spain.

https://doi.org/10.1016/S1885-5857(06)60046-7

Keywords

Exercise echocardiography

Intercenter agreement

Accuracy

INTRODUCTION

One of the main limitations of stress echocardiography is its variability. Hoffmann's first study, although carried out with fundamental imaging and without uniform reading criteria, found only low agreement in the interpretation of dobutamine stress echocardiography.1 This improved in a subsequent study by the same author when using harmonic imaging and uniform reading criteria.2

Surprisingly, although exercise echocardiography (EE) is the oldest,3 most sensitive, and safest4,5 method of administering stress, as well as the most widely used,6 no study has investigated intercenter agreement with this technique. Thus, the purpose of this study was to evaluate: a) intercenter agreement on EE, and b) the sensitivity, specificity, and diagnostic accuracy of the technique under blinded conditions.

PATIENTS AND METHODS

Six centers participated in the study, each having broad experience with stress echocardiography and, in particular, with EE (having carried out between 1000 and 7000 EE). Each of the 6 centers sent 25 study results. Of these, 15 were positive or negative EE studies on consecutive patients undergoing coronary angiography within 3 months of EE; and the other 10 studies were on non-diabetic patients, also consecutive, asymptomatic or with non-coronary chest pain and with a <10% pretest probability of coronary artery disease (CAD) according to sex, age, and risk factors.7 Thus, each center evaluated 150 cases: 125 under blinded conditions (data from other centers) and 25 from their own center with knowledge of the clinical data.

State-of-the-art equipment was used with second harmonic imaging and stress digitalization packs (Sonos-5500, Philips, used by 4 centers and Vivid-5, GE, used by 2 centers). Each study was sent to the coordinating center on optical disk, which then re-distributed them to the other centers either in the same format or on video tape, depending on each center's capabilities. Apical 4- and 2-chamber and parasternal long-axis and short-axis views were compared, at rest and under stress in quad-screen format.

Reading Criteria

Uniform reading criteria8 were used. A positive EE was defined when there was at least 1 abnormal segment at rest or under stress, or tardokinesia in the event that there were no alterations in conduction, and negative EE when no segment was abnormal at rest or under stress, or there was hypokinesia isolated from the posterobasal and/or septobasal segment, unless accompanied by dyssynergy in one adjacent segment.

Each center categorized every positive result as necrosis (regional alteration in wall motion that persisted or improved with stress), ischemia (alteration in wall motion with stress), ischemia plus necrosis in the same territory (alteration in baseline wall motion that worsened in the same territory with stress), or ischemia at a distance (alteration in wall motion in 1 or more territories at baseline, with the appearance of new alterations in wall motion in a different territory with stress). Wall motion score index at rest and under stress was calculated in each reading by dividing the left ventricle into 16 segments.9 The territories affected in each study were determined according to whether they were dependent on the left anterior descending coronary artery (LAD), circumflex artery (Cx), right coronary artery (RC), or a combination of them.
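The wall motion score index described above can be illustrated with a minimal sketch. The text only states that the left ventricle was divided into 16 segments; the 1-to-4 scoring scale and the handling of non-visualized segments used here are assumptions based on common practice, not details given by the authors, and the function name is hypothetical.

```python
from typing import Optional, Sequence

# Assumed scoring scale (not stated in the text): 1 = normal, 2 = hypokinetic,
# 3 = akinetic, 4 = dyskinetic. None marks a segment that could not be scored.
def wall_motion_score_index(scores: Sequence[Optional[int]]) -> float:
    """Mean score over the scored segments of a 16-segment model."""
    if len(scores) != 16:
        raise ValueError("expected one score per segment (16 segments)")
    scored = [s for s in scores if s is not None]
    if not scored:
        raise ValueError("no interpretable segments")
    return sum(scored) / len(scored)

# Hypothetical reading: normal at rest, three abnormal segments with stress.
rest = [1] * 16
stress = [1] * 13 + [2, 2, 3]
print(wall_motion_score_index(rest))    # 1.0
print(wall_motion_score_index(stress))  # 1.25
```

An index of 1.0 corresponds to normal motion in every scored segment; higher values reflect more extensive or more severe abnormalities.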

In addition, each center objectively and subjectively assessed the quality of each study. A segment quality score was used for the objective assessment where a score of 3 was assigned to each segment with good visibility (thickness and displacement), 2 to those with fair visibility, 1 to those with poor visibility, and 0 to the non-visible. For the subjective assessment, each study was qualified as good, fair, poor, or non-interpretable.

Statistical Analysis

The SPSS 12.0 statistical package was used. Continuous variables are presented as mean±SD. Discrete variables are presented as percentages. Comparisons between patients with and without CAD were done via χ² test for discrete variables and Student t test for continuous variables. Agreement between 2 centers was estimated by the percentage agreement (negative or positive EE) found after analyzing studies from other centers without including the cases of the centers themselves (150-50 cases=100 cases). Agreement was expressed as percentage agreement and as the kappa coefficient (κ), the proportion of agreement beyond that expected by chance. A κ coefficient between 0 and 0.20 was considered very low; between 0.21 and 0.40, low; between 0.41 and 0.60, moderate; between 0.61 and 0.80, good; and between 0.81 and 1.0, excellent.10 The sensitivity, specificity, and diagnostic accuracy for each center were calculated by the centers assessing their own cases, as well as by blinded assessment of the other centers' cases. Sensitivity was defined as the percentage of cases with positive EE among patients with significant coronary stenosis in at least 1 vessel. Specificity was defined as the percentage of cases with negative EE among patients without angiographically demonstrated coronary lesions or with a low pretest probability. Diagnostic accuracy was defined as the percentage of successes (cases with positive EE and CAD, plus cases with negative EE and absence of CAD) from total patients.
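As an illustration of the agreement statistic, the following sketch computes Cohen's κ for 2 centers reading the same cases as positive or negative and maps the result onto the interpretation bands given above. The authors used SPSS; this is not their code, the function names are hypothetical, and the 100 readings are invented for the example.

```python
from typing import Sequence

def cohen_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two raters giving positive/negative readings."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("both centers must read the same non-empty case list")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n           # positive rates per center
    expected = pa * pb + (1 - pa) * (1 - pb)  # agreement expected by chance
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 100%")
    return (observed - expected) / (1 - expected)

def interpret_kappa(kappa: float) -> str:
    """Bands from the text; values below 0 are grouped with 'very low' here."""
    if kappa <= 0.20:
        return "very low"
    if kappa <= 0.40:
        return "low"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"

# Invented data: 35 both positive, 40 both negative, 25 discordant readings.
center_a = [True] * 35 + [False] * 40 + [True] * 15 + [False] * 10
center_b = [True] * 35 + [False] * 40 + [False] * 15 + [True] * 10
k = cohen_kappa(center_a, center_b)
print(round(k, 2), interpret_kappa(k))  # 0.5 moderate
```

In this invented example the 2 centers agree on 75% of cases, but half of that agreement would be expected by chance, which is why κ is only 0.5.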

RESULTS

One hundred and forty-nine studies were available for analysis (1 study was excluded due to poor images). Contrast agents were used for left ventricular opacification in 9 studies (6%) and the stress study was done with peak stress imaging in 124 cases (83%).

Baseline Clinical Characteristics and Response to Stress

Significant CAD was found in 58 patients (39%) as defined by stenosis ≥50% in ≥1 coronary artery, main branch, or coronary artery bypass graft, whereas 91 patients (61%) had angiographically demonstrated non-significant CAD (n=37), or low pretest probability according to the previous definition (n=54). There was 1-vessel disease in 24 patients with CAD, 2-vessel disease in 18, and 3-vessel disease in 16. The LAD was stenosed in 40 patients, the RC in 39 and the Cx in 29. Table 1 shows baseline clinical characteristics, medication, and baseline electrocardiogram (ECG) data in patients with and without CAD. Table 2 shows data on response to stress in patients with and without CAD.

Image Quality

The subjective assessment of the quality of the studies differed significantly between the different centers. Some centers described a high percentage of studies as good (≥80% of studies), whereas others only considered less than half the cases as good and between 0 and 8% as non-interpretable (Figure 1). The same differences were found when the different centers calculated the quality of the segment wall motion score (Figure 2). In general, the centers that qualified the others as worse tended to have better quality images according to the other centers.

Figure 1. Percentage of studies qualified as good, fair, poor, and non-interpretable according to the different centers.

Figure 2. Scoring of quality of studies from other centers according to the referring center (light columns) and scoring of quality of studies from each center according to the other centers (dark columns).

Agreement

Four or more of the 5 centers that assessed each case under blinded conditions agreed on a positive diagnosis of CAD in 51 patients and on a negative diagnosis in 65 patients, which means that there was agreement on a total of 116 of the 149 patients (78%). There was agreement regarding a positive or negative diagnosis of CAD in 4.1±0.9 centers out of the 5 centers. There was a mean κ coefficient of 0.48 between the different centers, with mean intercenter κ coefficients ranging from 0.45 to 0.52. The percentage agreement and the κ coefficients in different scenarios are shown in Table 3. The percentage agreement and the κ coefficient differed according to the diagnosis of regional contractility anomalies by the referring center. The percentage agreement was greater when the referring center had detected baseline anomalies in regional contractility in a given territory, contractility anomalies at rest and/or with stress in the LAD territory, or when a worse wall motion score index with stress was reported (Table 4).
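The rule of 4 or more concordant centers out of 5 can be sketched as follows. The function name and its `threshold` parameter are hypothetical; only the counts in the final lines (51, 65, and 149) come from the text.

```python
from typing import Optional, Sequence

def majority_call(readings: Sequence[bool], threshold: int = 4) -> Optional[str]:
    """Return the agreed diagnosis, or None if fewer than `threshold` agree."""
    positives = sum(readings)
    negatives = len(readings) - positives
    if positives >= threshold:
        return "positive"
    if negatives >= threshold:
        return "negative"
    return None

# Hypothetical blinded readings of one case by 5 centers.
print(majority_call([True, True, True, True, False]))   # positive
print(majority_call([True, True, True, False, False]))  # None

# Check of the reported figure: 51 positive + 65 negative out of 149 patients.
agreed = 51 + 65
print(agreed, round(100 * agreed / 149))  # 116 78
```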

Sensitivity, Specificity, and Diagnostic Accuracy

The percentage of positive and negative readings, as well as the sensitivity, specificity, and diagnostic accuracy differed between the different centers when assessed under blinded conditions (Figure 3). There were 2 centers with high sensitivity but low specificity and 1 where the opposite occurred.

Figure 3. Sensitivity, specificity and diagnostic accuracy of each center that assessed, under blinded conditions, the cases referred by the other centers.

The mean sensitivity, specificity, and diagnostic accuracy of the 6 centers regarding stenosis ≥50% in at least 1 vessel (according to visual estimation) were 68%, 66%, and 67%, respectively. The mean sensitivity and specificity of the different centers were similar in tests that were above or below the submaximal level (68% vs 64% and 66% vs 65%, respectively). These data contrast with the mean sensitivity, specificity, and diagnostic accuracy of each center when they assessed their own cases (Figure 4). If we consider that the positive cases of coronary artery disease were those with stenosis ≥50% in at least 1 vessel or with a history of acute myocardial infarction (AMI) and baseline dyssynergy according to the diagnosis of the referring center, then the sensitivity, specificity, and diagnostic accuracy in the blinded reading were similar to those obtained with stenosis ≥50% as the only criterion: 69% (intercenter range, 53%-82%), 70% (range, 49%-89%), and 69% (range, 64%-78%). The sensitivity, specificity, and diagnostic accuracy according to the decisions of the majority (4 or more centers when 5 centers were assessing data; 3 or more when 4 centers were assessing data) were similar regarding ≥50% stenosis in one vessel and for a criterion of ≥50% stenosis or a history of AMI and baseline dyssynergy depending on the referring center (72% vs 73%; 74% vs 80%; and 73% vs 77%, respectively; P=NS).

Figure 4. Sensitivity, specificity and diagnostic accuracy of exercise echocardiography in detecting coronary lesions with ≥50% stenosis assessed under blinded conditions or with knowledge of clinical data and response to stress. The means of the centers and intercenter ranges are shown.
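The definitions of sensitivity, specificity, and diagnostic accuracy given under Statistical Analysis can be sketched as below. The function name is hypothetical and the 10 cases are invented for the example; they are not study data.

```python
from typing import Dict, Sequence

def diagnostic_metrics(ee_positive: Sequence[bool],
                       cad_present: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy of EE readings against CAD status."""
    if len(ee_positive) != len(cad_present) or len(ee_positive) == 0:
        raise ValueError("need one reading per patient")
    pairs = list(zip(ee_positive, cad_present))
    tp = sum(ee and cad for ee, cad in pairs)          # positive EE, CAD
    tn = sum(not ee and not cad for ee, cad in pairs)  # negative EE, no CAD
    fp = sum(ee and not cad for ee, cad in pairs)      # positive EE, no CAD
    fn = sum(not ee and cad for ee, cad in pairs)      # negative EE, CAD
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need patients both with and without CAD")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

# Invented example: 4 patients with CAD, 6 without.
ee = [True, True, True, False, False, False, False, False, True, True]
cad = [True, True, True, True, False, False, False, False, False, False]
m = diagnostic_metrics(ee, cad)
print({name: round(value, 2) for name, value in m.items()})
# {'sensitivity': 0.75, 'specificity': 0.67, 'accuracy': 0.7}
```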

False Positive Readings

Out of 403 readings corresponding to 83 patients without coronary stenosis, previous AMI, or baseline dyssynergy according to the referring center, there were 124 false (+) readings corresponding to 31% of the assessments without CAD, with a wide intercenter range (11%-51%). These false (+) readings were mainly due to ascribing contractile alterations to the RC territory (36% of the readings) or to the LAD (35%), and less often to the Cx territory (10%) or to several territories (19%). The segment wall motion score index (WMSI) measured by the assessing centers in these cases was 1.1±0.2 at rest and 1.3±0.2 with stress.

False Negative Readings

There were 319 readings corresponding to 66 patients considered to have CAD. Of those who had angiographically demonstrated coronary stenosis, some had a history of AMI. Those who did not undergo coronary angiography, or in whom it was negative, all had a medical history of AMI and dyssynergy according to the referring center. There were 102 false (-) readings, which corresponded to 32% of the assessments with CAD (intercenter range, 18%-47%). In most of these cases there was only 1-vessel disease (45%; LAD disease in 23 of them) or 2-vessel disease (34%), and on fewer occasions 3-vessel disease (9%) or disease in no vessels (13%). The referring center reported dyssynergy in 32 of these 66 patients (48%), which was severe (WMI, 1.50) in 12 of them (18%).

DISCUSSION

The main interest of this study lies in its being the first in which intercenter agreement on exercise echocardiography has been assessed. The main findings were as follows: a) the intercenter agreement on exercise echocardiography was moderate, and b) the sensitivity, specificity, and diagnostic accuracy of the technique when carried out under blinded conditions were lower than those commonly reported when baseline characteristics and patient response to stress are known.

Intercenter Agreement on Exercise Echocardiography

Although Hoffmann et al studied intercenter agreement on dobutamine stress echocardiography,1,2 there are no similar studies on exercise echocardiography, despite it being more frequently used, more sensitive, and safer.4,6 Low agreement was observed (κ=0.37) in Hoffmann's first study,1 which was carried out with fundamental imaging and without uniform reading criteria, whereas in the second study, carried out with harmonic imaging and uniform reading criteria, agreement was moderate (κ=0.55).2 The improvement in agreement seemed to be due both to the use of harmonic imaging and to the standardization of the reading criteria, since the degree of agreement on the same patients studied with fundamental imaging was greater than in Hoffmann's first study. We used the same reading criteria as in Hoffmann's second study,2 which, in general, did not involve any change in the normal clinical practice followed in each center. It could be expected that the degree of agreement on exercise echocardiography would be lower than on dobutamine stress echocardiography, since image quality should be better with the latter technique. However, by means of uniform reading criteria and harmonic imaging the agreement was moderate, with a mean κ coefficient of 0.48, which is better than that of Hoffmann's first study and similar to that of the same author's second study. The percentage agreement was greater in 3-vessel disease, in left anterior descending coronary artery disease, when there were baseline alterations in regional contractility, and when the referring center reported dyssynergy in the LAD territory or serious dyssynergy. The higher percentage agreement in these circumstances has clinical diagnostic and prognostic relevance, since patients with these characteristics have a worse prognosis.11-13

Intercenter Agreement on Other Diagnostic Techniques

Concern over the degree of agreement is not exclusive to stress echocardiography. Different degrees of variability in interpretation have been observed with other techniques. Thus, very low levels of agreement have been reported regarding the interpretation of ST-segment elevation (κ=0.05) or ST-segment depression (κ=0.38) between 2 centers in patients with acute coronary syndrome.14 Studies on myocardial perfusion with nuclear medicine procedures also present difficulties in interpretation, since these techniques are subjective and, as in exercise echocardiography, the experience of the observer and image quality can influence the interpretation. A moderate-high agreement has been reported with thallium imaging, with κ coefficients ranging between 0.56 and 0.74 in 2 studies.15,16 However, in a multicenter study with 25 participating hospitals, agreement between different centers without uniform reading criteria was low (κ=0.27).17 In a study by Candell-Riera et al,18 good agreement was found with exercise technetium-99m tetrofosmin myocardial perfusion single-photon emission computed tomography, with κ coefficients between 0.62 and 0.70 depending on whether topographical images or polar mapping were evaluated.18 This study also found that the sensitivity of the report under blinded conditions was significantly lower than that reported when the clinical data of the patient were known.

Sensitivity, Specificity, and Diagnostic Accuracy

Although we present mean sensitivity and specificity scores for the different centers, the variability among them regarding interpretation under blinded conditions is of more interest. However, the sensitivity, specificity, and diagnostic accuracy of the technique carried out under blinded conditions were lower for each center than when they assessed their own cases and the baseline characteristics of the patients and their response to stress were known. This finding is not surprising, but it gives us an idea of the limitations of the technique when the pretest probability, hemodynamic, clinical, and ECG response to stress are not known. It is clear that EE can and should be used in clinical practice, but not under blinded conditions.

Limitations

The reading format of the studies was the same for all the centers (apical 4- and 2-chamber and parasternal long-axis and short-axis views at baseline and with stress), although the quality differed since, depending on whether the centers could read optical disks, the study was recorded on video or was sent via optical disk. However, the percentage agreement and the κ coefficients were similar for studies of optimal and suboptimal quality. The complete study recorded on video was not sent out as was done in the study by Hoffmann et al.1 This fact could lead to overestimating agreement, since the operator tends to acquire and store the images he/she considers more representative taking into account other test characteristics different from those of the image. Twenty-five percent of the patients were receiving treatment with beta-blockers and up to 40% of the tests were lower than submaximal. This fact could have led to underestimating sensitivity, although we have not found higher sensitivity in the tests that were maximal in comparison to those that were submaximal.

See editorial on pages 9-11

Study financed by the RECAVA Cardiovascular Network.

Correspondence: Dr. J.C. Peteiro.
P.º Ronda, 5, 4.º izqda. 15011 A Coruña. Spain.
E-mail: pete@canalejo.org

Received June 9, 2005.
Accepted for publication October 18, 2005.

Bibliography

[1] Hoffmann R, Lethen H, Marwick T, Arnese M, Fioretti P, Pingitore A, et al. Analysis of interinstitutional observer agreement in interpretation of dobutamine stress echocardiograms. J Am Coll Cardiol. 1996;27:330-6.