Development of a consensus algorithm to improve interobserver agreement and accuracy in the determination of tricuspid regurgitation severity
Grant, ADM and Thavendiranathan, P and Rodriguez, LL and Kwon, D and Marwick, TH, Development of a consensus algorithm to improve interobserver agreement and accuracy in the determination of tricuspid regurgitation severity, Journal of the American Society of Echocardiography, 27, (3) pp. 277-284. ISSN 0894-7317 (2014) [Refereed Article]
Background: Multiparametric scoring of valvular regurgitation may compromise interobserver agreement, as readers weight parameters differently. The aims of this study were to quantify interobserver variability in the grading of chronic tricuspid regurgitation (TR), develop an algorithm for grading TR, and assess the effect of this algorithm on concordance and accuracy.

Methods: On the basis of current guidelines, two experts graded the severity of TR by consensus in 40 patients with a spectrum of TR severity. A subgroup of patients (n = 18) also had TR severity assessed by cardiac magnetic resonance. Sixteen cardiologists independently graded the first 20 cases as severe or nonsevere TR. After group review, a grading algorithm to differentiate severe and nonsevere TR was devised by consensus. The same observers used the algorithm to grade the second set of cases.

Results: Baseline differentiation of severe from nonsevere TR showed modest reliability and accuracy compared with an expert read (multirater κ = 0.55; overall agreement, 78%; accuracy, 81%). The consensus algorithm for severe TR was a suggestive color jet and at least one of (1) right atrial area > 18 cm² and inferior vena cava diameter > 2.5 cm; (2) vena contracta width > 0.7 cm and jet area > 10 cm²; (3) a dense, triangular TR Doppler profile; and (4) holosystolic reversal of hepatic vein flow. Application of this algorithm improved the multirater κ coefficient to 0.80, the level of agreement to 90% (P = .033), and mean reader accuracy to 92% (P = .001).

Conclusions: Only modest baseline agreement was found between readers on the distinction of severe and nonsevere TR. An objective, structured grading algorithm improved both interrater agreement and accuracy.

Copyright 2014 by the American Society of Echocardiography.
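
The consensus rule stated in the Results can be written as a short decision function. The following is a minimal Python sketch for illustration only, not the authors' implementation and not a clinical tool. The class name `TRFindings`, the function name `is_severe_tr`, and all field names are hypothetical; the thresholds are taken directly from the abstract. It assumes that every measurement is available and that the "suggestive color jet" judgment has already been made by the reader.

```python
from dataclasses import dataclass


@dataclass
class TRFindings:
    """Echo findings for one study (hypothetical field names)."""
    suggestive_color_jet: bool           # reader's qualitative judgment
    right_atrial_area_cm2: float
    ivc_diameter_cm: float               # inferior vena cava diameter
    vena_contracta_width_cm: float
    jet_area_cm2: float
    dense_triangular_doppler: bool       # dense, triangular TR Doppler profile
    holosystolic_hepatic_reversal: bool  # holosystolic hepatic vein flow reversal


def is_severe_tr(f: TRFindings) -> bool:
    """Return True when the findings meet the abstract's rule for severe TR."""
    # A suggestive color jet is required before any other criterion counts.
    if not f.suggestive_color_jet:
        return False
    supporting = (
        # (1) right atrial area > 18 cm^2 AND IVC diameter > 2.5 cm
        f.right_atrial_area_cm2 > 18 and f.ivc_diameter_cm > 2.5,
        # (2) vena contracta width > 0.7 cm AND jet area > 10 cm^2
        f.vena_contracta_width_cm > 0.7 and f.jet_area_cm2 > 10,
        # (3) dense, triangular TR Doppler profile
        f.dense_triangular_doppler,
        # (4) holosystolic reversal of hepatic vein flow
        f.holosystolic_hepatic_reversal,
    )
    # At least one supporting criterion must be present.
    return any(supporting)
```

Under this reading, criteria (1) and (2) are each satisfied only when both of their measurements exceed the threshold, so a study with a suggestive jet and an enlarged right atrium but a normal inferior vena cava diameter would be graded nonsevere unless another criterion is met.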