HCC prediction models in chronic hepatitis B patients receiving entecavir or tenofovir: a systematic review and meta-analysis

We identified 23 publications for the systematic review and meta-analysis after screening 4374 studies from four databases; together they included 153,445 CHB patients and 5133 HCC cases (Fig. 1). Three model derivation studies, covering four models (mREACH-B, PAGE-B, CAGE-B, and SAGE-B), did not perform external validation in the original research. The remaining publications were either external validation studies or studies that both derived and externally validated one of six models (HCC-RESCUE, CAMD, mPAGE-B, AASL-HCC, REAL-B, and aMAP). PAGE-B and mPAGE-B were the most frequently externally validated models (19 and 14 studies, respectively), whereas REAL-B and mREACH-B were each validated in only one study. The other models were validated as follows: CAMD (n = 6), HCC-RESCUE (n = 5), AASL-HCC (n = 4), CAGE-B and SAGE-B (n = 3), and aMAP (n = 2).

Fig. 1 Flow diagram for the systematic review and meta-analysis

Characteristics of the included studies

Participants in most studies were recruited retrospectively from hospital medical records, whereas Hsu et al. and Yip et al. used an insurance database and the Clinical Data Analysis and Reporting System to perform their studies [11, 22]. Unlike the other studies, Gui et al. compared model performance in cirrhotic patients [23], and Kim et al. studied veterans in the United States [24]. Most models were developed in Asian populations, except for PAGE-B, CAGE-B, and SAGE-B, which were derived from Caucasian populations. Most models were developed in patients treated with entecavir or tenofovir. The exceptions were REAL-B, which was developed in individuals whose treatment regimens included other oral antiviral medicines, and aMAP, which was developed in a mixed population in which 78% of patients were treated. The number of parameters in the models ranged from three to seven. Age and sex were included in nearly all models; other parameters included albumin, total bilirubin, platelets, cirrhosis, liver stiffness measurement, ALT, HBeAg status, diabetes, alcohol abuse, and alpha-fetoprotein (Table 1). The REAL-B and aMAP derivation cohorts were not included in the meta-analysis because their participants did not meet the inclusion criteria.

Table 1 Summary of hepatitis B virus-hepatocellular carcinoma prediction models in the derivation studies

Risk of bias and applicability assessment

The details of the risk of bias and applicability assessment are shown in Table S12 and Figure S12. According to PROBAST, the predictor and outcome domains had a low risk of bias, but the participant and analysis domains had a high risk of bias in 17.4% and 52.1% of studies, respectively. In the analysis domain, model calibration was not performed in eight studies (34.8%), and four studies (17.4%) had a small number of HCC cases. Apart from the 17.4% of studies with a high concern in the participant domain, concern regarding applicability was low for most models.

Fig. 2 The discrimination (A), calibration (B) performance, and negative predictive values in the low-risk group (C) of HCC prediction models in the meta-analysis. ᵃHCC events were not reported by Hsu et al. [11], whose study included 17,984 participants. AUROC, area under the receiver operating characteristic curve; CI, confidence interval; O:E ratio, observed events versus expected events ratio; NPV, negative predictive value; HCC, hepatocellular carcinoma; mREACH-B, Modified Risk Estimation for Hepatocellular Carcinoma in Chronic Hepatitis B; PAGE-B, Platelet, Age, Gender and HBV; mPAGE-B, modified Platelet, Age, Gender and HBV; HCC-RESCUE, HCC-Risk Estimating Score in CHB patients Under Entecavir; CAMD, the Cirrhosis, Age, Male sex, and Diabetes Mellitus Score; AASL-HCC, Age, Albumin, Sex, Liver Cirrhosis-HCC scoring system; aMAP, the Age-Male-ALBI-Platelets Score; CAGE-B, Cirrhosis and Age Score; SAGE-B, Stiffness and Age Score; REAL-B, Real-world Effectiveness from the Asia Pacific Rim Liver Consortium for HBV

Meta-analysis

The characteristics of the studies included in the meta-analysis are shown in Table 2. The pooled 3-, 5-, and 10-year AUROC varied from 0.72 to 0.84 (aMAP: 0.72, 95% CI 0.70–0.75; REAL-B: 0.84, 95% CI 0.78–0.90), 0.74 to 0.83 (mREACH-B: 0.74, 95% CI 0.71–0.76; AASL-HCC: 0.83, 95% CI 0.79–0.86), and 0.76 to 0.86 (SAGE-B: 0.76, 95% CI 0.70–0.83; mPAGE-B: 0.86, 95% CI 0.83–0.89), respectively (Fig. 2, Figure S3, Table S3). When predicting 3-year HCC incidence, the REAL-B, AASL-HCC, and HCC-RESCUE models had better discrimination, with an AUROC > 0.80, while mREACH-B, PAGE-B, and aMAP showed an AUROC < 0.75. When predicting 5-year HCC incidence, the other models showed an AUROC > 0.75, while mREACH-B showed an AUROC < 0.75. When predicting 10-year HCC incidence, all models showed an AUROC > 0.75.
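
The pooled AUROCs above combine study-level estimates into a single value per model and time horizon. The sketch below illustrates one common way to do this, DerSimonian-Laird random-effects pooling on the untransformed AUROC scale. It is an assumption for illustration, not necessarily the method used in this meta-analysis, and the study-level inputs are hypothetical rather than taken from Table 2.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper):
    """Recover a standard error from a symmetric 95% CI."""
    return (upper - lower) / (2 * Z_95)


def pool_random_effects(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled estimate, CI lower, CI upper, tau^2).
    """
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum(w)
    # Cochran's Q measures between-study dispersion around the fixed-effect mean.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))
    df = len(estimates) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * y for wi, y in zip(w_star, estimates)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - Z_95 * se_pooled, pooled + Z_95 * se_pooled, tau2


# Hypothetical study-level AUROCs with 95% CIs (not data from this review).
aurocs = [0.78, 0.82, 0.74]
cis = [(0.74, 0.82), (0.77, 0.87), (0.69, 0.79)]
ses = [se_from_ci(lo, hi) for lo, hi in cis]
pooled, lower, upper, tau2 = pool_random_effects(aurocs, ses)
print(f"pooled AUROC {pooled:.2f} (95% CI {lower:.2f}-{upper:.2f}), tau^2 {tau2:.4f}")
```

Pooling on the raw scale yields symmetric intervals. Pooling on the logit scale is a frequently used alternative that keeps the interval within 0 and 1.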

Table 2 Characteristics of derivation and external validation cohorts included in the meta-analysis

The pooled 5- and 10-year total O:E ratios ranged from 0.25 to 1.83 (mPAGE-B: 0.25, 95% CI 0.18–0.31; CAMD: 1.83, 95% CI 1.31–2.35) and from 1.99 to 2.10 (SAGE-B: 1.99, 95% CI 0.99–2.99; CAGE-B: 2.10, 95% CI 1.02–3.17), respectively (Fig. 2, Figure S4). The pooled 3-year total O:E ratio of CAMD was 0.77 (95% CI 0.51–1.04). HCC-RESCUE, PAGE-B, and mPAGE-B overestimated HCC development, while AASL-HCC, aMAP, CAMD, CAGE-B, and SAGE-B underestimated it.
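
To make the direction of miscalibration concrete: an O:E ratio below 1 means fewer HCC cases were observed than the model expected (overestimation), and a ratio above 1 means more were observed than expected (underestimation). The sketch below computes a single-cohort O:E ratio with an approximate 95% CI. The interval uses a Poisson approximation on the log scale, SE(ln O:E) ≈ sqrt(1/O), which is an assumption for illustration and may differ from how the pooled ratios above were derived. The function names and the event counts are hypothetical.

```python
import math


def oe_ratio(observed, expected):
    """Observed:expected ratio with an approximate 95% CI.

    Assumes the observed count is Poisson-distributed and the expected
    count is fixed, so SE(ln(O:E)) is approximately sqrt(1 / observed).
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must be positive")
    ratio = observed / expected
    se_log = math.sqrt(1.0 / observed)
    lower = ratio * math.exp(-1.96 * se_log)
    upper = ratio * math.exp(1.96 * se_log)
    return ratio, lower, upper


def interpret(ratio):
    """Translate an O:E ratio into the direction of miscalibration."""
    if ratio < 1:
        return "overestimation"   # model predicted more events than occurred
    if ratio > 1:
        return "underestimation"  # model predicted fewer events than occurred
    return "perfect agreement"


# Hypothetical cohort: 30 observed HCC cases, 60 predicted by the model.
ratio, lower, upper = oe_ratio(observed=30, expected=60.0)
print(f"O:E {ratio:.2f} (95% CI {lower:.2f}-{upper:.2f}): {interpret(ratio)}")
```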

The pooled 3-, 5-, and 10-year NPVs ranged from 98.3 to 100% (aMAP: 98.3%, 95% CI 96.3–100.3%; REAL-B: 100%, 95% CI 99.5–100.5%), from 99.58 to 100% (aMAP: 99.6%, 95% CI 99.2–100.0%; AASL-HCC: 100%, 95% CI 99.5–100.5%; HCC-RESCUE: 100%, 95% CI 99.6–100.5%; REAL-B: 100%, 95% CI 99.5–100.5%), and from 99.6 to 100% (PAGE-B: 99.6%, 95% CI 98.4–100.8%; CAGE-B: 100%, 95% CI 99.4–100.7%; SAGE-B: 100%, 95% CI 99.4–100.6%), respectively (Fig. 2, Table S4). All models except aMAP had a high NPV of over 99.5%. The proportion of the population classified as low-risk ranged from 14.4 to 53.0% (CAGE-B: 14.4%, 95% CI 12.9–16.0%; aMAP: 53.0%, 95% CI 28.5–77.6%) (Table 3). More than half of the population was identified as low-risk by HCC-RESCUE and aMAP.
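
Several of the intervals above have upper bounds above 100%, which is the behaviour expected of a symmetric normal-approximation interval for a proportion close to 1. The sketch below illustrates this for a single low-risk group, assuming a Wald interval (which may not be how the authors computed theirs), and contrasts it with the Wilson score interval, which stays within 0 and 1. The counts are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def npv_wald(no_hcc, hcc):
    """NPV in the low-risk group with a Wald (normal-approximation) 95% CI.

    no_hcc: low-risk patients who did not develop HCC (true negatives)
    hcc:    low-risk patients who developed HCC (false negatives)
    """
    n = no_hcc + hcc
    npv = no_hcc / n
    se = math.sqrt(npv * (1 - npv) / n)
    return npv, npv - Z_95 * se, npv + Z_95 * se


def npv_wilson(no_hcc, hcc):
    """NPV with a Wilson score 95% CI, which cannot exceed 1."""
    n = no_hcc + hcc
    npv = no_hcc / n
    denom = 1 + Z_95 ** 2 / n
    center = (npv + Z_95 ** 2 / (2 * n)) / denom
    half = Z_95 * math.sqrt(npv * (1 - npv) / n + Z_95 ** 2 / (4 * n ** 2)) / denom
    return npv, center - half, center + half


# Hypothetical low-risk group: 298 patients without HCC, 2 with HCC.
for name, fn in (("Wald", npv_wald), ("Wilson", npv_wilson)):
    npv, lower, upper = fn(298, 2)
    print(f"{name}: NPV {npv:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```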

Table 3 The proportion of low-risk population classified by the models in meta-analysis

Subgroup analysis and meta-regression

The results of the subgroup analysis for discrimination and calibration are presented in Table S5. Only three studies compared the performance of PAGE-B, mPAGE-B, and aMAP in cirrhotic and non-cirrhotic individuals. Discrimination was generally better in non-cirrhotic than in cirrhotic patients. PAGE-B, mPAGE-B, HCC-RESCUE, CAMD, and aMAP exhibited greater AUROC in Caucasians, whereas AASL-HCC, CAGE-B, and SAGE-B had comparable discrimination in Asians and Caucasians. Several articles reported model calibration, but none reported the difference between cirrhotic and non-cirrhotic populations. Calibration in the subgroup analysis by race was the same as in the overall meta-analysis. The underestimation by CAMD appeared more pronounced in Asians than in Caucasians (O:E ratio 2.38 vs. 1.55). A meta-regression analysis of model discrimination and calibration showed that the heterogeneity could not be explained by race (Figure S5).

Publication bias and sensitivity analysis

The funnel plots for the 5-year discrimination performance of the PAGE-B and mPAGE-B models were visually symmetric (Fig. 3). Funnel plots for the other models were not analyzed because the number of included studies was small. In the sensitivity analysis, we mainly examined the 5- and 10-year discrimination and calibration, the NPV in the low-risk group, and the proportion of low-risk patients. There were too few external validation studies of REAL-B and mREACH-B for sensitivity analysis. After excluding any single study, the pooled 5- or 10-year AUROC of PAGE-B, mPAGE-B, HCC-RESCUE, CAMD, AASL-HCC, CAGE-B, SAGE-B, and aMAP did not change considerably, as shown in Figures S6 and S7. The sensitivity analysis of calibration is shown in Figures S8 and S9; the pooled 5-year O:E ratio of CAMD varied noticeably when the studies by Hsu and Kim were excluded [11, 25]. Excluding the study by Yip et al. [22] clearly changed the pooled 5-year NPV of PAGE-B and mPAGE-B (Figure S10). The proportion of low-risk patients identified by AASL-HCC, aMAP, CAMD, PAGE-B, and mPAGE-B did not change significantly in the sensitivity analysis (Figure S11).
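
The leave-one-out procedure described above can be sketched as follows: each study is dropped in turn and the remaining studies are pooled again, so that a study whose removal shifts the pooled value markedly stands out. For brevity the sketch uses fixed-effect inverse-variance pooling, which is an assumption and may differ from the pooling model used in the review. The study labels and values are hypothetical.

```python
def pool_inverse_variance(estimates, std_errors):
    """Fixed-effect inverse-variance pooled estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * y for w, y in zip(weights, estimates)) / sum(weights)


def leave_one_out(labels, estimates, std_errors):
    """Re-pool after omitting each study once.

    Returns a dict mapping the omitted study's label to the pooled
    estimate of the remaining studies.
    """
    results = {}
    for i, label in enumerate(labels):
        rest_estimates = estimates[:i] + estimates[i + 1:]
        rest_errors = std_errors[:i] + std_errors[i + 1:]
        results[label] = pool_inverse_variance(rest_estimates, rest_errors)
    return results


# Hypothetical 5-year AUROCs from three validation studies.
labels = ["study_A", "study_B", "study_C"]
aurocs = [0.80, 0.82, 0.60]
ses = [0.02, 0.02, 0.02]

overall = pool_inverse_variance(aurocs, ses)
for omitted, pooled in leave_one_out(labels, aurocs, ses).items():
    print(f"without {omitted}: pooled {pooled:.3f} (shift {pooled - overall:+.3f})")
```

In this toy input, omitting the outlying third study produces the largest shift, which is the pattern a leave-one-out plot is meant to expose.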

Fig. 3 Funnel plot with pseudo 95% confidence limits of 5-year AUROC of PAGE-B (A) and mPAGE-B (B). PAGE-B, Platelet, Age, Gender and HBV; mPAGE-B, modified Platelet, Age, Gender and HBV; AUROC, area under the receiver operating characteristic curve

Pair-wise comparison between HCC-RESCUE and other models

We further compared the pooled estimates of HCC-RESCUE and other models within the same studies. Only four studies compared the predictive performance of HCC-RESCUE with that of PAGE-B, mPAGE-B, CAMD, or AASL-HCC. As depicted in Fig. 4, the 5-year AUROCs were 0.81 (95% CI 0.77–0.86), 0.80 (95% CI 0.73–0.86), and 0.81 (95% CI 0.75–0.87) for HCC-RESCUE, PAGE-B, and mPAGE-B, respectively. Discrimination was also similar between HCC-RESCUE and CAMD (0.81 vs. 0.81) and between HCC-RESCUE and AASL-HCC (0.81 vs. 0.83). The proportion of low-risk patients identified by HCC-RESCUE was significantly higher than that identified by PAGE-B or mPAGE-B (52.4% vs. 23.3% vs. 30%, Table S6).

Fig. 4 The pair-wise comparison of 5-year AUROC between HCC-RESCUE and other models within the same investigations. (A) HCC-RESCUE, PAGE-B, and mPAGE-B; (B) HCC-RESCUE and CAMD; (C) HCC-RESCUE and AASL-HCC. AUROC, area under the receiver operating characteristic curve; CI, confidence interval; PAGE-B, Platelet, Age, Gender and HBV; mPAGE-B, modified Platelet, Age, Gender and HBV; HCC-RESCUE, HCC-Risk Estimating Score in CHB patients Under Entecavir; CAMD, the Cirrhosis, Age, Male sex, and Diabetes Mellitus Score; AASL-HCC, Age, Albumin, Sex, Liver Cirrhosis-HCC scoring system
