Abstract: Ultrasound imaging has become the preferred modality for diagnosing non-alcoholic fatty liver disease (NAFLD) because it is non-invasive. Computer-aided diagnosis (CAD) technology can help doctors avoid deviations in the detection and classification of NAFLD. We therefore propose a hybrid model that combines a pre-trained VGG16 network augmented with a self-attention mechanism and a stacking ensemble learning model. The self-attention mechanism provides multi-scale feature aggregation, and the stacking ensemble fuses three classifiers: logistic regression (LR), random forest (RF), and support vector machine (SVM). The proposed hybrid method classifies ultrasound images into four classes (normal, mild, moderate, and severe fatty liver) and reaches an accuracy of 91.34%, which is slightly better than traditional neural network algorithms. Experimental results show that, compared with the pre-trained VGG16 model, adding the self-attention mechanism improves the accuracy by 3.02%. Using the stacking ensemble learning model as the classifier further increases the accuracy to 91.34%, exceeding each individual classifier: LR (89.86%), SVM (90.34%), and RF (90.73%). The proposed hybrid method can effectively improve the efficiency and accuracy of NAFLD detection from ultrasound images.
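
The stacking step described above can be illustrated with a minimal sketch in scikit-learn. This is not the authors' implementation. The feature vectors are synthetic stand-ins for the features that the attention-augmented VGG16 would produce. The meta-learner (logistic regression), the 5-fold setting, and all hyperparameters are assumptions, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# ASSUMPTION: synthetic 32-dimensional vectors stand in for CNN features.
# The four labels 0..3 play the role of normal, mild, moderate, severe.
X, y = make_classification(
    n_samples=400,
    n_features=32,
    n_informative=12,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The three base classifiers named in the abstract (hyperparameters assumed).
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
]

# ASSUMPTION: logistic regression as the meta-learner, trained on
# out-of-fold outputs of the base learners (cv=5).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

predictions = stack.predict(X_test)
test_accuracy = stack.score(X_test, y_test)
print(f"stacked accuracy on synthetic data: {test_accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic data and has no relation to the figures reported in the abstract. The point of the sketch is the structure: the meta-learner sees cross-validated outputs of the base classifiers rather than the raw features.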