Abstract: Image generation has progressed substantially, with a proliferation of diverse generative models and their outputs. However, methods that can simultaneously and effectively evaluate both the intrinsic quality of generated images and the alignment between image features and textual prompts are still lacking. To address this gap, we propose a novel Framework for evaluating Visual elegance and Sentiment resonance (FVS). FVS incorporates a new image aesthetic assessment model, trained specifically to assess the visual attractiveness of generated images. It also evaluates the sentiment and aesthetic consistency between the textual prompt and the generated image. Experimental results show that the evaluations from our framework align more closely with human preferences. Moreover, we apply our framework to filter generated images and construct a higher-quality training set. This curated dataset is then used to adapt the generative model, resulting in enhanced generation quality.