Abstract: Pre-trained vision-language models (VLMs) and language models (LMs) have recently garnered significant attention due to their remarkable ability to represent textual concepts, opening up new ...
Scene text image super-resolution (STISR) aims to improve the visual clarity of the text in low-resolution scene images. Due to the intrinsic lack of detailed text appearance information in the ...