Abstract: Mixed reality (MR) devices using a near-eye display (NED) are attracting attention and high expectations. Since an MR NED is a head-mounted device, its form factor is one of ...
Abstract: In this paper, we propose a method to improve the accuracy of speech emotion recognition (SER) by using a vision transformer (ViT) to attend to the correlation of frequency (y-axis) with time ...