Can speech be discretized without supervision? Is it better to represent speech with discrete units than with continuous features in direct ST? How much benefit can useful MT techniques bring to ST?
Abstract: This paper presents an infinite-impulse response (IIR) low-pass filter that operates in a discrete-time charge-rotating manner and achieves a very sharp roll-off characteristic. The key idea ...
2026-01-26 Neural Multi-Speaker Voice Cloning for Nepali in Low-Resource Settings Aayush M. Shrestha et al. 2601.18694 null
2026-01-26 UrgentMOS: Unified Multi-Metric and Preference Learning for ...