Pérez Villegas, Luis Francisco
2024-05-20
2022-09-01
https://hdl.handle.net/20.500.14468/14662

Continuous Sign Language Recognition (CSLR), the task of predicting the meaning of the signs in sign language sentences, is one of the current challenges in translation between sign and spoken languages, and progress on it would benefit people with hearing impairment. An important limitation of this research field is the lack of annotated datasets, a limitation that could be reduced with sign segmentation approaches by automating the costly task of manually annotating the beginning and end of each sign. The goal of this paper is to study the performance of an architecture that combines features extracted with an I3D CNN with ASFormer, a transformer-based model created specifically for the action segmentation task. In our approach, ASFormer, instead of separating actions in motion sequences, separates the signs in signed speech. Several ablation studies are performed, showing that ASFormer is suitable for segmenting signs, with performance close to that of state-of-the-art models and confirming the promising benefits of attention-based approaches in this field.

en
info:eu-repo/semantics/openAccess
Sign Language Segmentation Using a Transformer-based Approach
master's thesis