
The making of an AI news anchor-and its implications.

Bohacek, Matyas; Farid, Hany.

Proc Natl Acad Sci U S A ; 121(1): e2315678121, 2024 Jan 02.

Article En | MEDLINE | ID: mdl-38150500

MuTr: Multi-Stage Transformer for Hand Pose Estimation from Full-Scene Depth Image.

Kanis, Jakub; Gruber, Ivan; Krnoul, Zdenek; Bohácek, Matyás; Straka, Jakub; Hrúz, Marek.

Sensors (Basel) ; 23(12), 2023 Jun 12.

Article En | MEDLINE | ID: mdl-37420676

This work presents DePOTR, a novel transformer-based method for hand pose estimation. We test DePOTR on four benchmark datasets, where it outperforms other transformer-based methods while achieving results on par with other state-of-the-art methods. To further demonstrate the strength of DePOTR, we propose MuTr, a novel multi-stage approach that estimates hand pose from a full-scene depth image. MuTr removes the need for two different models in the hand pose estimation pipeline (one for hand localization and one for pose estimation) while maintaining promising results. To the best of our knowledge, this is the first successful attempt to use the same model architecture in both the standard and the full-scene image setup while achieving competitive results in each. On the NYU dataset, DePOTR and MuTr reach precision equal to 7.85 mm and 8.71 mm, respectively.

Hand, Upper Extremity, Hand/diagnostic imaging, Benchmarking, Electric Power Supplies, Knowledge
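The abstract reports results in millimetres but does not define the metric. As a minimal sketch, assuming the figures refer to the mean Euclidean distance between predicted and ground-truth 3D joint positions (a common choice in hand pose benchmarks, not confirmed by this record), the computation could look as follows. The function name and the data layout are hypothetical.

```python
import math

def mean_joint_error_mm(predicted, ground_truth):
    """Mean Euclidean distance over all joints of all frames.

    Assumed layout (hypothetical): each argument is a list of frames,
    each frame a list of (x, y, z) joint coordinates in millimetres.
    """
    total = 0.0
    count = 0
    for pred_frame, true_frame in zip(predicted, ground_truth):
        for pred_joint, true_joint in zip(pred_frame, true_frame):
            total += math.dist(pred_joint, true_joint)
            count += 1
    return total / count

# Toy example with one frame and two joints (made-up coordinates).
pred = [[(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]]
truth = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
print(mean_joint_error_mm(pred, truth))
```

Lower values mean the predicted joints lie closer to the annotated ones.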

Protecting world leaders against deep fakes using facial, gestural, and vocal mannerisms.

Bohácek, Matyás; Farid, Hany.

Proc Natl Acad Sci U S A ; 119(48): e2216035119, 2022 Nov 29.

Article En | MEDLINE | ID: mdl-36417442

Since their emergence a few years ago, artificial intelligence (AI)-synthesized media, so-called deep fakes, have dramatically increased in quality, sophistication, and ease of generation. Deep fakes have been weaponized for use in nonconsensual pornography, large-scale fraud, and disinformation campaigns. Of particular concern is how deep fakes will be weaponized against world leaders during election cycles or times of armed conflict. We describe an identity-based approach for protecting world leaders from deep-fake imposters. Trained on several hours of authentic video, this approach captures distinct facial, gestural, and vocal mannerisms that we show can distinguish a world leader from an impersonator or deep-fake imposter.

Artificial Intelligence, Deception, Gestures

One Model is Not Enough: Ensembles for Isolated Sign Language Recognition.

Hrúz, Marek; Gruber, Ivan; Kanis, Jakub; Bohácek, Matyás; Hlavác, Miroslav; Krnoul, Zdenek.

Sensors (Basel) ; 22(13), 2022 Jul 04.

Article En | MEDLINE | ID: mdl-35808537

In this paper, we dive into sign language recognition, focusing on the recognition of isolated signs. The task is defined as a classification problem, where a sequence of frames (i.e., images) is recognized as one of the given sign language glosses. We analyze two appearance-based approaches, I3D and TimeSformer, and one pose-based approach, SPOTER. The appearance-based approaches are trained on a few different data modalities, whereas the performance of SPOTER is evaluated on different types of preprocessing. All the methods are tested on two publicly available datasets: AUTSL and WLASL300. We experiment with ensemble techniques to achieve new state-of-the-art results of 73.84% accuracy on the WLASL300 dataset by using the CMA-ES optimization method to find the best ensemble weight parameters. Furthermore, we present an ensembling technique based on the Transformer model, which we call Neural Ensembler.

Algorithms, Sign Language, Humans
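The weighted-ensemble idea in the abstract above can be illustrated with a short sketch. It assumes each model outputs per-class probabilities and that the ensemble takes a weighted average of them. The weight search shown here is a plain random search, used only as a simple stand-in; it is not CMA-ES, and it is not the authors' implementation. All function names and the toy data are hypothetical.

```python
import random

def ensemble_predict(probs, weights):
    """Weighted average of class probabilities, then argmax per sample.

    probs[m][s][c] is the probability model m assigns to class c for sample s.
    """
    total = sum(weights)
    norm = [w / total for w in weights]  # normalize so the weights sum to 1
    n_samples = len(probs[0])
    n_classes = len(probs[0][0])
    preds = []
    for s in range(n_samples):
        scores = [
            sum(norm[m] * probs[m][s][c] for m in range(len(probs)))
            for c in range(n_classes)
        ]
        preds.append(max(range(n_classes), key=scores.__getitem__))
    return preds

def accuracy(probs, weights, labels):
    preds = ensemble_predict(probs, weights)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def random_search_weights(probs, labels, n_trials=200, seed=0):
    """Stand-in optimizer: sample random weights, keep the best on held-out data."""
    rng = random.Random(seed)
    n_models = len(probs)
    best_w = [1.0 / n_models] * n_models  # start from uniform weights
    best_acc = accuracy(probs, best_w, labels)
    for _ in range(n_trials):
        raw = [rng.random() + 1e-9 for _ in range(n_models)]
        acc = accuracy(probs, raw, labels)
        if acc > best_acc:
            best_w, best_acc = [r / sum(raw) for r in raw], acc
    return best_w, best_acc

# Toy data: two models, two samples, two classes (made-up numbers).
probs = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.2, 0.8], [0.7, 0.3]],
]
labels = [0, 1]
weights, acc = random_search_weights(probs, labels, n_trials=50)
print(weights, acc)
```

In practice the weights would be tuned on a validation split and the reported accuracy measured on a separate test split, so that the search does not overfit the evaluation data.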