Hello Tomer,
We are Reema Nasr and Basel Haddad, and we would like your approval of one of these two choices:
1. Dual-Branch Graph Transformer Network for 3D Human Mesh Reconstruction from Video
Paper Link: 2412.01179 Year of Publication: 2024
Abstract: This paper introduces the Dual-Branch Graph Transformer Network (DBGTrans), a novel framework designed to reconstruct 3D human meshes from video sequences. Unlike traditional methods, DBGTrans leverages a dual-branch architecture to effectively extract both spatial and temporal features. The approach integrates graph neural networks with transformers to enhance the accuracy of mesh reconstruction by modeling intricate human body structures over time. Experimental results demonstrate significant improvements in motion fidelity and reconstruction precision across several benchmark datasets, making it a promising solution for applications in animation, gaming, and virtual reality.
2. AVSegFormer: Audio-Visual Segmentation with Transformer
Paper Link: 2307.01146 Year of Publication: 2023
Abstract: The paper presents AVSegFormer, a state-of-the-art transformer-based framework for audio-visual segmentation. By effectively fusing auditory and visual signals, the model excels in segmenting objects and scenes in complex environments. AVSegFormer employs a multi-modal transformer that aligns and processes audio-visual data to capture complementary features, surpassing previous methods in segmentation accuracy. The architecture is lightweight and scalable, making it suitable for real-world applications such as autonomous driving, robotics, and interactive media. Extensive evaluations validate its effectiveness on multiple challenging datasets.