IN DEVELOPMENT
Multi-modal Control for Robotic Devices

The transition toward Industry 5.0 necessitates robotic systems capable of seamless, intuitive collaboration with humans, moving beyond pre-programmed tools to adaptive partners. However, current Human-Robot Interaction (HRI) frameworks often struggle to interpret natural, multimodal human cues in dynamic environments, relying on rigid control interfaces that lack contextual reasoning. This research aims to bridge the gap between high-level cognitive reasoning and low-level motor control by developing a robust, multimodal HRI framework for a 7-Degree-of-Freedom (DoF) robotic manipulator. The methodology integrates three core components: a Convolutional Neural Network (CNN)-based system (GESTID) for real-time static and dynamic gesture recognition; a Goal-Oriented Action Planning (GOAP) engine for dynamic task sequencing; and a Large Language Model (LLM) interface for interpreting natural language commands. These components are unified via the Robot Operating System (ROS) to control a Franka Emika Panda robot. The study specifically investigates the use of LLMs to perform semantic mapping of ambiguous speech instructions into deterministic robotic actions.
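The GOAP component described above sequences actions by searching for a chain whose effects satisfy a goal. A minimal sketch of the idea, using illustrative facts and action names from a beverage-preparation scenario (not the actual system's action set):

```python
from collections import deque

# Illustrative GOAP action library: each action has boolean preconditions
# and effects over a set of world facts. Names are hypothetical examples.
ACTIONS = {
    "pick_cup":  ({"cup_on_table"}, {"holding_cup"}),
    "fill_cup":  ({"holding_cup"}, {"cup_full"}),
    "serve_cup": ({"cup_full"}, {"served"}),
}

def plan(start, goal):
    """Breadth-first search over world states for a shortest action sequence."""
    start = frozenset(start)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:          # all goal facts hold
            return steps
        for name, (pre, eff) in ACTIONS.items():
            if pre <= state:       # action is applicable
                nxt = frozenset(state | eff)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None                    # goal unreachable
```

For example, `plan({"cup_on_table"}, {"served"})` returns the sequence `["pick_cup", "fill_cup", "serve_cup"]`. In the full framework, the goal itself would come from the gesture or LLM interface rather than being hard-coded.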

Experimental evaluations across complex scenarios, such as beverage preparation, demonstrated high system efficacy. The vision-based gesture recognition module achieved a classification accuracy of 99.1% with low latency, enabling real-time teleoperation. The speech-based interaction, integrated with GOAP, yielded a task success rate of 85.4% under dynamic conditions, while the LLM-driven semantic mapping correctly interpreted natural language instructions with 94.67% accuracy. These findings indicate that fusing visual perception with linguistic reasoning significantly enhances robotic adaptability. This thesis contributes a validated, scalable framework for multimodal HRI, offering tangible implications for industrial automation, healthcare, and assistive technologies where intuitive, hands-free control is paramount.

Dr Almas Baim

Lead Software Architect, Co-Founder

Sajjad Hussain

Multi-modal Robotics Control

Ola Ezekiel

Emotion Recognition and Analysis

Awes Mubarak

Determinism in Large Language Models

Hansen Han

Text-conditioned Human Motion Generation

Nacho Cabrera Martin

Sign Language Recognition for Mobile Devices

Simi Ibraheem

Generative AI for Context-Aware NPC Dialogue

TBC (November 2026)

TBC

Study for your PhD

We welcome applications related to AI and/or Robotics. You can find out more at UoB Computing PhD.

Work with our team

We are always looking to create new networks and support academic and industrial collaborations.

Contact Us

University of Brighton

Advanced Engineering Building

Brighton, United Kingdom, BN2 4AT

Please contact Almas for any queries related to the AI Robotics Lab.