RoBorregos@Home Docs

Person Description

Person description originally relied on dlib to estimate a person's approximate age, race, and gender. These estimates were often imprecise, so a different approach was integrated.

Moondream

For visual descriptions, the locally hosted VLM (vision-language model) Moondream was used. Moondream is a prompt-driven visual analysis model: given a frame with context and a specific prompt, it returns a description of the image shaped by that prompt.
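The frame-plus-prompt flow described above can be sketched as follows. This is a minimal illustration, not the team's actual node: `build_prompt`, `describe_person`, `PersonDescription`, and the `query_vlm` callable are hypothetical names, and `query_vlm` stands in for whatever call the locally hosted Moondream instance exposes. A stub is used here so the sketch runs without the model.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence

# Hypothetical signature: takes a frame (for example, a cropped image of a
# person) and a text prompt, and returns the model's text answer. A real
# implementation would wrap the locally hosted Moondream call here.
VlmQuery = Callable[[Any, str], str]


@dataclass
class PersonDescription:
    prompt: str  # the prompt that was sent to the model
    text: str    # the model's answer, with surrounding whitespace removed


def build_prompt(attributes: Sequence[str]) -> str:
    """Compose a focused prompt from the attributes we want described."""
    base = "Describe the person in this image."
    if not attributes:
        return base
    return f"{base} Focus on: {', '.join(attributes)}."


def describe_person(
    frame: Any,
    query_vlm: VlmQuery,
    attributes: Optional[Sequence[str]] = None,
) -> PersonDescription:
    """Send one frame and a prompt to the VLM and wrap the answer."""
    prompt = build_prompt(attributes or [])
    answer = query_vlm(frame, prompt).strip()
    return PersonDescription(prompt=prompt, text=answer)


def fake_vlm(frame: Any, prompt: str) -> str:
    """Stub used only for illustration; it ignores the frame entirely."""
    return " A person wearing a red shirt and glasses. "


if __name__ == "__main__":
    result = describe_person(object(), fake_vlm, ["clothing", "accessories"])
    print(result.prompt)
    print(result.text)
```

Keeping the model call behind a plain callable means the prompt logic can be tested without loading the model, and the backend can be swapped without touching the callers.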