Perception.
Synthesized.
A completely autonomous visual navigation matrix. Powered by bespoke YOLOv11seg models, robust Vision-Language Models (VLMs), and real-time haptic translation.
Information flows seamlessly from photon to haptic response in under 20 milliseconds.
Proprietary segmentation model trained on high-contrast urban navigation datasets. Extracts pixel-perfect object boundaries instantly with TensorRT acceleration.
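To give a concrete feel for this stage, here is a minimal sketch of per-frame segmentation inference using the public Ultralytics YOLO API; the weight file, confidence threshold, and webcam source are placeholders rather than the bespoke model described above.

# Minimal sketch: per-frame YOLO11 segmentation inference.
# "yolo11n-seg.pt" is a public placeholder, not the proprietary navigation model.
import cv2
from ultralytics import YOLO

model = YOLO("yolo11n-seg.pt")        # swap in the custom-trained weights
# model.export(format="engine")       # optional TensorRT export for acceleration

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    results = model(frame, conf=0.4)  # single-frame inference
    for r in results:
        for box in r.boxes:
            label = model.names[int(box.cls)]
            print(label, box.xyxy[0].tolist())  # class and pixel bounds
        if r.masks is not None:
            print(len(r.masks), "segmentation masks returned")
cap.release()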
Multi-modal AI contextualizes the environment. It doesn't just see a "chair"; it understands "an obstacle blocking the immediate path."
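A minimal sketch of this contextualization step, assuming the VLM is reached through OpenRouter's OpenAI-compatible chat endpoint; the model name, prompt wording, and detection format are illustrative assumptions.

# Minimal sketch: turning raw detections into navigational context via OpenRouter.
import os
import requests

def contextualize(detections):
    # detections: e.g. [{"label": "chair", "bearing": "ahead", "distance_m": 1.2}]
    prompt = (
        "You are a navigation assistant for a blind pedestrian. "
        f"Detected objects: {detections}. "
        "In one short sentence, describe what matters for the user's path."
    )
    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": "openai/gpt-4o-mini",  # any chat-capable model on OpenRouter
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=10,
    )
    return resp.json()["choices"][0]["message"]["content"]

# Expected flavour of output: "An obstacle is blocking the immediate path ahead."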
Translates 3D spatial data into high-fidelity haptic waveforms. Feel the proximity, speed, and dimensions of objects around you via wearable matrices.
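As a rough illustration, the translation can be pictured as a function that maps distance and closing speed to an on/off pulse pattern suitable for the browser Vibration API; the scaling constants below are assumptions, not the shipped tuning.

# Minimal sketch: proximity and approach speed -> vibration pattern
# (alternating on/off durations in milliseconds, as consumed by navigator.vibrate).
def haptic_pattern(distance_m: float, closing_speed_mps: float) -> list[int]:
    # Closer objects -> shorter gaps between pulses (higher perceived urgency).
    gap = max(50, int(300 * distance_m))
    # Faster-approaching objects -> longer pulses.
    pulse = min(400, 100 + int(100 * closing_speed_mps))
    return [pulse, gap] * 3

print(haptic_pattern(distance_m=0.8, closing_speed_mps=1.5))
# -> e.g. [250, 240, 250, 240, 250, 240], sent to the client for navigator.vibrate()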
At the center of Occulant's architecture is the real-time feedback loop. A stereo camera array feeds continuous optical data to our local VisionAssist edge processor. This low-power module handles raw signal denoising before handing the stream to the cloud-hosted or on-device ML model.
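A simplified sketch of that edge-side preprocessing stage, assuming OpenCV capture, a fixed working resolution, and a bilateral filter standing in for the proprietary denoiser.

# Minimal sketch: capture a raw frame, normalise its size, and denoise it
# before inference. Filter choice and resolution are illustrative assumptions.
import cv2

def preprocess(frame):
    frame = cv2.resize(frame, (640, 480))         # fixed working resolution
    frame = cv2.bilateralFilter(frame, 5, 50, 50) # edge-preserving denoise
    return frame

cap = cv2.VideoCapture(0)
ok, raw = cap.read()
if ok:
    clean = preprocess(raw)  # handed onward to the segmentation model
cap.release()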
Unlike standard detection pipelines, our approach utilizes dual-stage validation. A lightweight LLM runs asynchronously alongside the core detection engine, correcting semantic misclassifications and providing a cohesive situational context (e.g., recognizing that a collection of segmented metal parts is actually a bicycle).
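The hand-off can be sketched with a single background worker that revises the previous frame's labels while detection continues unblocked; query_llm below is a stand-in for the actual model call, and the bicycle example mirrors the one above.

# Minimal sketch: asynchronous second-stage validation alongside detection.
import time
from concurrent.futures import ThreadPoolExecutor

def query_llm(labels):
    # Placeholder for the LLM call: decide whether the parts form one object.
    return "bicycle" if {"wheel", "frame"} <= set(labels) else ", ".join(labels)

executor = ThreadPoolExecutor(max_workers=1)
pending = None

def on_frame(labels):
    global pending
    if pending is not None and pending.done():
        print("context:", pending.result())  # semantic correction arrives late
        pending = None
    if pending is None:
        pending = executor.submit(query_llm, labels)
    return labels                            # detection output is never blocked

on_frame(["wheel", "frame", "handlebar"])
time.sleep(0.1)
on_frame(["wheel", "frame"])  # correction from the first frame prints here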
Data from the models must be synthesized predictably. Through spatial sonification and directed haptics, the user receives continuous feedback without sensory overload. The user can also issue requests directly to VisionAssist (via voice or peripheral triggers), instantly recalibrating the network's priority vectors.
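A toy sketch of how a user request might re-weight feedback priorities and rate-limit announcements to avoid overload; the class names, weights, and cooldown interval are assumptions.

# Minimal sketch: priority re-weighting plus a simple announcement cooldown.
import time

priorities = {"person": 1.0, "car": 1.0, "crosswalk": 0.3}
last_spoken = 0.0

def user_request(target: str):
    # Boost the requested class so it dominates subsequent feedback.
    priorities[target] = 2.0

def maybe_announce(label: str, tts=print):
    global last_spoken
    if priorities.get(label, 0.5) >= 1.0 and time.time() - last_spoken > 2.0:
        tts(f"{label} ahead")  # swap print for gTTS playback in practice
        last_spoken = time.time()

user_request("crosswalk")
maybe_announce("crosswalk")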
Real-time semantic segmentation directs our high-fidelity haptic feedback system, perfectly isolating dynamic obstacles.
The visual world is hopelessly dense. Occulant extracts the salient features of any scene in real-time, masking out irrelevant background noise and focusing computing power entirely on navigational hazards.
In the adjacent demo, the YOLOv11 Segmentation module precisely masks a pedestrian. The Vision-Language Model interprets this as an immediate trajectory threat and formulates actionable language, which is instantly converted into a physical signal: a rapid pulsing sequence localized on the user's left haptic actuator array, alerting them to slow down.
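A minimal sketch of that final hop, assuming the pedestrian mask's horizontal centroid decides which side of the wearable receives the pulse sequence; the actuator interface and frame width are hypothetical.

# Minimal sketch: route a "slow down" pulse sequence to the actuator side
# matching the segmented pedestrian's position in the frame.
import numpy as np

def route_alert(mask: np.ndarray, frame_width: int = 640):
    ys, xs = np.nonzero(mask)          # pixels belonging to the pedestrian
    if xs.size == 0:
        return None
    side = "left" if xs.mean() < frame_width / 2 else "right"
    pattern = [120, 80, 120, 80, 120]  # rapid pulsing = "slow down"
    return {"actuator": side, "pattern_ms": pattern}

mask = np.zeros((480, 640), dtype=np.uint8)
mask[100:400, 50:200] = 1              # pedestrian on the user's left
print(route_alert(mask))               # -> {'actuator': 'left', 'pattern_ms': [...]}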
No screens. No delay. Pure intuition.
Experience BlindAssit navigation assistance with real-time YOLO11 detection.
git clone https://github.com/manthanabc/BlindAssit.git
cd BlindAssit && pip install -r requirements.txt
export OPENROUTER_API_KEY="your-key"
./start.sh
http://localhost:5000
Flask + YOLO11 + PyTorch + OpenCV + gTTS + OpenRouter AI + JavaScript + Vibration API
BlindAssit in action - Real-time sidewalk navigation