Mission Control
Real-time Computer Vision in Unity via OpenCV.
Project Synopsis
Vision-Strike gives the AI actual sight. Instead of relying on Raycasts (mathematically perfect line-of-sight checks), we force the agent to interpret a raw pixel feed, simulating real-world robotics constraints.
By streaming a Unity Render Texture to a Python script running OpenCV, we perform object detection (HSV filtering) and return targeting coordinates, closing the loop between Perception and Action.
Why This Matters
Raycasts are "cheating" in the context of robotics. True autonomy requires processing noisy visual data. This project demonstrates mastery of Image Processing Pipelines and Latency Management.
Tech Stack
- Engine: Unity (C#)
- Vision: Python + OpenCV
- Transport: TCP Sockets
Project Checkpoints
- Phase 1: The Visual Stream (Byte Transfer)
- Phase 2: Object Detection (OpenCV)
- Phase 3: The Reactive Loop (Control)
- Phase 4: Creative Visuals (HUD & Demo)
Field Notes & Learnings
Key engineering concepts for Computer Vision.
1. The Data Stream
Concept: Streaming 60 full-HD frames per second over a socket saturates the connection; a raw 1080p RGB frame is ~6 MB, or roughly 370 MB/s at 60 fps.
Solution: Downsample the Render Texture to 224x224 pixels. This is a standard input size for neural networks (e.g., ResNet) and is sufficient for color tracking while keeping latency low.
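For a rough sense of the savings, here is a minimal Python sketch (the random-noise frame makes the JPEG figure pessimistic, and quality 80 is an illustrative setting, not a project requirement):

```python
import cv2
import numpy as np

# Stand-in for a raw 1080p RGB frame: 1920 * 1080 * 3 bytes ≈ 6.2 MB each.
frame = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)

# Downsample to the 224x224 size used by the project.
small = cv2.resize(frame, (224, 224), interpolation=cv2.INTER_AREA)

# JPEG-encode before sending; quality 80 trades a little fidelity for latency.
ok, jpg = cv2.imencode(".jpg", small, [cv2.IMWRITE_JPEG_QUALITY, 80])

print(f"raw 1080p frame: {frame.nbytes / 1e6:.1f} MB")
print(f"224x224 JPEG:    {len(jpg) / 1e3:.1f} KB")
```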
2. HSV Color Space
Concept: RGB is poor for detection: a "Red" object in shadow reads as "Brown" because all three channels drop with brightness.
Solution: Convert frames to HSV (Hue, Saturation, Value). Hue separates color from brightness, making the detector robust against lighting changes.
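A minimal OpenCV sketch of the filter (the threshold values are illustrative and need tuning per scene; note that red straddles the hue wrap-around in OpenCV's 0-179 hue range, so it takes two bands):

```python
import cv2
import numpy as np

def red_mask(bgr_frame: np.ndarray) -> np.ndarray:
    """Return a binary mask of 'red' pixels, robust to brightness changes."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    # Red sits at both ends of the hue axis, so combine a low and a high band.
    lo = cv2.inRange(hsv, (0, 120, 70), (10, 255, 255))
    hi = cv2.inRange(hsv, (170, 120, 70), (180, 255, 255))
    return cv2.bitwise_or(lo, hi)
```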
3. PID Controller
Concept: If the agent snaps its aim the instant the camera reports an offset, it overshoots and oscillates around the target, which reads as jitter.
Solution: Use a Proportional-Integral-Derivative (PID) loop; see the sketch after this list.
• P: Turn proportionally to the current error.
• I: Accumulate leftover error to cancel steady-state drift.
• D: Damp the turn as the error shrinks to prevent overshooting.
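A minimal Python sketch of the controller (in the project this logic would live in a C# script on the agent; the gains are illustrative placeholders that need tuning):

```python
class PID:
    """Classic PID controller: output = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Usage: feed the normalized horizontal error (-1..1) once per frame
# and apply the output as rotation torque.
controller = PID(kp=2.0, ki=0.1, kd=0.5)
torque = controller.update(error=0.3, dt=1 / 60)
```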
4. Coordinate Systems
Translating 2D pixels to 3D angles:
- Image Space: (0,0) is top-left in OpenCV, bottom-left in Unity. Flip Y-axis before processing.
- Error Calculation: `ErrorX = (ImageWidth / 2) - TargetX`, later normalized to [-1.0, 1.0] (Phase 3). This value drives the rotation torque; see the sketch below.
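A small sketch tying both points together, assuming the 224x224 stream (the sign conventions are a choice; they just have to match the torque script):

```python
IMG_W, IMG_H = 224, 224

def pixel_to_error(target_x: float, target_y: float) -> tuple[float, float]:
    """Map an OpenCV pixel coordinate to normalized errors in [-1, 1]."""
    # Flip Y: OpenCV's origin is top-left, Unity's viewport origin is bottom-left.
    unity_y = IMG_H - target_y
    # Normalize so the image center maps to (0, 0).
    error_x = (IMG_W / 2 - target_x) / (IMG_W / 2)
    error_y = (unity_y - IMG_H / 2) / (IMG_H / 2)
    return error_x, error_y
```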
Implementation
Step-by-step Execution Plan.
Phase 1: The Visual Stream (Week 1)
- Setup: Render Texture (224x224) on 2nd Camera.
- Encoding: Convert Texture to JPG Bytes in C#.
- Stream: Send bytes via Socket to Python (receiver sketch below).
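A sketch of the Python receiver, assuming a simple length-prefixed framing (a 4-byte big-endian size header before each JPEG) that the Unity sender would have to match; host and port are placeholders:

```python
import socket
import struct

import cv2
import numpy as np

HOST, PORT = "127.0.0.1", 5005  # placeholders; must match the Unity sender

def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer disconnects."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("Unity closed the stream")
        buf += chunk
    return buf

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind((HOST, PORT))
    server.listen(1)
    conn, _ = server.accept()
    while True:
        # Each frame: 4-byte big-endian length header, then the JPEG bytes.
        (size,) = struct.unpack(">I", recv_exact(conn, 4))
        jpg = recv_exact(conn, size)
        frame = cv2.imdecode(np.frombuffer(jpg, np.uint8), cv2.IMREAD_COLOR)
        # ... hand `frame` to the detection pipeline (Phase 2)
```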
Phase 2: Object Detection (Week 2)
- Decode: `cv2.imdecode` in Python to get Image Matrix.
- Filter: Apply `cv2.inRange` for Red/Enemy colors.
- Tracking: Find Contours -> Get Bounding Box Center (x, y); see the sketch below.
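A condensed sketch of this phase, reusing the `red_mask` helper from the Field Notes sketch above (the minimum contour area is illustrative noise rejection):

```python
import cv2

def detect_target(frame):
    """Return the bounding-box center (x, y) of the largest red blob, or None."""
    mask = red_mask(frame)  # HSV filter from the Field Notes sketch
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if cv2.contourArea(largest) < 50:  # ignore specks of noise
        return None
    x, y, w, h = cv2.boundingRect(largest)
    return (x + w / 2, y + h / 2)
```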
Phase 3: The Reactive Loop (Week 3)
- Mapping: Convert (x,y) to Rotation Error (-1.0 to 1.0).
- Feedback: Send Error data back to Unity via Socket (sketch below).
- Control: Implement PID script to rotate agent towards target.
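The return channel can be as small as two packed floats per frame; a sketch of the Python side, assuming the C# end reads with `BinaryReader` (which is little-endian, hence the `<ff` format):

```python
import struct

def send_error(conn, error_x: float, error_y: float) -> None:
    """Send the normalized rotation error back to Unity as two floats."""
    # "<ff" = two little-endian 32-bit floats, matching BinaryReader.ReadSingle().
    conn.sendall(struct.pack("<ff", error_x, error_y))
```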
Phase 4: Creative Visuals (Week 4)
- HUD: Display the "Computer Vision" view in the corner.
- DevLog: "Human vs Machine" vision comparison video.
Dev Logs
Engineering notes & daily updates.
Entry 000: Planning
Date: Feb 3, 2026
Project 06 queued for July. Bridging Unity and OpenCV for pixel-based AI targeting.