Technique

Depth Map

A depth map is an image where pixel brightness represents distance from the camera — bright is close, dark is far. AI models use depth maps to understand 3D structure and control generation spatially.

June 18, 2026

A depth map is a grayscale image where each pixel's brightness represents its distance from the camera — bright pixels are close, dark pixels are far. AI models use depth maps to understand 3D structure in 2D images.

How it works

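A depth map encodes the 3D geometry of a scene in a 2D image. In an 8-bit grayscale depth map, each pixel's value (from 0 to 255) corresponds to how far that point is from the camera. With the convention used here, a person standing close to the camera appears bright and a distant mountain appears dark; some tools use the opposite convention, so it is worth checking before mixing depth maps from different sources. This gives AI models spatial information that an RGB image alone does not state explicitly.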
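As a rough illustration of this encoding, the sketch below maps a small grid of distances to 8-bit brightness using the near-is-bright convention. The function name depth_to_grayscale and the sample distances are made up for this example, and the linear min-max scaling is an assumption; real depth estimators may normalize differently.

```python
import numpy as np

def depth_to_grayscale(distance: np.ndarray) -> np.ndarray:
    """Map per-pixel distances to 8-bit brightness (near = bright, far = dark)."""
    distance = np.asarray(distance, dtype=np.float64)
    near, far = distance.min(), distance.max()
    if far == near:
        # A perfectly flat scene has no depth variation; return mid-gray.
        return np.full(distance.shape, 128, dtype=np.uint8)
    # Scale so that 0.0 is the nearest point and 1.0 is the farthest.
    normalized = (distance - near) / (far - near)
    # Invert so that near points become bright, then quantize to 0..255.
    return np.round((1.0 - normalized) * 255).astype(np.uint8)

# Hypothetical distances in meters for a 2x2 "scene".
distances_m = np.array([[1.0, 2.0],
                        [5.0, 9.0]])
depth_map = depth_to_grayscale(distances_m)
print(depth_map)
```

The nearest point ends up at 255 and the farthest at 0, with everything else scaled in between.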

Why it matters in AI generation

Depth maps enable spatial control. When used with tools like ControlNet, a depth map can guide generation so that the output follows the same 3D structure as the input — a person in the foreground, a building in the midground, sky in the background. This is how AI can transform a photo into a different style while preserving its composition and perspective.
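To make the foreground, midground, and background idea concrete, here is a minimal sketch that splits a depth map into three layers by brightness. The function name split_depth_layers and the thresholds 170 and 85 are arbitrary choices for illustration; this is not how ControlNet consumes a depth map internally, only a way to see the structure the map carries.

```python
import numpy as np

def split_depth_layers(depth_map, near_threshold=170, far_threshold=85):
    """Return boolean masks for foreground, midground, and background.

    Assumes an 8-bit depth map where bright pixels are close to the camera.
    """
    depth_map = np.asarray(depth_map)
    foreground = depth_map >= near_threshold
    background = depth_map < far_threshold
    # Everything that is neither clearly near nor clearly far.
    midground = ~(foreground | background)
    return foreground, midground, background

# Hypothetical 2x3 depth map: bright values are near, dark values are far.
depth_map = np.array([[250, 200, 120],
                      [100,  60,  10]], dtype=np.uint8)
fg, mg, bg = split_depth_layers(depth_map)
print(fg.sum(), mg.sum(), bg.sum())
```

Each pixel falls into exactly one layer, so the three masks together cover the whole image.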

In AI video, depth maps help the model distinguish foreground objects, which can move independently, from the background environment, which usually moves as a whole. This can improve temporal consistency and motion realism.
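The toy sketch below shows the underlying idea on a single grayscale frame: pixels marked as near by the depth map are shifted sideways while the background stays put. It is a simplified illustration under stated assumptions, not a description of how any video model works. The function name shift_foreground, the threshold of 128, and the sample values are all hypothetical.

```python
import numpy as np

def shift_foreground(frame, depth_map, shift_px, threshold=128):
    """Move near pixels horizontally by shift_px and leave far pixels in place.

    Assumes a 2D grayscale frame and an 8-bit depth map of the same shape,
    with bright depth values meaning close to the camera.
    """
    frame = np.asarray(frame)
    mask = np.asarray(depth_map) >= threshold          # foreground pixels
    # np.roll wraps around the image edge; acceptable for a toy example.
    moved_pixels = np.roll(frame, shift_px, axis=1)
    moved_mask = np.roll(mask, shift_px, axis=1)
    out = frame.copy()
    out[mask] = 0                                       # hole left behind; a real system would fill it in
    out[moved_mask] = moved_pixels[moved_mask]          # paste foreground at its new position
    return out

frame = np.array([[10, 20, 90, 40]], dtype=np.uint8)    # 90 is the "object"
depth = np.array([[0, 0, 255, 0]], dtype=np.uint8)      # only the object is near
next_frame = shift_foreground(frame, depth, shift_px=1)
print(next_frame)
```

The object moves one pixel to the right, the background pixels it does not cover are unchanged, and the spot it vacated is left empty.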

In eaxy

Eaxy's generation pipeline uses depth understanding internally to produce more spatially coherent images and video. When you use image-to-image or image-to-video, the model infers depth from your reference image to maintain structural consistency in the output.

Frequently asked questions

What is a depth map in AI image generation?

A depth map is a grayscale image where each pixel's brightness represents how far that point is from the camera. AI models use it to understand the 3D structure of a 2D image, enabling spatial control over generation.

How does ControlNet use depth maps?

ControlNet can take a depth map as a conditioning input and use it to guide generation, so that the generated image follows the same 3D structure as the depth map while applying a new style, subject, or content.

Can I generate a depth map from any image?

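Yes, in most cases. AI depth estimation models can infer a depth map from a single 2D image, although the result is typically an estimate of relative depth rather than a measured distance. This depth map can then be used to guide new generation, create 3D effects, or control video motion.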
