Feb 28, 2026

Visualizing Vertex and Fragment Shaders in the Graphics Pipeline

Shaders are small programs that run on the GPU as part of drawing a scene. Some decide where geometry appears on screen, while others decide what that geometry looks like once it is drawn. The first useful distinction in that process is between vertex shaders and fragment shaders. A vertex shader runs once for each vertex in your mesh, while a fragment shader runs once for each fragment that rasterization generates inside a triangle's screen coverage, roughly one per covered pixel sample. Those two stages do different jobs and run at very different rates, which explains much of the behavior, cost, and visual output you see in real-time rendering.

This article builds a concrete model of where each stage runs, what data it reads, and what data it produces. It starts with the vertex-versus-fragment split because that is the foundation for the rest of the graphics pipeline. Later, once that baseline is clear, we place geometry, tessellation, and compute shaders in context.

Graphics Pipeline Shader Stages

Most real-time 3D engines follow a rasterization pipeline, meaning they start with triangle data and turn it into the 2D image you see on screen. In that process, vertices enter first, then become primitives such as triangles, then rasterization converts those triangles into fragments, and finally the surviving results are written into the framebuffer, which is the image buffer holding the current frame before it is displayed. The two core programmable shader stages in this path are the vertex shader before rasterization and the fragment shader after rasterization. Optional shader stages can add, refine, or compute data around that path, but the vertex-to-fragment handoff is the baseline for ordinary triangle rendering. So the most important question is: what does each stage receive as input, and what does it hand on as output?

Vertex shader input: one vertex at a time (position, normal, UV, tangents, custom attributes)
Vertex shader output: transformed clip-space position plus per-vertex values that will be smoothly interpolated across the triangle, such as color, normals, or UVs
Fragment shader input: those interpolated values for each fragment plus textures, uniforms, and material parameters
Fragment shader output: one or more color/depth values written to render targets

As the triangle expands across more of the right-hand panel, the difference between vertex work and fragment work becomes much easier to read. The triangle still has only three vertices, but the rasterized coverage touches many more screen cells, so fragment work rises much faster than vertex work. Switching to Optional stages places tessellation, geometry, and compute relative to that main vertex-to-fragment path.

In the optional-stage view, tessellation and geometry should both be understood as extra processing that happens before rasterization, not between rasterization and the fragment shader. Compute shaders are separate again: they are GPU programs, but they do not sit on this triangle-to-fragment path at all.

Even when the triangle stays small, moving between Vertex work and Fragment work shows that the stage order does not change with workload. Once the triangle fills more of the screen-space panel, the fragment side grows in visual density while the count of corner vertices stays unchanged. The Optional stages view then makes the structural distinction clear: tessellation and geometry still feed rasterization, while compute remains off to the side on its own execution path.

A practical way to remember this is to track where the invocation count explodes. A mesh may have tens of thousands of vertices, but a full-screen draw can touch millions of fragments. Because the fragment count is often much larger, expensive math in a fragment shader usually costs more frame time than the same math in a vertex shader.
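The gap between the two counts can be made concrete with a small CPU-side sketch in Python. This is not how a GPU rasterizes: real hardware applies a fill rule and processes blocks of pixels in parallel. The function names `edge` and `count_covered_pixels` and the triangle sizes are illustrative assumptions. The sketch only counts pixel centers inside a triangle, to show that coverage grows with screen area while the vertex count stays at three.

```python
def edge(a, b, p):
    """Edge function: twice the signed area of the triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def count_covered_pixels(v0, v1, v2, width, height):
    """Count pixel centers inside a 2D triangle given in pixel coordinates.

    Simplification: points exactly on an edge count as covered. Real GPUs
    use a fill rule so that an edge shared by two triangles is shaded once.
    """
    area = edge(v0, v1, v2)
    if area == 0:
        return 0  # a degenerate triangle covers nothing
    covered = 0
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            w0 = edge(v1, v2, p)
            w1 = edge(v2, v0, p)
            w2 = edge(v0, v1, p)
            # Accept either winding order by comparing against the sign of the area.
            if area > 0:
                inside = w0 >= 0 and w1 >= 0 and w2 >= 0
            else:
                inside = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside:
                covered += 1
    return covered


if __name__ == "__main__":
    small = count_covered_pixels((0, 0), (64, 0), (0, 64), 128, 128)
    large = count_covered_pixels((0, 0), (128, 0), (0, 128), 128, 128)
    print("vertices: 3, covered pixels (small triangle):", small)
    print("vertices: 3, covered pixels (large triangle):", large)
```

Doubling the triangle's edge length roughly quadruples the covered pixel count, while a vertex shader would still run only three times for either triangle.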

What Is A Vertex Shader?

A vertex shader is usually responsible for geometric placement. Its most common job is multiplying each vertex position by model, view, and projection matrices. It can also prepare values for later stages, such as world-space normals, texture coordinates, or effect-specific data like a mask value, a blend factor, or a direction vector that the fragment shader will use later. In direct terms, a vertex shader is a GPU program that runs once for each input vertex in a draw call. It cannot shade every covered pixel because rasterization has not happened yet. Its main output is a clip-space position, plus any per-vertex values that should be interpolated across the primitive for later fragment work.

Mathematically, the canonical transform often looks like this:

\text{clipPos} = P\,V\,M\,[x,y,z,1]^T

Here, [x,y,z,1]^T is the original vertex position written in homogeneous coordinates. M is the model matrix, which places the object in the world. V is the view matrix, which expresses the scene from the camera’s point of view. P is the projection matrix, which maps that camera-space position into clip space so the GPU can continue toward screen-space rendering.

The important detail is not the formula itself, but the execution rate. If your model has 20,000 vertices, this shader runs around 20,000 times for that draw call. It does not run for every pixel on screen.
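Both points, the transform and its execution rate, can be sketched on the CPU in plain Python. The helper names (`translation`, `perspective`, `vertex_shader`, `transform_mesh`) are hypothetical, and the projection matrix assumes an OpenGL-style convention with column vectors and a camera looking down the negative z axis. Other APIs use different depth ranges and handedness, so treat this as one possible instance of the formula rather than a reference implementation.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def translation(tx, ty, tz):
    """Translation matrix, used here for both the model and view matrices."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style perspective projection (assumed convention)."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
        [0, 0, -1, 0],
    ]


def vertex_shader(position, mvp):
    """One invocation: object-space position in, clip-space position out."""
    x, y, z = position
    return mat_vec(mvp, [x, y, z, 1.0])


def transform_mesh(vertices, model, view, projection):
    """Run the vertex shader once per vertex, never once per pixel."""
    mvp = mat_mul(projection, mat_mul(view, model))  # P * V * M
    return [vertex_shader(v, mvp) for v in vertices]


if __name__ == "__main__":
    triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    model = translation(0, 0, 0)    # object sits at the world origin
    view = translation(0, 0, -5)    # camera placed 5 units back on +z
    proj = perspective(90.0, 1.0, 1.0, 9.0)
    for clip in transform_mesh(triangle, model, view, proj):
        print(clip)
```

The output list has exactly one clip-space position per input vertex, which is the execution-rate point: the amount of work depends on the mesh, not on how much of the screen the mesh ends up covering.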

In the playground below, the gray triangle is the incoming vertex data and the blue triangle is the transformed output after a simplified transform chain. Dragging the gray corners reshapes the original mesh, while dragging the blue result, its rotation handle, or its scale handle changes the transformed positions that the vertex stage would hand to rasterization. The important thing to notice is that all of that motion still happens by moving only the triangle corners in normalized device space, without creating any new interior detail.

When the source triangle is reshaped on the left, the output on the right keeps the same per-vertex transform, which makes it easier to separate mesh structure from later placement work. Rotating or translating the blue triangle shows the transformed vertices being repositioned as a group, and increasing scale makes the triangle occupy more of the screen without creating extra geometric corners. That larger screen footprint also hints at why later fragment cost can rise even when this stage still processes only three vertices.

One practical consequence of this stage split is that vertex shaders are good at shaping and preparing geometry, while fragment shaders are better for fine image detail inside each triangle. If you push pixel-like appearance work into the vertex stage, the result usually looks blocky or unstable because vertex outputs are only known at the triangle corners and then interpolated across the surface. Interpolation is useful, but it is not equivalent to true per-fragment computation.

What Is A Fragment Shader?

After primitives are rasterized, the GPU generates fragments. Each fragment carries varyings interpolated from the vertex stage. The fragment shader then decides the visible surface appearance: base color, texture detail, lighting response, transparency logic, and sometimes whether a fragment should be discarded. In direct terms, a fragment shader is a GPU program that runs for each generated fragment, roughly once per covered pixel sample, before the per-fragment depth, stencil, and blending operations write the final result to the render targets. It receives interpolated data rather than raw mesh vertices. That makes it the right place for pixel-level material decisions.

This stage is where materials become image detail. If you sample a texture, combine normal maps, compute BRDF terms, apply fog, and blend layers, that work usually happens here.

The first thing to make concrete is that a fragment shader does not receive three separate corner values and then choose one. It receives values that were blended across the triangle during rasterization. As the probe moves through the visualization below, the inspected fragment color changes continuously because the GPU is mixing the three vertex outputs by barycentric weight before fragment shading begins.

Notice what changes smoothly across the triangle: not the original vertex values directly, but interpolated values. That interpolation step is one of the core reasons the vertex and fragment stages are paired. The vertex stage prepares data endpoints, and the fragment stage uses continuous values between endpoints to compute final appearance.
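The blending step can be written out as a minimal sketch. The function names `barycentric` and `interpolate` are illustrative, and the code assumes plain screen-space (affine) interpolation of a 2D triangle. Real GPUs normally apply perspective-correct interpolation, which additionally divides by the clip-space w of each vertex.

```python
def edge(a, b, p):
    """Twice the signed area of the 2D triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def barycentric(p, a, b, c):
    """Return the weights (wa, wb, wc) of point p relative to triangle abc.

    Each weight is the area of the sub-triangle opposite that vertex,
    divided by the area of the whole triangle, so the weights sum to 1.
    """
    area = edge(a, b, c)
    if area == 0:
        raise ValueError("degenerate triangle")
    wa = edge(b, c, p) / area
    wb = edge(c, a, p) / area
    wc = edge(a, b, p) / area
    return wa, wb, wc


def interpolate(weights, va, vb, vc):
    """Blend three per-vertex values (tuples of equal length) by weight."""
    wa, wb, wc = weights
    return tuple(wa * x + wb * y + wc * z for x, y, z in zip(va, vb, vc))


if __name__ == "__main__":
    a, b, c = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
    red, green, blue = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    for probe in [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0)]:
        w = barycentric(probe, a, b, c)
        print(probe, "weights", w, "color", interpolate(w, red, green, blue))
```

A probe sitting on a vertex receives that vertex's value unchanged, and a probe anywhere else receives a mix, which is why the fragment shader never sees the three corner values separately.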

Normals follow the same rule. They are often written once per vertex, then blended across the primitive, and only then normalized and used for lighting. The next explorer makes that handoff visible: changing the vertex normal directions reshapes the normal field across the triangle, while moving the probe reveals the difference between the raw blended normal and the normalized direction that is actually useful for shading.

This detail matters because many lighting models depend on that interpolated normal rather than on a flat triangle-wide direction. If the per-fragment normal changes smoothly, the diffuse and specular response can also change smoothly. If the inputs are too coarse or the lighting is evaluated too early, the result starts to miss detail that should exist between the vertices.
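A short sketch shows why the blended normal has to be renormalized before it is used. The vertex normals, weights, and light direction below are made-up illustration values, and `lambert` is a hypothetical helper implementing a basic clamped diffuse term.

```python
import math


def blend(weights, n0, n1, n2):
    """Weighted sum of three vectors, as rasterization does for varyings."""
    w0, w1, w2 = weights
    return tuple(w0 * a + w1 * b + w2 * c for a, b, c in zip(n0, n1, n2))


def length(v):
    return math.sqrt(sum(c * c for c in v))


def normalize(v):
    l = length(v)
    return tuple(c / l for c in v)


def lambert(normal, light_dir):
    """Clamped diffuse term: max(0, N . L)."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))


if __name__ == "__main__":
    # Three unit-length vertex normals pointing in different directions.
    n0, n1, n2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    center = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)

    raw = blend(center, n0, n1, n2)   # shorter than unit length
    unit = normalize(raw)             # direction that is safe to light with

    light = normalize((1.0, 1.0, 1.0))
    print("raw length:", length(raw), "diffuse:", lambert(raw, light))
    print("unit length:", length(unit), "diffuse:", lambert(unit, light))
```

Because the blend of unit vectors is generally shorter than unit length, lighting with the raw value darkens the interior of the triangle. Normalizing in the fragment stage restores the intended intensity.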

That tradeoff becomes clearer in the lighting comparison below. With a coarse mesh, the per-vertex version can only sample the highlight at a few corners and interpolate the result across the interior, so the bright region can look smeared or misplaced. As the light moves, the per-fragment version stays more faithful because the lighting is evaluated at every covered fragment instead of being approximated from a few vertex samples. Increasing mesh density narrows the gap, which is why per-vertex lighting can be acceptable for some surfaces and clearly insufficient for others.
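The same comparison can be reproduced numerically. The sketch below assumes a Blinn-Phong style specular term with a constant half vector (a distant light and a distant viewer), three vertex normals tilted away from that half vector, and a shininess of 32. All of these values are illustrative choices, not taken from any particular renderer.

```python
import math


def normalize(v):
    l = math.sqrt(sum(c * c for c in v))
    return tuple(c / l for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def specular(normal, half_vector, shininess):
    """Blinn-Phong style highlight: max(0, N . H) ** shininess."""
    return max(0.0, dot(normal, half_vector)) ** shininess


def per_vertex_specular(weights, normals, half_vector, shininess):
    """Light at the three vertices first, then interpolate the results."""
    lit = [specular(n, half_vector, shininess) for n in normals]
    return sum(w * s for w, s in zip(weights, lit))


def per_fragment_specular(weights, normals, half_vector, shininess):
    """Interpolate the normal first, renormalize, then light the fragment."""
    blended = tuple(sum(w * n[i] for w, n in zip(weights, normals))
                    for i in range(3))
    return specular(normalize(blended), half_vector, shininess)


if __name__ == "__main__":
    s = math.sqrt(3.0) / 2.0
    # Vertex normals fan outward, so the highlight peak lies inside the triangle.
    normals = [normalize((1.0, 0.0, 1.0)),
               normalize((-0.5, s, 1.0)),
               normalize((-0.5, -s, 1.0))]
    half_vector = (0.0, 0.0, 1.0)
    center = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
    print("per-vertex:  ", per_vertex_specular(center, normals, half_vector, 32))
    print("per-fragment:", per_fragment_specular(center, normals, half_vector, 32))
```

With this setup the per-fragment path finds a bright highlight at the triangle center, while the per-vertex path reports almost none, because none of the three corners samples the peak. At the corners themselves the two paths agree.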


Vertex Shader vs Fragment Shader

The fastest way to compare these stages is to ask the same four questions for both.

How often does it run? Vertex shader: once per vertex. Fragment shader: once per fragment.
What is the main purpose? Vertex shader: geometric transformation and varying setup. Fragment shader: final shading and output color/depth.
What data dominates its input? Vertex shader: mesh attributes plus transform uniforms. Fragment shader: interpolated varyings, textures, lights, material uniforms.
What performance pattern is typical? Vertex shader: scales with geometry complexity. Fragment shader: scales with screen coverage and overdraw.

These differences imply practical optimization rules. If an effect can be approximated with per-vertex math and interpolation, it can be cheaper. If accuracy must be pixel-precise, as with specular response, normal mapping, or fine procedural detail, it belongs in the fragment stage even if cost increases.

Common Mistakes in Stage Selection

One frequent error is pushing too much logic into fragments without considering coverage. A full-screen post-effect at 4K (3840 × 2160) runs about 8.3 million fragment shader invocations per frame for a single pass. Another error is pushing appearance logic too early into vertices and then wondering why detail collapses on large triangles.
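To put rough numbers on the coverage concern, the following back-of-the-envelope helper multiplies resolution, covered screen fraction, and overdraw. The function name `estimate_fragment_invocations` and its parameters are hypothetical, "4K" is assumed to mean 3840 × 2160, and the result ignores hardware details such as early depth rejection and helper invocations.

```python
def estimate_fragment_invocations(width, height, coverage=1.0, overdraw=1.0):
    """Rough fragment shader invocation count for one frame.

    coverage: fraction of the screen the draw touches (0.0 to 1.0).
    overdraw: average number of times each covered pixel is shaded.
    """
    return round(width * height * coverage * overdraw)


if __name__ == "__main__":
    # One full-screen pass at an assumed 3840 x 2160 resolution.
    print(estimate_fragment_invocations(3840, 2160))
    # Transparent layers over half the screen, shaded three times on average.
    print(estimate_fragment_invocations(3840, 2160, coverage=0.5, overdraw=3.0))
```

Even a modest per-fragment cost is multiplied by counts of this size every frame, which is why coverage and overdraw deserve attention before any per-instruction tuning.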

A simple decision process helps:

Does this computation define object placement? Put it in vertex.
Does it define pixel-level appearance? Put it in fragment.
Does it need neighboring pixel information from already-rendered data? Often this means a later post-process pass, possibly compute.

That process is not perfect, but it avoids most architectural mistakes in real-time rendering code, especially once you compare it with techniques like ray marching with signed distance fields, which sit outside the standard triangle-raster path entirely.

Other Shader Types in Context

Geometry Shaders

Geometry shaders run per primitive after the vertex stage. They can emit new primitives, so they are useful for specific effects like layered shadow map outputs or line expansion. However, they are often avoided in performance-critical paths because they can become a throughput bottleneck. Many modern engines prefer alternatives such as instancing, mesh shaders on supported APIs, or compute-driven generation, and the Khronos rendering pipeline overview is a useful formal reference if you want the API-level stage ordering behind that summary.

Tessellation Shaders

Tessellation is split into control and evaluation stages. It subdivides patch primitives to add geometric density on the GPU. This can improve curved surfaces and displacement mapping when screen-space detail demands it. For the lower-level curve intuition behind handle-controlled spatial paths, visualizing 3D Bezier curves and De Casteljau construction is a useful companion before moving from curves to patches. The tradeoff is increased complexity and hardware/API constraints, so many teams use it selectively.
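How quickly subdivision adds density can be sketched on the CPU. The code below is only a stand-in: it uses recursive midpoint subdivision of a flat triangle, whereas hardware tessellation is driven by per-patch tessellation factors and an evaluation stage that can displace the generated vertices. The names `midpoint` and `subdivide` are illustrative.

```python
def midpoint(a, b):
    """Midpoint of two points given as tuples of equal length."""
    return tuple((x + y) / 2.0 for x, y in zip(a, b))


def subdivide(triangle, levels):
    """Split a triangle into 4 ** levels smaller triangles.

    Each level connects the three edge midpoints, producing three corner
    triangles and one center triangle.
    """
    if levels == 0:
        return [triangle]
    a, b, c = triangle
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    children = [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    result = []
    for child in children:
        result.extend(subdivide(child, levels - 1))
    return result


if __name__ == "__main__":
    patch = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
    for level in range(4):
        triangles = subdivide(patch, level)
        vertices = {v for tri in triangles for v in tri}
        print("level", level, "triangles", len(triangles),
              "unique vertices", len(vertices))
```

Triangle count grows by a factor of four per level in this scheme, which illustrates why subdivision depth is usually tied to how much screen-space detail is actually needed.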

Compute Shaders

Compute shaders are not tied to rasterization. They execute general-purpose GPU kernels over thread groups and are widely used for simulation, culling, particle updates, clustered lighting preparation, denoising, and post-processing. In modern renderers, compute often cooperates with traditional graphics passes rather than replacing them, and those passes often consume procedural inputs built from noise functions such as value noise, Perlin noise, and fractal noise. For a beginner-oriented explanation of shader inputs, outputs, and stage-local execution, LearnOpenGL’s shader introduction remains a useful companion.
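The thread-group execution model can be imitated with ordinary loops. The sketch below is a sequential CPU stand-in for a one-dimensional compute dispatch that updates particle positions. The group size of 64, the particle data, and the function names are assumptions for illustration, and a real GPU would run the inner iterations in parallel.

```python
def group_count(item_count, group_size):
    """Thread groups needed to cover item_count items (ceiling division)."""
    return (item_count + group_size - 1) // group_size


def update_particle(index, positions, velocities, dt):
    """Body of the kernel: one thread advances one particle."""
    positions[index] += velocities[index] * dt


def dispatch(positions, velocities, dt, group_size=64):
    """Emulate a 1D dispatch; returns the number of groups launched."""
    n = len(positions)
    groups = group_count(n, group_size)
    for group_id in range(groups):
        for local_id in range(group_size):
            # Global thread index, built from the group and local IDs.
            index = group_id * group_size + local_id
            # The last group is usually only partly full, so guard the access.
            if index < n:
                update_particle(index, positions, velocities, dt)
    return groups


if __name__ == "__main__":
    positions = [0.0] * 100
    velocities = [2.0] * 100
    launched = dispatch(positions, velocities, dt=0.5)
    print("groups launched:", launched)
    print("first and last position:", positions[0], positions[-1])
```

Nothing in this loop refers to triangles, vertices, or fragments, which is the structural point: the work is indexed by thread ID rather than by position in the raster pipeline.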

A useful mental map is:

Vertex + Fragment: core raster graphics path
Geometry + Tessellation: optional geometry amplification/refinement stages
Compute: general parallel processing path that can feed or consume rendering data

Building Intuition for Real Projects

When debugging rendering issues, identify the stage boundary where wrong data first appears. If transformed positions are wrong before rasterization, inspect vertex logic. If geometry looks right but color/lighting is wrong, inspect fragment logic. If topology or subdivision is wrong, inspect geometry/tessellation stages. If preprocessing buffers are wrong, inspect compute kernels.

You can also profile by stage intent. High vertex cost often tracks dense meshes or heavy skinning. High fragment cost often tracks large screen coverage, expensive material math, or overdraw from transparent layers. That split gives immediate direction for optimization experiments, and if you want a deeper explanation of the interpolation step between the programmable stages, Scratchapixel’s lesson on rasterization is the most relevant follow-up.

Summary

Vertex and fragment shaders are different because they run on different units of work. Vertex shaders process mesh vertices and prepare the data that will be interpolated. Fragment shaders process rasterized fragments and compute final appearance.

If you keep that execution model in mind, most pipeline decisions become clearer:

Place coordinate-space math and varying preparation in vertex shaders.
Place pixel-accurate material and lighting logic in fragment shaders.
Use geometry and tessellation only when their specific capabilities are needed.
Use compute for general GPU tasks outside strict raster flow.

That model scales from simple demos to production renderers and makes shader code easier to reason about, optimize, and debug.