02. SRP frame loop and data flow (CPU↔GPU)
This chapter answers one question: what data does SRP ultimately build for each frame, and what commands does it send to the GPU? The answer is the foundation for understanding how RenderGraph, CommandBuffer, and shaders interact in later chapters.
2.1 Two layers of the frame: logic (render pass construction) vs execution (GPU instructions)
- Logic layer (Record/Build): determines which passes run this frame and configures the dependencies between them.
- Execution layer (Execute/Submit): converts the constructed passes into GPU instructions and executes them.
In URP, this separation is particularly sharp in RenderGraph. RecordRenderGraph() records the logic, and then URP is responsible for execution.
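The record/execute split can be sketched with a small URP pass. This is a minimal sketch, assuming Unity 6 (URP 17) with RenderGraph enabled; the class name ClearTintPass, the PassData type, and the pass label are hypothetical, and the namespaces may differ on older versions. Everything outside the render function runs while the graph is being recorded; only the render function body runs when the pass is executed.

```csharp
using UnityEngine;
using UnityEngine.Rendering;
using UnityEngine.Rendering.RenderGraphModule;
using UnityEngine.Rendering.Universal;

// Hypothetical pass: clears the active color target to a flat color.
// It exists only to show where "record" ends and "execute" begins.
public class ClearTintPass : ScriptableRenderPass
{
    // Data carried from the record step to the execute step.
    class PassData
    {
        public Color color;
    }

    // Logic layer: declare the pass and the resources it touches.
    public override void RecordRenderGraph(RenderGraph renderGraph, ContextContainer frameData)
    {
        var resourceData = frameData.Get<UniversalResourceData>();

        using (var builder = renderGraph.AddRasterRenderPass<PassData>("Clear Tint (sketch)", out var passData))
        {
            passData.color = Color.gray;

            // Declare that this pass writes to the camera's active color target.
            builder.SetRenderAttachment(resourceData.activeColorTexture, 0);

            // Execution layer: this function is called later, when the graph runs the pass.
            builder.SetRenderFunc((PassData data, RasterGraphContext context) =>
            {
                context.cmd.ClearRenderTarget(false, true, data.color);
            });
        }
    }
}
```

No GPU command is issued inside RecordRenderGraph() itself; the pass only describes what it will do and which target it writes.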
2.2 Three core types of the C# rendering loop
- RenderPipeline/RenderPipelineAsset
- ScriptableRenderContext (SRP’s “rendering submission” context)
- CommandBuffer (a container for stacking GPU instructions)
Commands accumulated in a CommandBuffer are scheduled on the ScriptableRenderContext, which submits them to the GPU at the appropriate time.
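To show how the three types fit together, here is a minimal sketch of a custom pipeline that only clears each camera's target. The class names MinimalPipelineAsset and MinimalPipeline and the menu path are hypothetical, and this is not how URP itself is structured; it only illustrates the roles of the asset, the context, and the command buffer.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical asset: Unity asks it to create the pipeline instance.
// In a real project, place each class in its own file.
[CreateAssetMenu(menuName = "Rendering/Minimal Pipeline Asset (sketch)")]
public class MinimalPipelineAsset : RenderPipelineAsset
{
    protected override RenderPipeline CreatePipeline() => new MinimalPipeline();
}

// Hypothetical pipeline: called by Unity once per frame with the active cameras.
public class MinimalPipeline : RenderPipeline
{
    protected override void Render(ScriptableRenderContext context, Camera[] cameras)
    {
        foreach (Camera camera in cameras)
        {
            // Set up view/projection matrices and the camera's render target.
            context.SetupCameraProperties(camera);

            // Stack GPU instructions in a CommandBuffer...
            var cmd = new CommandBuffer { name = "Clear (sketch)" };
            cmd.ClearRenderTarget(true, true, Color.black);

            // ...schedule them on the context, then release the buffer.
            context.ExecuteCommandBuffer(cmd);
            cmd.Release();

            // Nothing reaches the GPU until the context is submitted.
            context.Submit();
        }
    }
}
```

ExecuteCommandBuffer() only schedules the commands on the context; Submit() is the point where the scheduled work is handed over for execution.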
2.3 Per-camera rendering and frameData
SRP/URP usually renders on a per-camera basis, because the following data differs from camera to camera:
- Culling results (which renderers to draw)
- Camera matrix/projection
- Render target (screen or texture)
- Post-processing/volume parameters
In the URP RenderGraph path, the per-frame data shared by multiple subsystems is delivered through ContextContainer frameData. (For a specific example, see the RecordRenderGraph example in 04. RenderGraph.)
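As a small illustration of reading per-camera data from frameData, the following sketch fetches two of URP's frame data containers and exits early for cameras it does not target. It assumes Unity 6 (URP 17); the class name GameViewOnlyPass and the early-out policy are hypothetical choices for illustration.

```csharp
using UnityEngine;
using UnityEngine.Rendering;
using UnityEngine.Rendering.RenderGraphModule;
using UnityEngine.Rendering.Universal;

// Hypothetical pass: shows how per-camera data is read from frameData.
public class GameViewOnlyPass : ScriptableRenderPass
{
    public override void RecordRenderGraph(RenderGraph renderGraph, ContextContainer frameData)
    {
        // Camera-specific settings for the camera currently being rendered.
        UniversalCameraData cameraData = frameData.Get<UniversalCameraData>();

        // Texture handles for this camera's color/depth targets.
        UniversalResourceData resourceData = frameData.Get<UniversalResourceData>();

        // Assumed policy: skip Scene view, preview, and reflection cameras.
        if (cameraData.cameraType != CameraType.Game)
            return;

        // Assumed policy: skip when the active target is the back buffer,
        // which cannot be bound as an input texture.
        if (resourceData.isActiveTargetBackBuffer)
            return;

        TextureHandle color = resourceData.activeColorTexture;
        // Passes that read or write `color` would be recorded here.
    }
}
```

Because RecordRenderGraph() is called per camera, the values read from frameData describe only the camera being rendered at that moment.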
2.4 Data flow checklist (practical)
Whenever you add a new resource or piece of data, check the following:
- Which pass produces this data? (write)
- Which passes consume it? (read)
- Is the lifespan “this camera only” or “the entire frame”?
- What is the access type on the GPU? (sampled texture / render target / UAV / constant buffer)
- Is it safe with multiple cameras (Scene view / Game view / reflection)?
2.5 Culling: The step of deciding “what to draw”
Rendering is ultimately a matter of how many draw calls are issued, for what, and in what order. The starting point is culling.
Culling in SRP generally has the following flow:
- Obtain ScriptableCullingParameters from the camera
- Get CullingResults with context.Cull(ref parameters)
- Based on this result, configure which renderers are drawn by which pass
URP performs this step internally, but you still need to know the concepts of culling and filtering when writing a Render Feature/Pass, in order to design which objects the pass targets.
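The culling flow above can be sketched as a small helper for a custom SRP. The class and method names (CullingSketch, TryCull) are hypothetical; the two Unity calls it wraps are Camera.TryGetCullingParameters and ScriptableRenderContext.Cull.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical helper: wraps the two culling steps for one camera.
public static class CullingSketch
{
    public static bool TryCull(ScriptableRenderContext context, Camera camera, out CullingResults results)
    {
        // Step 1: derive culling parameters from the camera.
        // This fails for cameras with invalid settings (for example, a degenerate viewport).
        if (!camera.TryGetCullingParameters(out ScriptableCullingParameters parameters))
        {
            results = default;
            return false;
        }

        // Step 2: run culling and receive the visible renderers and lights.
        results = context.Cull(ref parameters);
        return true;
    }
}
```

A caller would typically skip the camera entirely when TryCull returns false, and otherwise pass the results on to the draw configuration step.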
2.6 Draw configuration: DrawingSettings / FilteringSettings / RenderStateBlock
“What to draw with which shader pass” is usually determined by:
- FilteringSettings: “Target” filters such as render queue range (opaque/transparent), layer mask, etc.
- DrawingSettings: Which ShaderTagId (=LightMode) to use, sorting criteria, per-object data, etc.
- RenderStateBlock: Override render states like blend/depth/stencil
If you want a Render Feature to draw only specific layers into an off-screen RT, or to build a mask by forcing only a specific LightMode to be selected, these three are the key tools.
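A minimal sketch of combining the three settings in a custom SRP follows. The helper name LayerMaskDrawSketch, the choice of the "UniversalForward" LightMode, and the depth-state override are assumptions for illustration. In the URP RenderGraph path the same three structs are typically fed into a renderer list rather than passed to DrawRenderers directly.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical helper: draws opaque objects on the given layers using one LightMode.
public static class LayerMaskDrawSketch
{
    // Assumed LightMode tag; replace with the tag used by your shaders.
    static readonly ShaderTagId k_ShaderTag = new ShaderTagId("UniversalForward");

    public static void Draw(ScriptableRenderContext context, Camera camera,
                            CullingResults cullingResults, LayerMask layers)
    {
        // DrawingSettings: which shader pass to use and how to sort.
        var sorting = new SortingSettings(camera) { criteria = SortingCriteria.CommonOpaque };
        var drawing = new DrawingSettings(k_ShaderTag, sorting)
        {
            perObjectData = PerObjectData.None
        };

        // FilteringSettings: which renderers are eligible (queue range + layer mask).
        var filtering = new FilteringSettings(RenderQueueRange.opaque, layers);

        // RenderStateBlock: override depth state only (test LessEqual, no depth write).
        var stateBlock = new RenderStateBlock(RenderStateMask.Depth)
        {
            depthState = new DepthState(false, CompareFunction.LessEqual)
        };

        context.DrawRenderers(cullingResults, ref drawing, ref filtering, ref stateBlock);
    }
}
```

Only the state named in the RenderStateMask is overridden; blend and stencil state still come from the shader.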
2.7 Why does NativeArray appear here? (a data-structure perspective)
NativeArray<T> appears frequently in Unity's rendering/culling/job system.
Key reasons:
- GC-free memory (native memory, not managed heap)
- Compatible with Job System/Burst
- Makes costs predictable when handling large amounts of data on a frame-by-frame basis.
To summarize what NativeArray does in one sentence: it is a container for handling the large array data that rendering deals with frequently (instances, lights, culling results, etc.) while keeping allocation, GC, and threading issues under control.
2.7.1 Points often encountered in practice
- When drawing many objects/instances (instancing data)
- Light list/cluster data (Forward+ series)
- When efficiently passing temporary data inside the RenderGraph/pipeline
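Caution: Lifetime management is important for NativeArray. It is not garbage-collected, so every allocation must be released with Dispose() once it is no longer needed.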
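The allocate/use/dispose lifecycle can be sketched as follows. The component name NativeArraySketch, the ScaleJob struct, and the array size and batch size are hypothetical; the point is that the array lives in native memory, can be handed to a job, and must be disposed explicitly.

```csharp
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical component: fills a NativeArray, scales it in a parallel job, then frees it.
public class NativeArraySketch : MonoBehaviour
{
    // A job that multiplies every element by a factor.
    struct ScaleJob : IJobParallelFor
    {
        public NativeArray<float> values;
        public float factor;

        public void Execute(int index)
        {
            values[index] *= factor;
        }
    }

    void Start()
    {
        // Native allocation: no managed-heap garbage is created for the array contents.
        var values = new NativeArray<float>(1024, Allocator.TempJob);
        try
        {
            for (int i = 0; i < values.Length; i++)
                values[i] = i;

            // Schedule across worker threads in batches of 64, then wait for completion.
            var job = new ScaleJob { values = values, factor = 2f };
            job.Schedule(values.Length, 64).Complete();

            Debug.Log($"values[10] after scaling: {values[10]}");
        }
        finally
        {
            // Required: the GC never reclaims this memory for you.
            values.Dispose();
        }
    }
}
```

Allocator.TempJob is intended for short-lived allocations that are passed to jobs; longer-lived data would use Allocator.Persistent and be disposed when its owner is destroyed.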
This book introduces only the core concepts needed to follow the internal implementation of URP. For actual API usage and Job patterns, separate materials are recommended.
2.8 CPU↔GPU Boundary: Submit, Synchronization, and “Why Do I See One Frame Late?”
Common questions asked when debugging rendering:
- “I changed the value, so why does it only take effect one frame later?”
- “Why does the frame time suddenly spike?”
Most of this comes from the separation between command recording on the CPU and execution on the GPU:
- The CPU records commands in a CommandBuffer
- SRP submits them at the appropriate time
- The GPU runs them on its own queue
So:
- If the GPU is busy, there is a waiting period for the CPU (stall),
- Conversely, if the CPU does too much work, the GPU becomes idle.
Understanding this boundary also makes it clearer why “resource lifetimes/dependencies” in RenderGraph are important.
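One way to observe this boundary is to ask the GPU for data and log when the answer arrives. The sketch below uses AsyncGPUReadback for that purpose; the component name ReadbackLatencyProbe and the serialized source field are hypothetical, and readback is used here only as a probe for latency, not as a recommended rendering technique.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Hypothetical probe: logs how many frames pass between requesting GPU data
// on the CPU and receiving it back.
public class ReadbackLatencyProbe : MonoBehaviour
{
    // Assumption: assign any RenderTexture that the GPU has rendered into.
    [SerializeField] RenderTexture source;

    void Start()
    {
        if (source == null)
            return;

        int requestFrame = Time.frameCount;

        // The request is queued like any other GPU work; the callback fires
        // on a later frame, once the GPU has executed it.
        AsyncGPUReadback.Request(source, 0, request =>
        {
            if (request.hasError)
            {
                Debug.LogWarning("GPU readback failed.");
                return;
            }

            Debug.Log($"Requested on frame {requestFrame}, completed on frame {Time.frameCount}.");
        });
    }
}
```

The gap between the two frame numbers depends on the platform and on how much work is queued, which is why a synchronous readback would instead force the CPU to stall until the GPU catches up.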
Further reading (official/authoritative sources)
- ScriptableRenderContext (overview): https://docs.unity3d.com/6000.3/Documentation/ScriptReference/Rendering.ScriptableRenderContext.html
- CommandBuffer (overview): https://docs.unity3d.com/6000.3/Documentation/ScriptReference/Rendering.CommandBuffer.html