Copy a texture to screen

For our sparse voxel octree renderer we’re going to bypass almost the entire GPU rendering pipeline and use a compute shader instead. ‘Almost’, because the pipeline is the only method to render onto the screen. In this post I’ll describe how to create a texture and copy that texture onto the screen. I also wanted to show how to use bindless textures, but unfortunately my GPU doesn’t support those.

Texture memory allocation

We’ve already set up an OpenGL 4.5 context with debugging, so allocating texture memory is next. You can either follow the debugging tutorial or download its project files; in either case, set SCREEN_FULLSCREEN to 0.

Since OpenGL 4.5 the glGen* methods are kind of deprecated. They create a ‘name’ for the object, but don’t initialize the object itself; initialization happens when the name is first bound using a glBind* method. This delayed initialization becomes a problem when you want to use Direct State Access (DSA), which lets you modify objects without having to bind them first. Hence the glCreate* methods have to be used, which create a ‘name’ and also initialize the object. The goal of DSA is to reduce driver overhead, though it also simplifies your code.
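For comparison, the two styles look roughly like this for a texture. The pre-DSA version is shown only to illustrate the bind-to-edit pattern; this tutorial doesn’t use it.

// Old style: generate a name, then bind it; the texture object is
// initialized on first bind and modified through the binding point.
GLuint tex_old;
glGenTextures(1, &tex_old);
glBindTexture(GL_TEXTURE_2D, tex_old);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);

// DSA style: the texture object is fully initialized on creation
// and modified directly by name, without binding it.
GLuint tex_new;
glCreateTextures(GL_TEXTURE_2D, 1, &tex_new);
glTextureParameteri(tex_new, GL_TEXTURE_MIN_FILTER, GL_NEAREST);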

Since OpenGL 4.2 it is also possible to allocate all of a texture’s memory up-front, thereby creating a so-called immutable-format texture: its size and format are fixed from then on, only its contents can still change. The purpose of immutable-format textures is (again) to reduce driver overhead.

With those two changes explained we will allocate GPU memory for our 960×540 texture, using glCreateTextures and glTextureStorage2D:

GLuint texture;
glCreateTextures(GL_TEXTURE_2D, 1, &texture);
glTextureStorage2D(texture, 1, GL_RGBA8, 960, 540);

Uploading texture data to an immutable-format texture must be performed using glTextureSubImage2D. The method glTexImage2D is no longer allowed as it could change the format (note that there is also no DSA variant of this method). For the sake of completeness of this tutorial, we will upload some dummy data.

GLuint image_data[960*540];
for (int i=0; i<960*540; i++) {
  image_data[i] = rand();
}
glTextureSubImage2D(texture, 0, 0, 0, 960, 540, GL_RGBA, 
    GL_UNSIGNED_BYTE, image_data);

Buffers and vertex arrays

The only way to render something to the screen is through the GPU’s rendering pipeline. So we need to render two triangles using the texture we just created. To render these triangles, we somehow need to tell the GPU what the triangle’s coordinates are. There used to be a method glVertexPointer to do just that. However, that is not available in OpenGL core profiles. OpenGL core doesn’t assume anything about what data you want to attach to vertices (for example, you might want to compute the vertex coordinates inside the shader). So instead, a generic way of attaching data to vertices is provided: the vertex array object (VAO).

A major downside of the old gl*Pointer methods was that the vertex attribute information had to be set up and torn down for every draw call. Doing so incorrectly would mess up the OpenGL client state and affect all future draw calls. This is also solved by VAOs, which store all the vertex attribute information and can be swapped in and out using a single glBindVertexArray call. The VAO still needs to be bound, as there is no bindless version of glDrawArrays, glDrawElements, etc. Note that VAOs don’t store the vertex data itself; instead they store a reference to the buffer containing the attribute data. Like textures, buffers can be created with an immutable format and have DSA methods to manipulate them.
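To make the difference concrete, here is roughly what the deprecated client-array pattern looked like in the compatibility profile, compared with a VAO-based draw (using the buffer data and VAO we create below). The old version is shown for contrast only and isn’t used anywhere in this tutorial.

// Deprecated pattern: attribute state is set up and torn down
// around every draw call and lives in global client state.
glEnableClientState(GL_VERTEX_ARRAY);
glVertexPointer(2, GL_FLOAT, 0, data);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
glDisableClientState(GL_VERTEX_ARRAY);

// With a VAO, all attribute state is captured once up front and
// swapped in with a single bind at draw time.
glBindVertexArray(array);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);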

The API for VAOs has changed often. An answer on Stack Overflow was very useful for figuring out which methods I should be using in OpenGL 4.5.

Creating the buffer

Buffers are created using glCreateBuffers, after which GPU memory is allocated for its contents using glNamedBufferStorage. The method glNamedBufferStorage accepts a data parameter. If data is not NULL, its contents are copied into the allocated buffer storage. We use this to initialize the buffer with these four vertices: (-1,-1), (-1,1), (1,-1), (1,1).

GLfloat data[8]={
  -1,-1, -1, 1,
   1,-1,  1, 1,
};
GLuint buffer;
glCreateBuffers(1, &buffer);
glNamedBufferStorage(buffer, sizeof(data), data, 0);
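As a side note, passing 0 as the flags parameter means the buffer’s contents can no longer be updated directly by the application, which is fine here since the vertices never change. If you did want to update the buffer later with glNamedBufferSubData, you would have to request that up front; a minimal sketch of that variant:

// Variant (not used in this tutorial): allow later client updates.
glNamedBufferStorage(buffer, sizeof(data), data, GL_DYNAMIC_STORAGE_BIT);
glNamedBufferSubData(buffer, 0, sizeof(data), data);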

Creating the VAO

The second step is to call glCreateVertexArrays to create a VAO and call glVertexArrayVertexBuffer to attach the buffer to it. The buffer is attached at index buffer_index and we also tell OpenGL that the elements in the buffer are 8 bytes each (i.e. 2 floats).

int buffer_index = 0;
GLuint array;
glCreateVertexArrays(1, &array);
glVertexArrayVertexBuffer(
    array, buffer_index, buffer, 0, sizeof(GLfloat)*2);

Adding an attribute to the VAO

The third step is to add an attribute to our VAO. First we call glEnableVertexArrayAttrib to tell OpenGL that we have an attribute at index position_attrib. This index is used by shaders to refer to the attribute using layout(location=0). Then we specify the format of the attribute using glVertexArrayAttribFormat: two floats, stored immediately at the start of each element. And last we use glVertexArrayAttribBinding to tell OpenGL from what buffer it should obtain the data.

int position_attrib = 0;
glEnableVertexArrayAttrib(array, position_attrib);
glVertexArrayAttribFormat(
    array, position_attrib, 2, GL_FLOAT, GL_FALSE, 0);
glVertexArrayAttribBinding(
    array, position_attrib, buffer_index); 

The shader

With an OpenGL core profile there is no fixed-function pipeline, so whenever you want to render something you have to use shaders to do so. Shader programs also have DSA methods, and we create our shader program object with glCreateProgram. Furthermore, we use glObjectLabel to give our program a name, which will be used by OpenGL in debug output.

GLuint program = glCreateProgram();
glObjectLabel(GL_PROGRAM, program, -1, "TextureCopy");

Attaching the shader code

For attaching the shader code we first define a little helper function which calls the methods glCreateShader, glShaderSource, glCompileShader, glAttachShader and glDeleteShader in order. The code can be passed to glShaderSource as a single ‘line’ containing line breaks. Shaders are reference counted, so calling glDeleteShader only marks the shader for deletion; it is actually destroyed once it is no longer attached to any program. (Note: I’m not sure whether the call to glCompileShader is necessary, as my OpenGL driver would otherwise compile the shaders when linking. However, that might be a liberal interpretation of the specs. I also don’t check the compile status and log, as a compile failure would cause linking to fail as well, and my driver includes the compile log in the link log. A sketch of such a check is shown after the helper.)

void attach_shader
(GLuint program, GLenum type, const char * code) {
  GLuint shader = glCreateShader(type);
  glShaderSource(shader, 1, &code, NULL);
  glCompileShader(shader);
  glAttachShader(program, shader);
  glDeleteShader(shader);
}
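If you do want to check the compile result of each shader, a minimal sketch could look like this; it would go right after the glCompileShader call in the helper above.

GLint compiled;
glGetShaderiv(shader, GL_COMPILE_STATUS, &compiled);
if (compiled != GL_TRUE) {
  char log[10240];
  glGetShaderInfoLog(shader, 10240, NULL, log);
  fprintf(stderr, "Compiling shader failed:\n%s\n", log);
  abort();
}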

Now we can attach the shader code. For the sake of simplicity, I embedded the code in C++11 raw strings. Though I would recommend reading them from a file, as that saves you from recompiling the program every time you tweak a shader.
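A small helper for reading a shader from a file could look like the sketch below; the read_file function and the file name are just an illustration, not part of the tutorial’s code.

#include <fstream>
#include <sstream>
#include <string>

// Read an entire file into a string (illustrative helper).
std::string read_file(const char * path) {
  std::ifstream file(path);
  std::stringstream ss;
  ss << file.rdbuf();  // copy the whole file into the stream
  return ss.str();
}

// Usage, assuming the fragment shader lives in a file called copy.frag:
// std::string code = read_file("copy.frag");
// attach_shader(program, GL_FRAGMENT_SHADER, code.c_str());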

Vertex shader

Every shader must start with a line specifying the GLSL version being used. We use GLSL 4.50 core, which is the version that came with OpenGL 4.5. As input data we have our coordinates. The only thing our shader has to do is set the vertex shader output variable gl_Position to this coordinate. This is not as trivial as it might seem. You have to know that screen coordinates (normalized device coordinates) run from -1 to 1, which is why I put those 4 coordinates in the vertex buffer, and that OpenGL expects homogeneous coordinates, which is why I set gl_Position.w to 1. The vertex will end up at position gl_Position.xy / gl_Position.w on the screen.

attach_shader(program, GL_VERTEX_SHADER, R"(
  #version 450 core
  layout(location=0) in vec2 coord;
  void main(void) {
    gl_Position = vec4(coord, 0.0, 1.0);
  }
)");

Fragment shader

We have at least three methods for reading pixel data from our texture. I’ve chosen the imageLoad/imageStore approach, as I’ll be using it in the compute shader. I could also have used texture (called texture2D in older GLSL) or texelFetch with a texture sampler; performance-wise there is barely any difference (at least for my GPU and driver). A sketch of that alternative is shown after the shader below. The image is specified as readonly, as we’re only reading from it in this shader; restrict tells the compiler that the shader won’t access the image’s memory through any other means; uniform (still) means that this variable will be the same for all shader invocations in this draw call; and we have to specify the image’s format, which is rgba8.

There is no default color output variable; however, the output at location 0 is used for this, so we define a color variable such that we can write to that location. The image and the screen use the same convention, with the lower-left corner being the origin (0,0). The imageLoad method wants the integer coordinate of the pixel that we want to look up in the texture. Luckily, gl_FragCoord.xy happens to contain the screen’s pixel coordinate, so we only need to cast it to a 2D integer vector before we pass it to imageLoad. (The values gl_FragCoord.z and gl_FragCoord.w also have a meaning, namely the fragment’s depth value and the reciprocal of the interpolated clip-space w, but we don’t need them here.)

attach_shader(program, GL_FRAGMENT_SHADER, R"(
  #version 450 core
  readonly restrict uniform layout(rgba8) image2D image;
  layout(location=0) out vec4 color;
  void main(void) {
    color = imageLoad(image, ivec4(gl_FragCoord).xy);
  }
)");

Linking the shader program

Before we can use our shader program we still have to link it. Linking is performed with glLinkProgram. The result, success or failure, must be queried using glGetProgramiv. And in case of an error, the link log must be obtained using glGetProgramInfoLog.

glLinkProgram(program);
GLint result;
glGetProgramiv(program, GL_LINK_STATUS, &result);
if (result != GL_TRUE) {
  char msg[10240];
  glGetProgramInfoLog(program, 10240, NULL, msg);
  fprintf(stderr, "Linking program failed:\n%s\n", msg);
  abort();
}

The draw call

Now we have everything prepared and can perform the method calls necessary to render the texture to the screen. First we must tell OpenGL which shader program we want to use, with glUseProgram. Then we have to bind our texture to an image unit, with glBindImageTexture. Note that an image unit is not the same as an image variable, so we use glUniform1i to tell OpenGL that the image variable in our fragment shader (which has location 0) should use image unit 0. We also have to bind our VAO, for which we use glBindVertexArray. And now we’re finally ready to call glDrawArrays to copy our texture to the screen.

We’re planning on using a compute shader to render to our texture. When we do so, OpenGL requires us to tell it that it must synchronize its memory before reading from the texture. This is done with glMemoryBarrier, which expects a parameter describing how the memory will be accessed after the barrier. We will be reading from the texture using imageLoad, so we have to pass GL_SHADER_IMAGE_ACCESS_BARRIER_BIT.

glUseProgram(program);
glBindImageTexture(
  0, texture, 0, GL_FALSE, 0, GL_READ_ONLY, GL_RGBA8);
glUniform1i(0, 0);
glBindVertexArray(array);
glMemoryBarrier(GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

That’s all. The result is available as a zip file. In the next post I’ll show how to do performance testing with OpenGL.

Update: The method glUniform1i can be replaced by the DSA method glProgramUniform1i.
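With the values used in the draw call above, that would become the following one-liner; unlike glUniform1i, it doesn’t require the program to be bound with glUseProgram first.

glProgramUniform1i(program, 0, 0);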
