Challenge #6: Extract Dimensions from PNG file

For this challenge, the task is to create a flow that fetches the following image file and extracts the width and height from the image data:

To do that, we need to fetch the PNG file and look at its contents.

The Spec
First, let’s take a look at the PNG file format specification.

It tells us that a valid PNG file starts with a fixed sequence of 8 bytes, often called the ‘magic number’.
Then we should expect an IHDR chunk. The chunk is prefixed with 4 bytes containing the chunk length, which is always 13 for IHDR. Then we expect the ASCII characters “IHDR”, followed by the chunk contents.

The IHDR chunk contents are: 4 bytes representing image width, 4 bytes representing image height, followed by 5 more bytes that we’re not interested in.

In summary, the binary structure looks like this:

Block         Size     Content
magic number  8 bytes  137, 80, 78, 71, 13, 10, 26, 10
IHDR length   4 bytes  0x0000000d = 13
IHDR name     4 bytes  0x49484452 = ASCII “IHDR”
image width   4 bytes  variable
image height  4 bytes  variable
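
To make the layout concrete, here is a minimal Python sketch of the same structure. It is not part of the flow; the names PNG_HEADER and read_dimensions are my own, and the sample bytes are a synthetic header rather than the challenge image. PNG stores its integers big-endian, which is what the ">" in the format string says.

```python
import struct

# Layout of the first 24 bytes of a PNG file, big-endian (">"):
#   8s  magic number
#   I   IHDR chunk length (always 13)
#   4s  chunk name ("IHDR")
#   I   image width
#   I   image height
PNG_HEADER = struct.Struct(">8sI4sII")


def read_dimensions(data):
    """Return (width, height) read from the start of a PNG byte string."""
    magic, length, name, width, height = PNG_HEADER.unpack_from(data)
    return width, height


# Synthetic header for a 640x480 image (hypothetical, not the challenge file).
sample = bytes([137, 80, 78, 71, 13, 10, 26, 10]) + struct.pack(
    ">I4sII", 13, b"IHDR", 640, 480
)

print(read_dimensions(sample))  # (640, 480)
```

Only the first 24 bytes are needed, so the rest of the file never has to be parsed.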

Since we’re looking at a single file, I can use a control flow. I start by ensuring the standard library is available and setting up VFS, so I can read files from HTTP as if they were local.

The Flow

The logic is simple. Get the binary file contents of our target file, and use a calculator to dig into the contents.


I’m using the Read Binary File step to get binary contents of our image file. I just copy-pasted the image URL directly from the browser.

The calculator step looks at the binary contents of the file and validates them. If the file contents were invalid, the checks on the magic number, the IHDR length, and the IHDR name would fail.
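
For readers who want to try the same checks outside the flow, here is a rough Python equivalent. It is only a sketch of the validation idea, not the calculator step’s actual code; validate_png_header and the sample byte strings are hypothetical.

```python
import struct

PNG_MAGIC = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def validate_png_header(data):
    """Raise ValueError unless data starts with the PNG magic number and an IHDR chunk."""
    if len(data) < 24:
        raise ValueError("too short to hold a PNG header")
    if data[:8] != PNG_MAGIC:
        raise ValueError("magic number mismatch")
    (length,) = struct.unpack(">I", data[8:12])  # big-endian chunk length
    if length != 13:
        raise ValueError(f"unexpected IHDR length: {length}")
    if data[12:16] != b"IHDR":
        raise ValueError("first chunk is not IHDR")


# A well-formed synthetic header passes silently.
good = PNG_MAGIC + struct.pack(">I4sII", 13, b"IHDR", 1, 1)
validate_png_header(good)

# Anything else is rejected at the first failing check.
bad = b"GIF89a" + bytes(18)
try:
    validate_png_header(bad)
except ValueError as err:
    print("rejected:", err)
```

Checking the magic number first means a non-PNG file is rejected before any of its bytes are interpreted as a length or a name.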

The step uses some functions from the standard library that help with binary data, such as:

But as it happens, our image is a perfectly fine PNG file and we get the expected result:

Here’s the complete read_png.cfl (29.6 KB) file.