The Draw and Print Machine
I’d been eagerly awaiting a way to use OpenAI’s chat-integrated 4o image generation tools programmatically. So as soon as they rolled out their gpt-image-1 API, I got to work:
The problem
My daughters’ desire to draw, color, and paint is inexhaustible, especially on rainy days.
But especially on those rainy days, screen time is hard to resist.
A stack of Crayola printables & “free coloring page for kids” Google image searches provide almost enough structure to keep us all happy and entertained, but it’s not quite creative enough … and it’s a little too easy to get sucked into endless scrolling through those search results.
With all the typing, browsing, selecting, right-clicking, and printing, it also requires far more frequent adult intervention than anyone wants!
The vision
A friendly web application where kids can:
Draw a picture on the screen using an MSPaint-style user interface — in as much or as little detail as they want.
With one click, convert their image to a printable coloring page by sending it to ChatGPT with a request to invoke the image_generation tool and a … “convert this image to a printable coloring page” prompt.
Automatically print the resulting image.
Without needing to press a print key.
And without previewing it on screen beforehand — it’s exciting to run across the house and watch the printer for the big reveal!
Automatically save both the original (human, MSPaint) input and the (AI, coloring page) output side-by-side into a portfolio for future browsing and reprint.
Not accidentally spend hundreds of dollars in OpenAI API credits.
The prototype: backend
The OpenAI integration that does the “heavy lifting” server-side is just one API call:
import base64

from django.conf import settings
from openai import OpenAI


def generate(image, age):
    image = base64.b64encode(image.read()).decode("utf-8")
    client = OpenAI(api_key=settings.AI_IMAGES_OPENAI_API_KEY)
    openai_response = client.responses.create(
        model="gpt-4.1-mini",
        tools=[{
            "type": "image_generation",
            "size": "1024x1536",
        }],
        input=[{
            "role": "user",
            "content": [
                {"type": "input_text", "text": f"Convert this image to clean black and white line art suitable for children's coloring pages with a high level of detail and an interesting background pattern. Ensure the result is suitable for printing on a standard 8.5x11 sheet of printer paper (either portrait or landscape is fine) for a child to color with crayon or marker. The level of detail should be appropriate for a child aged {age} years old."},
                {"type": "input_image",
                 "image_url": f"data:image/jpeg;base64,{image}",
                },
            ],
        }],
    )
    # Pull the first image_generation_call result out of the response
    return [
        f"data:image/jpeg;base64,{output.result}"
        for output in openai_response.output
        if output.type == "image_generation_call"
    ][0]
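Since generate() hands back a data URL rather than raw bytes, the portfolio-saving step needs to decode it at some point. A minimal sketch of that decoding (the helper name and shape are mine, not code from the app):

```python
import base64


def save_data_url(data_url, path):
    """Decode a data: URL (like the one generate() returns) and write the raw image bytes to disk."""
    # Everything after the first comma is the base64 payload
    _header, payload = data_url.split(",", 1)
    with open(path, "wb") as f:
        f.write(base64.b64decode(payload))
```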
Because these gpt-4.1 models have image generation fully integrated into a conversational flow that can also see image files as inputs, we don’t actually need to describe what we want to generate, and so our server never needs to have any idea what kind of picture it’s receiving from the end user.
This is really neat. The code stays simple, and we don’t risk stripping out details or aesthetic considerations in an imprecise image → text → image translation. We can just let the user and the AI decide together what’s meaningful by communicating — almost entirely and almost without mediation — in pictures.
The prototype: front end
The front end is also quite lightweight. I grabbed a modern HTML5 MSPaint clone (@christianliebel/paint) that provides a <paint-app> web component, and dropped it on a page with a form:
<form enctype="multipart/form-data" method="post">
  <paint-app></paint-app>
  <!-- Hidden file input that the submit handler below fills with the canvas export (field name is illustrative) -->
  <input type="file" name="image" hidden>
  <input type="submit" id="submit-button" value="I'm done!">
  <div id="loading-indicator" style="display: none">Generating…</div>
</form>
<script
  src="https://unpkg.com/@christianliebel/paint/dist/elements/index.js"
  type="module"></script>
Then we just need to extract the painted image when it’s time to send it to the server:
document.addEventListener('DOMContentLoaded', function() {
  const form = document.querySelector('form');
  form.addEventListener('submit', function(event) {
    event.preventDefault();
    // Get the actual drawing canvas from the paint app
    const canvas = document.querySelector('paint-app').shadowRoot.querySelector('paint-canvas').shadowRoot.querySelector('canvas.main');
    canvas.toBlob(blob => {
      const file = new File([blob], 'canvas-image.jpg', { type: 'image/jpeg' });
      const fileInput = form.querySelector('input[type=file]');
      const dataTransfer = new DataTransfer();
      dataTransfer.items.add(file);
      fileInput.files = dataTransfer.files;
      document.getElementById("submit-button").disabled = true;
      document.getElementById("loading-indicator").style.display = "block";
      form.submit();
    }, 'image/jpeg', 1);
  });
});
The prototype: keeping costs manageable
At this point I had something that worked really well and totally delighted my daughters.
We spent an hour or two taking turns drawing, printing, crayoning, & sharing, and we accumulated some really nice coloring pages:
My parents and sister tried it too, and we were pleasantly surprised to see how well it picked up on aesthetic differences. It’s not always floating characters with smiles and rounded strokes:
When I checked my OpenAI account later that day, I noticed that we had also accumulated $10+ in usage costs. Oops! That won’t scale very well unattended.
Luckily, this turned out to be a pretty easy fix. In my haste to get images from MSPaint to ChatGPT, I had neglected to think about file size — so in my input prompts, I was sending ultra-high-resolution exports of my daughters’ line drawings.
I put in three optimizations:
Send the exported image as a webp file, not a jpeg
Adjust the quality level of the export
Scale the canvas down to a fixed small size during export, by drawing a copy of the original canvas onto a new, smaller canvas
const canvas = document.querySelector('paint-app').shadowRoot.querySelector('paint-canvas').shadowRoot.querySelector('canvas.main');
// Resize the canvas down to 1024px max dimension
const maxDim = 1024;
const scale = maxDim / Math.max(canvas.width, canvas.height);
const scaledCanvas = document.createElement('canvas');
scaledCanvas.width = canvas.width * scale;
scaledCanvas.height = canvas.height * scale;
const ctx = scaledCanvas.getContext('2d');
ctx.drawImage(canvas, 0, 0, scaledCanvas.width, scaledCanvas.height);
The last was the one that really mattered. After a few rounds of tuning I landed on 1024px as a good maximum dimension: small enough that each printable costs ten cents or less, but large enough to preserve the intent & details of the original image. (When I got down to 512px or lower, ChatGPT couldn’t really “see” what was in the low-resolution images at all, and just started improvising its own printables, untethered from the input.)
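The same sizing rule is easy to mirror server-side as a sanity check on incoming uploads. A small Python sketch of the math (the guard against upscaling is my addition — the client-side code above scales unconditionally):

```python
def scaled_size(width, height, max_dim=1024):
    """Dimensions after scaling the longer side down to max_dim (never upscaling)."""
    scale = min(1, max_dim / max(width, height))
    return (round(width * scale), round(height * scale))
```

So a full-resolution 2048×1536 export comes back as 1024×768, while anything already small passes through untouched.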
The prototype: a browseable portfolio
This part’s just some pretty standard Django, and I vibe-coded most of it. Nothing to say here! Here’s a screenshot:
But… when reviewing our portfolio, I noticed the exports of our initial drawings had a ton of whitespace:
That happens because I’m naively exporting a copy of the entire drawing <canvas> — including a large portion that’s out of the initial viewport, which users will very rarely bother to scroll and draw in.
So I asked Claude to give me a smart-cropping algorithm we can run before export. This works perfectly: first it checks a corner pixel to determine the background color, then scans the full image to get a bounding box for the actually-drawn-upon area.
// Read back the raw RGBA pixel data from the drawing canvas
const ctx = canvas.getContext('2d');
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
const data = imageData.data;

// First, detect the background color by sampling corners
const getPixel = (x, y) => {
  const idx = (y * canvas.width + x) * 4;
  return [data[idx], data[idx + 1], data[idx + 2], data[idx + 3]];
};
const bgColor = getPixel(0, 0); // @@TODO sample all four corners and pick the most prevalent

const isBackground = (r, g, b, a) => {
  // Check if pixel matches background color (with tolerance for slight variations)
  const tolerance = 15;
  return Math.abs(r - bgColor[0]) <= tolerance &&
    Math.abs(g - bgColor[1]) <= tolerance &&
    Math.abs(b - bgColor[2]) <= tolerance &&
    Math.abs(a - bgColor[3]) <= tolerance;
};

let minX = canvas.width, minY = canvas.height, maxX = 0, maxY = 0;
let hasContent = false;

// Scan for non-background pixels
for (let y = 0; y < canvas.height; y++) {
  for (let x = 0; x < canvas.width; x++) {
    const idx = (y * canvas.width + x) * 4;
    const r = data[idx];
    const g = data[idx + 1];
    const b = data[idx + 2];
    const a = data[idx + 3];
    // If pixel is different from background, it's part of the drawing
    if (!isBackground(r, g, b, a)) {
      hasContent = true;
      minX = Math.min(minX, x);
      minY = Math.min(minY, y);
      maxX = Math.max(maxX, x);
      maxY = Math.max(maxY, y);
    }
  }
}

// If no content found, use a small centered area
if (!hasContent) {
  const centerX = canvas.width / 2;
  const centerY = canvas.height / 2;
  const size = 100;
  minX = centerX - size;
  minY = centerY - size;
  maxX = centerX + size;
  maxY = centerY + size;
}

// Add padding around the drawing (20px or 10% of size, whichever is smaller)
const padding = Math.min(20, Math.max((maxX - minX) * 0.1, (maxY - minY) * 0.1));
minX = Math.max(0, minX - padding);
minY = Math.max(0, minY - padding);
maxX = Math.min(canvas.width, maxX + padding);
maxY = Math.min(canvas.height, maxY + padding);
const cropWidth = maxX - minX;
const cropHeight = maxY - minY;

// Create cropped canvas
const croppedCanvas = document.createElement('canvas');
croppedCanvas.width = cropWidth;
croppedCanvas.height = cropHeight;
const croppedCtx = croppedCanvas.getContext('2d');

// Draw the cropped area
croppedCtx.drawImage(canvas, minX, minY, cropWidth, cropHeight, 0, 0, cropWidth, cropHeight);
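The bounding-box scan itself is pure pixel math, so it’s easy to test in isolation. Here’s the same idea as a standalone Python function (a sketch of the algorithm, not code from the app):

```python
def content_bbox(pixels, width, height, bg, tolerance=15):
    """Return (min_x, min_y, max_x, max_y) covering every pixel that differs
    from the background color bg by more than tolerance, or None if blank.
    pixels is a flat list of (r, g, b, a) tuples in row-major order."""
    box = None
    for y in range(height):
        for x in range(width):
            pixel = pixels[y * width + x]
            # A pixel is "content" if any channel strays beyond the tolerance
            if any(abs(c - b) > tolerance for c, b in zip(pixel, bg)):
                if box is None:
                    box = [x, y, x, y]
                else:
                    box[0] = min(box[0], x)
                    box[1] = min(box[1], y)
                    box[2] = max(box[2], x)
                    box[3] = max(box[3], y)
    return tuple(box) if box else None
```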
Much better now, and as an added bonus, it means our scaled-down exports are much less lossy:
Make Your Own Printables
You can try the resulting application here:
You’ll need to provide your own OpenAI API key [1] — and make sure that you have the image generation models enabled. (This for some reason requires that you provide a photo of your driver’s license. Creepy!)
The only remaining requirement is the auto-printing:
Automatically print the resulting image.
Without needing to press a print key!
And without previewing it on screen beforehand — it’s exciting to watch a printer for the big reveal!
This is a whole thing — web browsers really aren’t supposed to just take over your printer — so I’ll tackle it on the next rainy day. More soon!
[1] If you don’t feel like going through all that and/or giving me your API key, get in touch — I’ll probably let you just use ours.