b.splendous.net

test images for leaking metadata

May 2021 / tech

One of the initial reasons to move to 11ty was to add responsive images. Many of the original images were huge, and I wanted to generate (and serve!) appropriate resolutions.

As a side effect, I thought that it might trim out some sensitive metadata like GPS locations. At some point I realized that many of the full-size originals in git still contained GPS information.

When I realized I was serving images with GPS metadata, the easiest fix was to just trim that metadata out, which I did with exiftool and the command below. Note that this runs on all files (not just images, and recursively); this worked for me at the time but it's worth double-checking your path before running something like this.

exiftool -gps:all= -xmp:all= `find . -type f`

Remembering to do this was overhead, though, so I was curious if generating responsive images would be a way to trim this out.

Responsive Images

I'm using the Local Responsive Images Eleventy plugin (eleventy-plugin-local-respimg on npm) to generate responsive images.

The docs are a bit bare-bones, but starting with the sample in the project's README put me on the right track. My .eleventy.js snippet is pretty simple:

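const pluginLocalRespimg = require('eleventy-plugin-local-respimg');

module.exports = function (eleventyConfig) {
  eleventyConfig.addPlugin(pluginLocalRespimg, {
    folders: {
      source: 'src', // folder the original images are read from
      output: 'dist', // folder the generated site is written to
    },
    images: {
      resize: {
        min: 250, // smallest generated width
        max: 1250, // largest generated width
        step: 500, // width difference between generated images
      },
      sizes: '100vw',
      lazy: true,
      watch: { src: 'images/**/*' },
    },
  });
};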

It's been a while, but I think that's about it! I tweaked a few things, like increasing the step (500 in the sample above, up from 150 in the recommended configuration) so that fewer images are generated, which keeps my builds fast and avoids unnecessary storage.
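To make the effect of the step concrete, here is a small sketch of how many widths a given min/max/step combination produces. It assumes the plugin simply walks from min to max in increments of step; resizeWidths is a hypothetical helper written for this post, not part of the plugin.

```javascript
// Hypothetical helper: lists widths from min up to max in increments of step.
// Assumption: the plugin treats `step` as the gap between generated widths.
function resizeWidths({ min, max, step }) {
  const widths = [];
  for (let width = min; width <= max; width += step) {
    widths.push(width);
  }
  return widths;
}

// The configuration above yields three widths per image.
console.log(resizeWidths({ min: 250, max: 1250, step: 500 })); // [ 250, 750, 1250 ]

// The same range with a step of 150 yields seven.
console.log(resizeWidths({ min: 250, max: 1250, step: 150 }).length); // 7
```

Under that assumption, the larger step cuts the number of generated files per image by more than half.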

Testing Metadata

So, now to test the hypothesis: that the generated files wouldn't have any of the concerning metadata. I spot-checked a few and they didn't, so I felt good about it, but I decided to make sure it continued to work with an automated test.

A small script seemed simplest. I've recently switched to fish shell, but I've only been using it interactively (I haven't tried scripting). This seemed like a good opportunity to try a script. The syntax was a bit different, but the fish reference was very helpful.

There's a bit more framing in my test, but this is the meat of it. Note that my code formatter doesn't currently support fish, so this is highlighted as bash.

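set EXIT 0

# $TESTDIR is set elsewhere in the script (not shown).
for image in (find $TESTDIR -iname '*.jpg' -or -iname '*.jpeg' \
        -or -iname '*.gif' -or -iname '*.webp')
    echo $image
    # exiftool prints nothing when the requested tags are absent.
    set GPSOUT (exiftool -gpslatitude -gpslongitude $image)
    set XMPOUT (exiftool -xmp $image)
    # Any output means the image still carries GPS or XMP metadata.
    if test -n "$GPSOUT" -o -n "$XMPOUT"
        echo $GPSOUT
        echo $XMPOUT
        set EXIT 1
    end
end

exit $EXIT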

With the test written, I reverted the commits where I'd removed the metadata with exiftool. The test failed on the reverted files, and passed on my output directories (after going through generation), so it looks like my hypothesis was correct! The GPS information is trimmed off of the responsive image files, so I don't need to worry about that any longer.

I wired this up into my package.json, so I can run npm run test (when I remember, which is not often) and have my output dir scanned for files containing GPS information.
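For anyone who would rather keep the check in JavaScript alongside the rest of an npm project, here is a rough sketch of the same idea in Node. It is not the fish script above, just an equivalent under a few assumptions: exiftool is on the PATH, the names listImages, runExiftool, and findLeaks are made up for illustration, and the runner is passed in as a parameter so the logic can be exercised without exiftool installed.

```javascript
const fs = require('fs');
const path = require('path');
const { execFileSync } = require('child_process');

// Same extensions as the find call in the fish script.
const IMAGE_EXTENSIONS = new Set(['.jpg', '.jpeg', '.gif', '.webp']);

// Recursively collect image paths under dir.
function listImages(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...listImages(fullPath));
    } else if (IMAGE_EXTENSIONS.has(path.extname(entry.name).toLowerCase())) {
      found.push(fullPath);
    }
  }
  return found;
}

// Default runner: asks exiftool for the same tags the fish script checks.
// Throws if exiftool is missing or exits with an error.
function runExiftool(file) {
  return execFileSync(
    'exiftool',
    ['-gpslatitude', '-gpslongitude', '-xmp', file],
    { encoding: 'utf8' }
  );
}

// Returns the images for which the runner produced any output,
// i.e. the ones that still carry GPS or XMP metadata.
function findLeaks(dir, run = runExiftool) {
  return listImages(dir).filter((file) => run(file).trim() !== '');
}
```

A test script could call findLeaks('dist'), print whatever it returns, and exit non-zero when the list is not empty, which mirrors what the EXIT variable does in the fish version.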