This is going to be one of the dullest posts ever, but if you have ever had to archive hundreds or thousands of emails on a tight schedule, you'll want to read on.
I'm an employee of the State of Washington. As such, my communications with students, colleagues, and others are subject to public records requests under state law (RCW 42.56). It's recently come to my attention that many of my colleagues at the University of Washington have struggled to compile records in response to public records requests. Given that most communication takes place by email these days, the key task is to sort through emails to identify relevant records, then push them through the bureaucracy to the University office that deals directly with requesters.
So imagine a requester notifies the University that they are interested in, say, 5,000+ of your emails spanning a period of several years. What do you do? This guide pertains to those who use Gmail as their email client and Google Chrome as their browser. With these tools, plus a very handy plug-in from a company called cloudHQ, the task is... well, I won't say easy, but much less daunting.
Step 1: Identify requested records
There are some simple tips for searching your Gmail archive effectively. You can delimit your search in several ways, and use Boolean operators (AND, OR, NOT -- in caps just like that). A first suggestion: it's easier to work with emails if you "unthread" them. This means that each item in your results is just that -- a single email -- rather than an entire back-and-forth conversation. You can toggle this in Gmail's settings by turning conversation view off.
Got that? Good! So suppose you need to release all email sent to or received from your buddy Pat at the Smithsonian. In the search box, type:
to:[email protected] OR from:[email protected]
Suppose instead that you've been asked to release all email to or from any individual at the Smithsonian. That's easy too. Search:
to:smithsonian.edu OR from:smithsonian.edu
In either case, when you get your search results you can use Gmail's label functionality to identify them as pertinent to the records request. Select all the emails in the list, hit the button that looks like a price tag, and apply. You'll need to create the label first, of course. If you got a records request from the New York Times, for example, you could call your label "NYT."
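If you have a long list of correspondents, it can help to generate these search strings rather than type each one. Here is a minimal sketch in Python, assuming you simply want text to paste into the Gmail search box. The function name and the address "pat@example.org" are hypothetical placeholders of mine, not anything Gmail provides.

```python
def correspondence_query(who: str) -> str:
    """Build a Gmail search string for mail sent to or received from `who`.

    `who` can be a full address or a bare domain such as "smithsonian.edu".
    """
    who = who.strip()
    # A search term with embedded spaces would be split into separate terms.
    if not who or any(ch.isspace() for ch in who):
        raise ValueError("expected one address or domain with no spaces")
    return f"to:{who} OR from:{who}"


# "pat@example.org" is a made-up address used only for illustration.
print(correspondence_query("pat@example.org"))
print(correspondence_query("smithsonian.edu"))
# prints:
# to:pat@example.org OR from:pat@example.org
# to:smithsonian.edu OR from:smithsonian.edu
```

Paste each line of output into the search box, then label the results as described above.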
Now consider this curveball. Suppose you and your colleagues Lou and Jo have all been asked to release all email to and from each other. The three of you email all the time, often sending messages to the entire group. That means if the three of you all comply with the requests, there might be as many as three copies of each email -- one in a sent mail folder, and more in received mail folders. Typically you only need to release one copy of each relevant document. How do you avoid redundancy?
The key is to work with sent mail. Email may have millions of recipients but it is always sent from one account. So if you, Lou, and Jo just work from your sent mail folders you'll get a complete record with no redundancy (unless one of you happens to delete sent mail, in which case the others will have to fill in the gaps). So a typical search would take the form:
in:sent to:[email protected]
This gets all the emails you sent to Lou. You attach your label to these. Now, if you do a second search for emails to Jo you will pull up a bunch of emails you already labeled, unless you search like this:
in:sent to:[email protected] NOT label:NYT
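With more than two colleagues, the same pattern repeats: search sent mail for the first person, label the results, then search for each remaining person while skipping anything already labeled. The sketch below just builds that sequence of search strings, assuming you run and label them one at a time in order. The function name and the example.edu addresses are hypothetical.

```python
from typing import Iterable, List


def sent_mail_queries(recipients: Iterable[str], label: str) -> List[str]:
    """Return one Gmail search string per recipient, limited to sent mail.

    The first query is unrestricted. Every later query excludes messages
    that already carry `label`, on the assumption that you labeled the
    results of each earlier search before running the next one.
    """
    queries = []
    for position, address in enumerate(recipients):
        query = f"in:sent to:{address}"
        if position > 0:
            query += f" NOT label:{label}"
        queries.append(query)
    return queries


# Made-up addresses standing in for Lou and Jo.
for q in sent_mail_queries(["lou@example.edu", "jo@example.edu"], "NYT"):
    print(q)
# prints:
# in:sent to:lou@example.edu
# in:sent to:jo@example.edu NOT label:NYT
```

The order matters only in that each search should be labeled before the next one runs; otherwise the exclusion has nothing to exclude.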
Step 2: Send those records up the food chain
OK, you've identified pertinent records. Now what? We've seen already that Gmail makes it pretty straightforward to search for relevant records and attach a label that flags them. Now you need to get them out of Gmail and headed down the path to the requester.
Here's where the cloudHQ products come in handy. These are little plug-ins that interface quickly and easily with Google Chrome and Gmail. And unlike the products of the pharmaceutical industry, each plug-in has a name that tells you what it does. One, "Multi Email Forward for Gmail," is set up to forward batches of emails all at once. A second, "Save Emails to PDF," will take a set of emails you select and place all of them -- and their attachments -- into a .pdf file. If that .pdf file gets big, it's automatically compressed into a .zip file. I've worked more extensively with the latter product, but they are both super handy.
The cloudHQ products don't work instantaneously -- if you've got dozens of emails, it takes a while to convert them -- but you'll find a little button next to your search box in Google Chrome that lets you track progress. You get a notification when your file is ready to download. You grab it, and then you can forward it as an email attachment, upload it to a server, et cetera. Of course, if your goal is to upload to Google Drive, Dropbox, or OneDrive, cloudHQ has separate products for that.
If you have only a small number of emails to release, cloudHQ has free versions that limit the number of emails processed per month. For a low monthly price ($4.99 for Save Emails to PDF), you can remove this limit. And no, I don't get a commission if you pay them.
---
As you may have surmised, I've amassed all this knowledge because I myself am curating some emails for release related to a public records request. Anyone reading this post with additional insights or tips -- particularly for those who aren't blessed with the Gmail/Chrome combo -- please submit a comment. I'll update the post if I come up with any more useful shortcuts!