When forwarding supplier invoice emails that contain the invoice in the HTML body to Receipt Hub, I have noticed that some are imported with extra characters, as in the image below.
Presumably this is some sort of character encoding mismatch - but is there any way to easily resolve this?
Separately, getting invoice emails to forward automatically is a challenge because:
(a) Gmail wants verification of a forwarding address
(b) Even if verified, forwarded messages appear from the original sender address.
I have got around this by using Zapier to trigger when an email is labelled, and then sending a new email with the body and/or attachment to the receipt hub address.
To get around the odd character problem above, I have two versions of the Zap: one that simply forwards emails that contain attachments, and one that converts the HTML body to a PDF on the fly and then emails that into QuickFile.
In Gmail I have filters that automatically apply the labels to regular supplier invoices, and for one-off invoices all that is required to kick the process off is to label the email.
There are various custom methods you could use. I assume you want to do it all in Gmail/on the web? How automated do you want it? You could just print the email to PDF, save to Drive/Dropbox and then have that import to receipt hub, or have Gmail save attachments to Drive and have the attachments imported?
I use Thunderbird, and on another account I have a plugin that scans the emails in the receipts folder and automatically saves attachments to Dropbox, which is linked with QuickFile.
I always go back to suppliers that send out embedded invoices and ask for an actual proper invoice in a standard format, or at least not encoded/formatted with loads of random bits.
The automation I’ve built makes it completely hands-off and handles embedded invoices by converting them to PDF on the fly. I just wondered if I was overcomplicating it.
Not really. As you’ve correctly surmised, the issue is down to a character encoding mismatch - the content is UTF-8 (where the Unicode character U+00A0 no-break space is represented by the two bytes C2 A0) but your browser is rendering it as ISO-8859-1 or Windows-1252 or another similar single-byte encoding (where C2 becomes Â). I get exactly the same problem forwarding UTF-8 emails that include the £ sign (bytes C2 A3), which renders as “Â£”.
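If it helps to see the mismatch in action, here is a minimal Python sketch that reproduces it. The sample string is made up for illustration, and I'm using Windows-1252 (`cp1252`) as the stand-in for the wrong single-byte encoding.

```python
# Illustrative sample text: a no-break space (U+00A0) followed by a pound sign.
text = "Total:\u00a0£10"

# Encoding as UTF-8 turns U+00A0 into bytes C2 A0 and £ into bytes C2 A3.
raw = text.encode("utf-8")

# Reading those bytes as Windows-1252 treats every byte as its own character,
# so each C2 byte shows up as a stray "Â".
wrong = raw.decode("cp1252")

# If nothing else has altered the text, reversing the wrong step recovers it.
repaired = wrong.encode("cp1252").decode("utf-8")

print(repr(wrong))
print(repr(repaired))
```

The round trip in `repaired` only works when the garbled text has not been modified further, so it is a diagnostic aid rather than a general fix.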
In your email client the character encoding is provided by the Content-Type declaration in the MIME headers, but by the time it gets to the receipt hub all QuickFile has to go on is the email body, which doesn’t tell you the encoding unless there happens to be a relevant meta tag in the HTML content (and even then that may not be consistent with the actual content).
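To illustrate where that encoding information lives, here is a rough Python sketch using the standard library `email` package. The message, sender address and body are invented for the example, and this says nothing about how QuickFile actually processes incoming mail; it only shows that the charset sits in the MIME header rather than in the body bytes.

```python
from email import message_from_bytes
from email.policy import default

# Hypothetical single-part HTML email; the body bytes are UTF-8.
raw_message = (
    b"From: supplier@example.com\r\n"
    b"Subject: Invoice\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"<p>Total:\xc2\xa0\xc2\xa310</p>\r\n"
)

msg = message_from_bytes(raw_message, policy=default)

# Pick the HTML part (for a single-part message this is the message itself).
part = msg.get_body(preferencelist=("html",))

# The charset comes from the Content-Type header, not from the body.
charset = part.get_content_charset()

# The body on its own is just bytes; it needs the charset to be read correctly.
body_bytes = part.get_payload(decode=True)
html = body_bytes.decode(charset or "utf-8")

print(charset)
print(repr(html))
```

Once the body bytes are separated from the header, anything that later serves them has to guess the encoding, which is where the stray characters come from.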
Depending what system QuickFile use to store files for download, they might be able to implement something a bit more intelligent in the logic that adds files to let it store a full Content-Type to use when serving each file (they used to use Amazon S3 where you can give the MIME type as part of the metadata when you do an upload, and the file will be served back to you with the same MIME type, but they switched away from that a while back and I’m not sure what they’ve moved over to). That could then be populated from the Content-Type of a file upload, an explicit parameter in the API, or the MIME part headers for files sent in by email. But the way it works at the moment with QuickFile not storing the correct text/html; charset=UTF-8 in the first place, no, there’s no general way to fix it other than to render the HTML to PDF yourself or somehow munge the HTML to insert a