It turns out the problem lies with the PDF conversion software we are using. I have contacted the company that supplies this component to see whether there is a setting we can enable so that these characters are rendered correctly. Failing that, we will fall back on the substitution method you suggested in your original post.