Usually this is caused by having values in character variables that cannot be transcoded.
When you are running with a single-byte character encoding, what is actually in the character string doesn't matter much, because every byte is treated as one character. But if you try to treat the same string of bytes as if it were UTF-8, you could have a sequence of bytes that does not represent any valid Unicode character.
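
To make that concrete, here is a small sketch in Python (used only for illustration; it is not the tool from the question). It assumes the original data was stored as Latin-1, and the sample value `caf\xe9` is made up for the example. The same four bytes decode fine under the single-byte encoding but are rejected when read as UTF-8.

```python
# Hypothetical sample value: "café" stored in Latin-1, where 0xE9 is "é".
raw = b"caf\xe9"

# In Latin-1 every byte value maps to a character, so decoding always succeeds.
as_latin1 = raw.decode("latin-1")

# In UTF-8, 0xE9 announces a three-byte sequence. Nothing follows it here,
# so the bytes are not valid UTF-8 and decoding raises an error.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Proper transcoding: decode with the encoding the data was really written in,
# then encode to UTF-8. The "é" becomes the two bytes 0xC3 0xA9.
transcoded = as_latin1.encode("utf-8")

print(as_latin1, utf8_ok, transcoded)
```

The point is that the fix depends on knowing the real source encoding. If the bytes are relabeled as UTF-8 without being converted, any character outside plain ASCII can end up as an invalid sequence.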