Several years ago I posted an article about why I don't generally offer per-word rates for Korean>English translation. The following is from a recent email to a client, explaining things in a bit more detail.
- Most of the KO>EN work I get arrives as scanned source files in PDF format, which can't be analyzed precisely until the translation is complete. Sometimes these PDFs can be converted to Word through OCR or the native Adobe Acrobat conversion, but for various reasons the resulting word counts are extremely unreliable. On those jobs, a fixed quote in advance or billing by target word count is the most reasonable approach.
- Even if the files are editable, I find that it takes an extra measure of care to ensure everyone's talking about the same thing when referring to Korean words/characters. To make matters worse, if the language settings in Word aren't set right, the software will count Korean words as characters (or vice versa, I can't remember which right now) and that creates confusion. At least until a few years ago, Excel also didn't count Korean words and characters correctly.
- Korean does not have a long tradition of separating words with spaces (or even of writing left-to-right), and I find that Koreans are not as consistent in their use of spacing as we are in English. As a result, different writing styles yield different Korean word counts, even as the final English word count remains unchanged. Furthermore, when clients equate Korean with Chinese and Japanese, which don't put spaces between words, it adds another layer of confusion. Your colleague mentioned that internally you are assuming two Korean characters to be one word, but that ratio is arbitrary. Korean words are discrete units of meaning separated by spaces, and they are counted on that basis.
- Different types of content return different word count expansions. For example, Korean word lists translate to English at almost one for one. However, because Korean grammar attaches tags to words, and those tags are then translated into English as separate words, the expansion rate increases the more "prose-y" a text is. The expansion also varies with subject matter.
- As with the current job, many Korean writers, especially in technical documents, mix a lot of English words into the text. These are embedded in the Korean grammar, though, and can't be excluded from the word count. However, if the letters of the English words are counted as characters (which is what happens if they are not analyzed separately), it runs the count way up. On today's job, there were 1,500 English words mixed in with some 4,000 Korean words. That means rejigging the word counting formula to avoid overcharging. Counting source characters also means having to do something extra with numbers, since they also run up the count.
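
To make the mixed-text problem concrete, here is a rough Python sketch of how the counts diverge. It is not the formula I actually bill with. The function name `analyze`, the category names, and the rule that any space-separated token containing Hangul counts as one Korean word (even with English letters attached) are all assumptions made for illustration.

```python
import re

# Precomposed Hangul syllables only; an assumption that ignores
# standalone jamo and Hanja, which real source files may contain.
HANGUL = re.compile(r"[\uac00-\ud7a3]")
LATIN = re.compile(r"[A-Za-z]")


def analyze(text):
    """Classify space-separated tokens and tally character counts.

    Hypothetical helper: a token containing any Hangul is one Korean
    word, a token with Latin letters but no Hangul is one English
    word, and a token with digits but no letters is a number.
    """
    counts = {"korean_words": 0, "english_words": 0, "numbers": 0, "other": 0}
    for token in text.split():
        if HANGUL.search(token):
            counts["korean_words"] += 1
        elif LATIN.search(token):
            counts["english_words"] += 1
        elif any(ch.isdigit() for ch in token):
            counts["numbers"] += 1
        else:
            counts["other"] += 1
    # Character-based figures, for comparison with the word-based ones.
    counts["hangul_chars"] = len(HANGUL.findall(text))
    counts["all_chars_no_spaces"] = sum(1 for ch in text if not ch.isspace())
    return counts


sample = "API 서버를 3 대 배포한다"
print(analyze(sample))
```

In this five-token sample, a naive count of every non-space character gives 12, while only 8 of those characters are Hangul. The other four are the letters of one English word and one digit, which is the inflation described in the last bullet. Counting the English words and numbers separately is one way to keep them in the total without letting their individual letters and digits run it up.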