I recently asked my agency clients for their feedback on offering a rate discount in exchange for the right to use machine translation (MT) and termbase leveraging on certain projects. Machine translation brings up ideas of low-quality output, and cross-client translation memory (TM) sharing is generally off-limits due to confidentiality issues and rights to use content. Machine translation approaches are not without confidentiality and content rights complications, either. Here is a detailed explanation of my thoughts on the approach, as laid out in an email to a client.
Dear [Client] – Thanks for the response and opportunity to explain what I'm thinking. (I'm afraid the following is longer than I expected when I started typing.)
You're right that MT has generally been considered just a cheap way to deliver low-quality output. And I agree that Google Translate and the others are currently useless as resources by themselves on high-quality translation projects. But I think there may be a way that MT can be utilized within a range of high-quality workflows.
MT is now being used on high-quality workflows within certain VERY narrow sub-topics. One of my clients is apparently even making it work for Chinese and Japanese, so Korean is clearly not impossible. For example, the translation of a 500-page cutting robot manual can be used to train an MT engine to do a very good job on another cutting robot manual. Apparently the process breaks down pretty quickly, though. The text of a cutting robot manual may not be a close enough match to train an MT engine to generate high-quality output even on an assembly robot manual.
If we bring the human translator into the process, along with a properly prepared TB, the MT may be able to bring in a few value-added suggestions right away that the professional can finalize efficiently.
In addition, it appears to me that the CAT tools are on the verge of getting MT, TMs and TBs to work together, so that if the software finds, say, an 80% match in the TM, it can identify what's different, replace words from the TB, and then machine translate ONLY the sub-segments that are different. I don't know why this couldn't easily start moving fuzzy matches up 10-20% right off the bat. And there are various algorithms for estimating how good the MT match is likely to be. That means the segments that score high enough could be removed from the translation step and only included for proofreading (i.e., post-edited after MT, but to perfection, not to the conventional "good enough" level). In this way, the translator is still doing his/her job on the segments where the MT/TM/TB combination doesn't get to the required threshold, the entire document is still proofread and/or linguistically QA'd, and the final product is possibly of a higher quality and consistency than otherwise. Over time, this approach should yield increasingly higher efficiency.
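To make that routing idea concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not how any particular CAT tool works: the function names, the whitespace tokenization (which would not suit Korean, Chinese or Japanese as-is), the 0.8 fuzzy floor and the sample data are all hypothetical, and the actual MT call and MT quality estimate are left out. It finds the closest TM entry, works out which words differ, checks the TB for them, and sends a segment to proofreading only when the TB covers every difference.

```python
from difflib import SequenceMatcher


def best_tm_match(source, tm):
    """Return (score, tm_source, tm_target) for the closest TM entry, or None."""
    best = None
    for tm_source, tm_target in tm:
        # Word-level similarity between the new segment and the TM source.
        score = SequenceMatcher(None, tm_source.split(), source.split()).ratio()
        if best is None or score > best[0]:
            best = (score, tm_source, tm_target)
    return best


def differing_tokens(source, tm_source):
    """Words in the new segment that replace or extend the TM source."""
    old, new = tm_source.split(), source.split()
    changed = []
    for tag, _i1, _i2, j1, j2 in SequenceMatcher(None, old, new).get_opcodes():
        if tag in ("replace", "insert"):
            changed.extend(new[j1:j2])
    return changed


def route_segment(source, tm, termbase, fuzzy_floor=0.8):
    """Decide whether a segment can skip translation (hypothetical rule)."""
    match = best_tm_match(source, tm)
    if match is None or match[0] < fuzzy_floor:
        # No usable fuzzy match: the translator handles it from scratch.
        score = match[0] if match else 0.0
        return {"route": "translate", "score": score, "from_tb": {}, "needs_mt": []}
    score, tm_source, tm_target = match
    changed = differing_tokens(source, tm_source)
    from_tb = {word: termbase[word] for word in changed if word in termbase}
    needs_mt = [word for word in changed if word not in termbase]
    # Assumption: if the TB covers every difference, proofreading is enough.
    route = "proofread" if not needs_mt else "translate"
    return {
        "route": route,
        "score": score,
        "tm_target": tm_target,
        "from_tb": from_tb,
        "needs_mt": needs_mt,
    }


# Placeholder data; the targets are dummy strings, not real translations.
tm = [("Replace the cutting blade before use", "TARGET-A")]
termbase = {"welding": "TB-welding"}

covered = route_segment("Replace the welding blade before use", tm, termbase)
partial = route_segment("Replace the grinding blade before use", tm, termbase)
no_match = route_segment("Turn off the power", tm, termbase)
```

In this sketch, a word left in `needs_mt` is where an MT engine would be asked for a sub-segment suggestion, and a quality estimate on that suggestion could then decide whether the segment is promoted to proofreading after all.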
There can be no doubt that this is the direction things are moving in our industry and I would like to start experimenting with it now. However, plugging MT functionality into my CAT tool generally requires special client permission, and so I've never done it, not even once. And leveraging the TM from one job to build up seed TMs in various fields, which I could then apply to other clients' projects, is also out of the question without specific approval.
Therefore, my idea is to start by offering a penny discount on projects where the client gives me, in effect, an "indefinite, irrevocable right to use" their content for such an approach. It would never involve revealing full coherent documents in public, but would mean that the segments/TB entries would go into various reference TMs/TBs/corpora to be applied to other projects and/or that the translations we do with that content may be used to train MT engines which may also be used on other projects.
I would then watch where things go from there. If I'm just hitting dead-ends and this content isn't useful to my bottom line and doesn't look like it will in the future, I could stop offering the discount. On the other hand, if the approach works, I could even increase the discounts over time.
I see it as a long-term approach. About ten years ago, I was worried that the technology would eventually replace me; now I'm cautiously taking the position that the technology is creating more opportunities than it is killing. For example, I remember Google AdWords about 10 years ago… It was doable as a layperson. Today, Google has built in so much complexity that I can't even find an expert who can do it properly and affordably. You would have thought that by now the process would have been automated, but it's gone in the opposite direction, and it changes so fast that one cannot keep up with it without investing huge amounts of time in continuous learning.
I suspect translation is also going to unfold this way. At memoQfest in May, I realized that some of the approaches I use in translation with my team are unique, but that the software is changing so fast that I can't hope to use all the functionality that's available. Learning the software, figuring out how to apply it, and then continuously updating those approaches to the changing landscape is a process that creates barriers to entry which are likely greater than they've ever been. You may have noticed that late last year I updated my email footer greeting to say "translation technologist". That's still more of a "hopeful" title than a reality, but I also see a new role opening up even on the freelance side of things, which is the role that bridges project managers and translators. Project managers rarely have the time or inclination to really extract all the value and ensure all the quality that exists in the project stages between end-client and translator, especially if translators aren't using CAT tools. And this is even before the MT/TM/TB combination hits its stride. With the right skills and tools, a translation technologist could help achieve all kinds of benefits in the production chain. This part of my thinking is still in development, but being able to use MT and leverage TMs on a cross-client basis is surely a place to start.
Let me know your thoughts on this.