Multilanguage support: lead language & automatic translation
I know that this is somewhat over the top at the moment. But I am wondering if you ever thought about merging different languages on one platform via automatic translation. DeepL does an amazing job ... unfortunately the API is not available for free (not even for FOSS). I recently found a setting in my Outlook account which offers automatic translation for my mail. So, definitely too early for implementation today for FOSS (I think). But maybe it could make sense to think ahead, keep it in mind and arrange the overall structure accordingly in advance.
Just an idea.
- 4 replies
- KajMagnus @KajMagnus 2020-01-19 15:11:55.123Z
B.t.w. yes, auto translation is something I think would be nice. I've actually thought a bit about that previously, and ... There's for example Google's translation API: https://cloud.google.com/translate/pricing (seems to cost $20 per million characters). Maybe that's not too expensive? Especially if only a few people write in a different language.
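To get a feeling for what $20 per million characters means in practice, here is a rough back-of-the-envelope sketch. The price is the one quoted above and may have changed since; the forum size (500 foreign-language posts a month, about 600 characters each) is purely a made-up assumption.

```python
from decimal import Decimal

# Price as quoted in this thread; check the pricing page for current rates.
PRICE_PER_MILLION_CHARS_USD = Decimal("20")


def translation_cost_usd(num_posts: int, avg_chars_per_post: int) -> Decimal:
    """Estimated cost of translating each post once, into one target language."""
    total_chars = num_posts * avg_chars_per_post
    return PRICE_PER_MILLION_CHARS_USD * total_chars / Decimal(1_000_000)


# Hypothetical forum: 500 foreign-language posts per month, ~600 characters each.
monthly_cost = translation_cost_usd(500, 600)
print(f"About ${monthly_cost} per month")
```

With those assumed numbers, 300,000 characters a month would come to about $6.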
I'm thinking it'd probably be good if a translation service was *not* bundled with Talkyard. Instead one would connect to some external translation service. Maybe one's own open source one, or Google's, or something else.
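One way to keep the translation service outside of Talkyard could be a small pluggable interface, so that any external service can be connected later. This is only a sketch of the idea, not how Talkyard works: `Translator`, `FakeTranslator` and `translate_if_needed` are hypothetical names, and the fake translator just tags the text so the example runs without any network access.

```python
from typing import Optional, Protocol


class Translator(Protocol):
    """Anything that can translate text; a real one would call an external API."""

    def translate(self, text: str, from_lang: str, to_lang: str) -> str:
        ...


class FakeTranslator:
    """Stand-in for an external service, for illustration and tests only."""

    def translate(self, text: str, from_lang: str, to_lang: str) -> str:
        return f"[{from_lang}->{to_lang}] {text}"


def translate_if_needed(
    text: str,
    post_lang: str,
    forum_lang: str,
    translator: Optional[Translator],
) -> Optional[str]:
    """Return a translation into the forum language, or None if not applicable."""
    if translator is None or post_lang == forum_lang:
        # No service configured, or the post is already in the forum language.
        return None
    return translator.translate(text, post_lang, forum_lang)


print(translate_if_needed("Hej", "sv", "en", FakeTranslator()))
print(translate_if_needed("Hello", "en", "en", FakeTranslator()))
print(translate_if_needed("Hej", "sv", "en", None))
```

With no translator configured, nothing is translated, which matches the "toggle a switch" idea: the feature stays off until some service is plugged in.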
- In reply to markymark: KajMagnus @KajMagnus 2020-01-19 15:04:02.852Z, 2020-01-19 23:16:13.145Z
How could this work? There'd be a forum main language, say, English. Then someone posts a reply in, say, Arabic. Talkyard then asks a translation API to translate it to English, and saves both the Arabic and English versions of the reply? And people would have their individual language preferences (maybe by looking at the `Accept-Language` header). If one's language preference is anything other than Arabic, then the English version is shown, together with an info message that the comment has been auto translated, and a button to show the text in the original language instead?
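The selection logic described above could look roughly like the following. It is a simplified sketch: `parse_accept_language`, `choose_version` and `PostView` are hypothetical names, and the header parsing only looks at primary language subtags and q-values, which is less than what a full HTTP implementation would do.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PostView:
    text: str
    lang: str
    auto_translated: bool  # True: show the info message and a "show original" button


def parse_accept_language(header: str) -> List[str]:
    """Return primary language subtags, most preferred first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        quality = 1.0  # default weight when no q-value is given
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        primary = tag.strip().split("-")[0].lower()
        if primary and primary != "*" and quality > 0:
            weighted.append((-quality, position, primary))
    weighted.sort()
    langs: List[str] = []
    for _, _, lang in weighted:
        if lang not in langs:
            langs.append(lang)
    return langs


def choose_version(original_text: str, original_lang: str, translated_text: str,
                   forum_lang: str, accept_language: str) -> PostView:
    """Show the original to readers who list its language, else the translation."""
    prefs = parse_accept_language(accept_language)
    if original_lang == forum_lang or original_lang in prefs:
        return PostView(original_text, original_lang, auto_translated=False)
    return PostView(translated_text, forum_lang, auto_translated=True)


arabic, english = "مرحبا", "Hello"
print(choose_version(arabic, "ar", english, "en", "ar-EG,ar;q=0.9,en;q=0.5"))
print(choose_version(arabic, "ar", english, "en", "sv-SE,sv;q=0.8,en;q=0.5"))
```

The first reader lists Arabic and gets the original text; the second gets the English version with `auto_translated` set, which is where the info message and the "show original" button would come in.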
A problem with this could be that the translation APIs get better and better, and a few years later, the translation stored in the database could be quite lousy in comparison to a newly generated translation.
O.t.o.h., if translations aren't stored in the database, then people would need to manually click a Translate button on each "foreign language" comment they wanted to read.
And if refreshing the translations maybe once a year [edit: only lazily, as needed], so they become more and more accurate as the years go by, then maybe people would be confused about the English text slightly changing every now and then. Or maybe they wouldn't notice this or think about it, or maybe they'd just think this was a good thing, if there's some info text that explains how this works.
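The lazy refresh idea could be sketched like this: a stored translation is reused as long as it is younger than some maximum age, and is only regenerated when someone actually requests an old one. The one-year limit comes from the text above; `StoredTranslation` and `get_or_refresh` are hypothetical names, and `translate` stands for whatever external service is connected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

MAX_TRANSLATION_AGE = timedelta(days=365)  # "maybe once a year"


@dataclass
class StoredTranslation:
    result_text: str
    transl_at: datetime


def get_or_refresh(
    stored: Optional[StoredTranslation],
    source_text: str,
    translate: Callable[[str], str],
    now: datetime,
) -> StoredTranslation:
    """Reuse a fresh enough translation; otherwise translate again, on demand."""
    if stored is not None and now - stored.transl_at <= MAX_TRANSLATION_AGE:
        return stored
    # Missing or too old: only now do we call the translation service.
    return StoredTranslation(result_text=translate(source_text), transl_at=now)


def better_translate(text: str) -> str:
    """Stand-in for a newer, improved translation service."""
    return f"improved translation of: {text}"


old = StoredTranslation("lousy translation", datetime(2020, 1, 19, tzinfo=timezone.utc))

half_year_later = datetime(2020, 7, 1, tzinfo=timezone.utc)
print(get_or_refresh(old, "source", better_translate, half_year_later).result_text)

two_years_later = datetime(2022, 1, 19, tzinfo=timezone.utc)
print(get_or_refresh(old, "source", better_translate, two_years_later).result_text)
```

Posts nobody reads are never re-translated this way, which keeps API costs down compared to refreshing everything on a schedule.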
What do you think?
Well, I have not thought too much about an actual implementation so far. I would say the translation could be static (done only once, at the time of the post). So, there would be nothing like a "current" translation that might be misleading. Automatic translation (today) always has the risk of being slightly off track. But it will always be better than nothing if you would like to communicate in a foreign language that you are not familiar with. After all, this is what a lot of people do today ... copy from board ... paste to DeepL ... write ... copy from DeepL ... paste to board. Runs like a charm as long as you are not into ancient literature or heart surgery ;-)
Regarding the costs of the service, I might be wrong, but I think a pretty good translation will be available for free in a few months ... at least for open source projects. I am pretty sure you have a lot of things on your 'to do' list ... it was just the idea to keep this thing in mind if you think about backend structure. So that maybe you just have to "toggle a switch" if a proper service is available.
Really appreciate your effort. Much better attitude than Stackwhatsoever ...
- KajMagnus @KajMagnus 2020-03-01 20:51:08.411Z
Hi Mark, now there's a related discussion in another topic,
about many (well, 3) languages in the same discussion:
They'll write in English and Hebrew. To make that work, Talkyard will need to remember a bit about what language each post is written in, at least whether the language is RTL (right-to-left) or LTR (left-to-right).
This could become a small starting point for, slowly and step by step, making Talkyard handle many languages in the same discussion, and auto translating between them.
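As a small illustration of the RTL/LTR part: if the language of a post isn't known, the direction can be guessed from the text itself by looking at the first character with a strong direction. This is a common heuristic, not something the thread says Talkyard does, and `text_direction` is a hypothetical name. It uses Python's standard `unicodedata.bidirectional`, which returns the Unicode bidirectional class of a character.

```python
import unicodedata


def text_direction(text: str) -> str:
    """Guess "rtl" or "ltr" from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-style and Arabic-style letters
            return "rtl"
        if bidi_class == "L":  # Latin, Cyrillic, Greek, and so on
            return "ltr"
    # Only digits, spaces or punctuation: fall back to left-to-right.
    return "ltr"


print(text_direction("Hello world"))
print(text_direction("שלום עולם"))
print(text_direction("123 مرحبا"))
```

The result could be stored with the post, or used to set the `dir` attribute on the HTML element that shows it.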
Automatic translation [...] always be better than nothing if you would like to communicate in a foreign language that you are not familiar with
Yes indeed. I was in places where I didn't speak the languages or understand the letters, and Google Translate worked okay, at least for simple things like understanding what an event I found on Facebook was about.
I think a pretty good translation will be available for free in a few months
This makes me wonder if you know the people who are developing it? Or if you're developing it?
It'd be interesting to have a look, once a website (or GitHub repo?) is available.
keep this thing in mind if you think about backend structure
Hmm. Maybe there'd be a `post_translations` database table, with columns:

    post_id,
    post_revision_nr,        -- there's edit history & revisions
    from_lang,
    to_lang,
    transl_at,               -- if years old, maybe want to re-translate (lazily / on-demand only)
    transl_api_request_url,  -- so one knows what server did the translation
    transl_method_name,      -- e.g. GNMT = Google Neural Machine Translation
    source_text,
    result_text
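To make the column list a bit more concrete, here is a runnable sketch using Python's built-in `sqlite3` module with an in-memory database. Only the column names come from the list above; the column types, the `NOT NULL` constraints, the primary key, the sample row and the example.com URL are all assumptions for illustration, and a real Talkyard table would likely live in its own database and look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE post_translations (
        post_id INTEGER NOT NULL,
        post_revision_nr INTEGER NOT NULL,  -- there's edit history & revisions
        from_lang TEXT NOT NULL,
        to_lang TEXT NOT NULL,
        transl_at TEXT NOT NULL,            -- ISO 8601 timestamp (assumed format)
        transl_api_request_url TEXT,        -- which server did the translation
        transl_method_name TEXT,            -- e.g. GNMT
        source_text TEXT NOT NULL,
        result_text TEXT NOT NULL,
        -- Assumed key: one translation per post revision and target language.
        PRIMARY KEY (post_id, post_revision_nr, to_lang)
    )
""")

conn.execute(
    "INSERT INTO post_translations VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (123, 1, "ar", "en", "2020-03-01T20:51:08Z",
     "https://translation.example.com/translate",  # hypothetical URL
     "GNMT", "مرحبا", "Hello"),
)

row = conn.execute(
    "SELECT result_text, transl_method_name, transl_at FROM post_translations "
    "WHERE post_id = ? AND post_revision_nr = ? AND to_lang = ?",
    (123, 1, "en"),
).fetchone()
print(row)
```

Keying on the revision number means an edited post gets its own translation row, and `transl_at` is what a lazy re-translation check would look at.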