Code suggestion for adding translations

Today's developer telco has spawned a discussion about how more translations should be added to the existing translations hash (code for this is contained in each Language-module). Two solutions have been discussed: 1. sub Data { my $Self = shift; $Self->{Translation} = { %{$Self->{Translation}}, 'green' => 'grün', 'yellow' => 'gelb', 'Being blue is not a lot of fun' => 'Blau sein macht im Deutschen schon mehr Spaß', }; return 1; } 2. sub Data { my $Self = shift; $Self->{Translation}->{green} = 'grün'; $Self->{Translation}->{yellow} = 'gelb'; $Self->{Translation}->{Being blue is not a lot of fun} = 'Blau sein macht im Deutschen schon mehr Spaß'; return 1; } Solution 1 is (slightly) easier to read, but it is slower and requires more memory, since it copies the hash with the existing translations into a new hash. This copying does not scale well if there are a lot of translation modules being loaded (each one copies the hash, which is getting bigger and bigger). In the telco, I had suggested a third way that is as easy to read and (as I thought at the time ;-) does not suffer from bad performance: 3. sub Data { my $Self = shift; $Self->AddTranslations( { 'green' => 'grün', 'yellow' => 'gelb', 'Being blue is not a lot of fun' => 'Blau sein macht im Deutschen schon mehr Spaß', } ); return 1; } Where AddTranslations would be implemented in Kernel::Language.pm: sub AddTranslations { my ( $Self, $Translations ) = @_; return if ref $Translations ne 'HASH'; for my $TrKey (keys %$Translations) { $Self->{Translations}->{$TrKey} = $Translations->{$TrKey}; } return; } Benchmarking has shown that solution 2 is in fact unusable for many modules, as the copying of the hash becomes very slow. Solutions 1 and 3 scale ok (i.e. O(n)) but solution 1 is still twice as fast as solution 3 (as the creation of the temporary hash that is being passed into the method costs time). Please have a look at the three solutions and tell me what you think. I personally would favour solution 1, as it is the fastest and IMHO only a tiny bit harder to read than the other two. Now that I have looked at the code in Kernel::Language, I wonder if we should not try to find a way to limit the loading of Language modules to the ones which are actually being used by the current HTTP-request. Please correct me if I'm wrong (and I hope I am), but as far as I can see, we currently load *all* available Language modules in the constructor of Kernel::Language. In an OTRS setup that has many modules installed, this would cause quite a performance hit, wouldn't it? cheers, Oliver

Oliver Tappe (ot@otrs.com) wrote: [...]
Solution 1 is (slightly) easier to read, but it is slower and requires more memory, since it copies the hash with the existing translations into a new hash. This copying does not scale well if there are a lot of translation modules being loaded (each one copies the hash, which is getting bigger and bigger).
[...]
Benchmarking has shown that solution 2 is in fact unusable for many modules, as the copying of the hash becomes very slow. Solutions 1 and 3 scale ok (i.e. O(n)) but solution 1 is still twice as fast as solution 3 (as the creation of the temporary hash that is being passed into the method costs time).
Hi, did you swap solution 1 for solution 2 in the above passage? Solution 1 ought to be slow, whereas #2 doesn't copy any hashes...? I'd like to propose another solution: 4. sub Data { my $Self = shift; my $Language = $Self->{Translation}; return if ref $Language ne 'HASH'; $Language->{green} = 'grün'; $Language->{yellow} = 'gelb'; $Language->{Being blue is not a lot of fun} = 'Blau sein macht im Deutschen schon mehr Spaß'; return 1; }
Now that I have looked at the code in Kernel::Language, I wonder if we should not try to find a way to limit the loading of Language modules to the ones which are actually being used by the current HTTP-request. Please correct me if I'm wrong (and I hope I am), but as far as I can see, we currently load *all* available Language modules in the constructor of Kernel::Language.
Hmm, no, just all modules which match $Self->{UserLanguage}_*.pm. If you use mod_perl for Apache, loading some unused modules shouldn't have that performance impact, I guess (Yes, I know, OTRS doesn't run on Apache only). Regards, Felix

Hi Felix, thanks for your input! Felix J. Ogris schrieb:
Oliver Tappe (ot@otrs.com) wrote:
[...]
Solution 1 is (slightly) easier to read, but it is slower and requires more memory, since it copies the hash with the existing translations into a new hash. This copying does not scale well if there are a lot of translation modules being loaded (each one copies the hash, which is getting bigger and bigger).
[...]
Benchmarking has shown that solution 2 is in fact unusable for many modules, as the copying of the hash becomes very slow. Solutions 1 and 3 scale ok (i.e. O(n)) but solution 1 is still twice as fast as solution 3 (as the creation of the temporary hash that is being passed into the method costs time).
Hi,
did you swap solution 1 for solution 2 in the above passage? Solution 1 ought to be slow, whereas #2 doesn't copy any hashes...?
Yep, sorry about that :-/
I'd like to propose another solution:
4. sub Data { my $Self = shift; my $Language = $Self->{Translation};
return if ref $Language ne 'HASH';
$Language->{green} = 'grün'; $Language->{yellow} = 'gelb'; $Language->{Being blue is not a lot of fun} = 'Blau sein macht im Deutschen schon mehr Spaß';
return 1; }
Yep, surely better than 3 :-) I would prefer renaming $Language to $Translation though, for improved clarity.
Now that I have looked at the code in Kernel::Language, I wonder if we should not try to find a way to limit the loading of Language modules to the ones which are actually being used by the current HTTP-request. Please correct me if I'm wrong (and I hope I am), but as far as I can see, we currently load *all* available Language modules in the constructor of Kernel::Language.
Hmm, no, just all modules which match $Self->{UserLanguage}_*.pm.
Ah, you are right of course. I was not precise enough - what I should have been saying is that OTRS always loads all Language modules matching the current language. So if the user is currently surfing the FAQ-module in German, the de_* files for all other modules would be loaded, too, even if they are not used at all by the current HTTP-request. Then again, the exact performance penalty of this is unclear, especially since many modules provide an icon on the toplevel which will cause that modules language files to be loaded anyway. I suppose in the long run it would be best to keep the translations in a persistent cache, such that they would only be loaded once, not once per request. Implementing this should not be a problem with Apache & ModPerl, but I wonder whether this is possible with IIS? Ah well, something to tinker with on long, boring days ;-)
If you use mod_perl for Apache, loading some unused modules shouldn't have that performance impact, I guess (Yes, I know, OTRS doesn't run on Apache only).
The actual .pm-loading is cached by ModPerl, but not the invocation of Data(), so the copying of the translation still takes place for every request - or am I missing something? cheers, Oliver

Oliver Tappe (ot@otrs.com) wrote:
Ah, you are right of course. I was not precise enough - what I should have been saying is that OTRS always loads all Language modules matching the current language. So if the user is currently surfing the FAQ-module in German, the de_* files for all other modules would be loaded, too, even if they are not used at all by the current HTTP-request.
How are you going to tell Kernel::Language which language subset you need for the current request? Kernel::Language is instanced from Kernel::Output::HTML::Layout. So either Kernel::Language evaluates the current action parameter, or the frontend modules in Kernel/Modules/ provide each call of LayoutObject->Output() with that extra information.
The actual .pm-loading is cached by ModPerl, but not the invocation of Data(), so the copying of the translation still takes place for every request - or am I missing something?
Yes, you are right. But I think loading a .pm is more costly then calling Data(). With Apache/mod_perl, one could load a global language data hash by using the PerlRequire statement (yep, optimization always breaks flexibility). Regards, Felix

Felix J. Ogris schrieb:
Oliver Tappe (ot@otrs.com) wrote:
Ah, you are right of course. I was not precise enough - what I should have been saying is that OTRS always loads all Language modules matching the current language. So if the user is currently surfing the FAQ-module in German, the de_* files for all other modules would be loaded, too, even if they are not used at all by the current HTTP-request.
How are you going to tell Kernel::Language which language subset you need for the current request? Kernel::Language is instanced from Kernel::Output::HTML::Layout. So either Kernel::Language evaluates the current action parameter,
that probably wouldn't be enough, as it would still be possible for other modules' translations to be required (e.g. for toplevel icon names). It would be difficult to find out which ones just by looking at the action parameter.
or the frontend modules in Kernel/Modules/ provide each call of LayoutObject->Output() with that extra information.
Well, kind of. Every invocation of Kernel::Language::Get() would have to provide a context along with the string that's to be translated. This context would be useful for separating multiple occurrences of the same phrase (with different translations), too. As part of that context, the originating module could be specified. But this clearly is future stuff. cheers, Oliver
participants (2)
-
Felix J. Ogris
-
Oliver Tappe