Eventually /dev/null

Random thoughts from an emerging Programmer

Be Careful With Approving Comments

Have you ever received comments you don’t understand because they’re written in a language you can’t read? Have you ever wondered why someone would comment in Russian on a post written in, say, English, German or French? (I’m not talking about comments on a .de, .fr or .ru blog written in the matching language, where the blog owner speaks that language too. There a commenter may reasonably prefer their native language, since knowing it better helps avoid misunderstandings.)
What could be the reason for this behaviour? These commenters should be able to write some basic text in English, German or French, since they apparently read the post well enough to add their own comment or opinion. So why don’t they?
Let’s try to find out. Below I’ll show how I handle comments in languages I don’t understand, using an example I received recently:

The Comment

So let’s take a look at the comment in question (to avoid giving this spammer/hacker any publicity, I replaced some of the data):
1000 форумов 2 доллара 5000 форумов 8 долларов 10000 форумов 13 долларов 50000 форумов 50 долларов
Бонус предложение для тех кто закажет 20000 форумов через неделю повторная отправка
Рефпредложение: человек который приведет мне клиента будет получать 10% от заказа клиенка!!!
Обращаться в асю 3пять3-8шесть7-0ноль1 мыло mymail(гав)example.com

This one made me curious: it contains quite a few numbers, plus an extra email address that doesn’t match the one entered in the Email field. Let’s check the comment by translating it into our native language, or some other language we understand.
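The email mismatch can also be checked mechanically. Here is a minimal sketch in Python, assuming the comment body and the Email field are available as plain strings. The function names and the list of "@" replacements are my own assumptions for illustration, not part of any blog software; "(гав)" is included only because it shows up in the comment above.

```python
import re

# Email-like strings, including a few obfuscated stand-ins for "@".
# The stand-ins listed here are an assumption; spammers use many more.
EMAIL_LIKE = re.compile(
    r"([\w.+-]+)\s*(?:@|\((?:at|гав)\)|\[at\])\s*([\w-]+(?:\.[\w-]+)+)",
    re.IGNORECASE,
)

def emails_in_body(body: str) -> set:
    """Return the normalised email-like addresses found in a comment body."""
    return {f"{local}@{domain}".lower() for local, domain in EMAIL_LIKE.findall(body)}

def has_foreign_contact(body: str, email_field: str) -> bool:
    """True when the body advertises an address other than the submitted one."""
    return bool(emails_in_body(body) - {email_field.strip().lower()})

# someone@example.org is a hypothetical value for the Email field.
print(has_foreign_contact("мыло mymail(гав)example.com", "someone@example.org"))
```

A hit doesn’t prove that a comment is spam; it is just one more reason to look at it closely.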

Translating the Comment

In general I recommend translating into your native language, since that is usually the one you understand best. On the other hand, the translator you choose may not offer your language (or its dictionary may be quite limited), or it may not be able to translate between the two languages directly.
Avoid having the text translated more than once before it reaches a language you understand. The general problem with automatic translation is that sentence structure and word choice are rarely the best, so after two or three translation steps you can end up with nonsense (which wouldn’t be an improvement). The best way is probably to translate to English and, if you don’t understand some of the English words, have just those translated into your native language.
For the example above we would get something like this:
1000 forums 2 dollars of 5000 forums of 8 dollars of 10000 forums of 13 dollars of 50000 forums of 50 dollars the Bonus the offer for those who will order 20000 forums in a week repeated sending Рефпредложение: the person which will result to me the client will receive 10 % from the order клиенка!!! To address in асю 3пять3-8шесть7-0ноль1 soap mymail (гав) example.com
That makes a lot more sense now, doesn’t it? It looks like a price list for forum spamming, and we can even see that we’d get 10% of the profit from something!

Translate unknown words

Now we’re fairly sure this is a spam comment, but as you can see, some words were left untranslated, like клиенка. Such words can sometimes be important, so let’s have them translated as well. Don’t we want to know how to get our 10%?
If you’re using a good translator, it should have an option to transliterate unknown words into the target language. For our клиенка we would get something like klienka, which sounds like "client". So let’s guess that we receive 10% of the money the client pays for his contract.
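If your translator has no such option, a rough transliteration is easy to do yourself. The following is a minimal sketch in Python, assuming a simple letter-by-letter table for Russian. The table and the function name are my own illustration; real schemes such as ISO 9 or BGN/PCGN differ in the details.

```python
# Hypothetical, deliberately small Cyrillic-to-Latin table for Russian.
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}

def transliterate(text: str) -> str:
    """Replace Russian Cyrillic letters with rough Latin equivalents."""
    out = []
    for ch in text:
        latin = CYRILLIC_TO_LATIN.get(ch.lower())
        if latin is None:
            out.append(ch)  # digits, punctuation and Latin letters stay as they are
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)

print(transliterate("клиенка"))  # prints "klienka"
```

The result is not a translation, but it is often enough to recognise loanwords and names.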

Deciding: drop or keep?

Now you should have enough information to decide whether it’s a spam comment or a legit one. If it’s spammy, the decision shouldn’t be hard. If it’s a legit comment, I advise keeping the original comment and adding the translation below it. If you like, you can also improve the comment, but note explicitly where you made changes!

Some good online Translators

Where can I get my text translated into another language?
Just search for "translate" or "translation" together with the source and target languages, and you should find something useful. A good translator is PROMT, which translates whole texts (not just word by word) between some languages; another is Babelfish. If you only need single words translated into your language, search for a dictionary for that language pair.

Conclusion

As you can see, it’s worth checking comments in other languages for spam as well (they often get past the spam filter). If you can’t get a comment translated, it’s usually better to hold it back or drop it. From my point of view, it’s better to have one or two legit comments fewer than one spammy comment.
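To make sure such comments actually reach you for checking, you could hold back every comment whose script differs from the one your blog is written in. Below is a minimal sketch in Python, assuming a blog written in a Latin-script language. The function names and the 0.5 threshold are my own assumptions, and the check only looks at the script: it cannot tell Russian from Ukrainian, or English from German.

```python
import unicodedata

def script_share(text: str, script_prefix: str) -> float:
    """Fraction of the letters in `text` whose Unicode name starts with `script_prefix`."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    hits = sum(1 for ch in letters if unicodedata.name(ch, "").startswith(script_prefix))
    return hits / len(letters)

def needs_manual_review(comment: str, blog_script: str = "LATIN",
                        threshold: float = 0.5) -> bool:
    """Hold a comment when fewer than `threshold` of its letters use the blog's script.

    A comment without any letters (only digits or symbols) is held as well.
    """
    return script_share(comment, blog_script) < threshold

print(needs_manual_review("1000 форумов 2 доллара"))  # prints True
print(needs_manual_review("Nice post, thanks!"))      # prints False
```

A held comment is not deleted; it just waits until you have translated it and decided.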