
Difference between revisions of "SMC/AtomicChilluIsUnacceptable"

From FSCI Wiki
(started the document for submitting to UTC)
 

Revision as of 07:02, 28 January 2008

[Working Draft]

Atomic chillus are unacceptable because they destroy the link between a chillu and its base character.

1. The examples used to justify a semantic difference between words separated only by ZWJ are either not found in any dictionary, grammatically wrong, or meaningless without proper context.

a) വന്‍യവനിക/വന്യവനിക (vanYavanika/vanyavanika), കണ്‍വലയം/കണ്വലയം (kanvalayam/kanualayam): contrived examples not found in any dictionary.

b) ആ മനുഷ്യന്‍ കൊടുക്കുന്നു (that man is giving)

ആ മനുഷ്യനു് കൊടുക്കുന്നു (giving to the man)

According to Malayalam grammar, both sentences are incorrect as they stand. Each is complete only when written as follows.

Structure:

ആ മനുഷ്യന്‍ <to whom he gives and what he gives> കൊടുക്കുന്നു
ആ മനുഷ്യനു് <who is giving and what is given> കൊടുക്കുന്നു.

Example:

ആ മനുഷ്യന്‍ (man) പൂച്ചക്ക് (to cat) പാല്‍ (milk) കൊടുക്കുന്നു (That man is giving milk to the cat)
ആ മനുഷ്യനു് (to man) പൂച്ച (cat) പാല്‍ (milk) കൊടുക്കുന്നു. (The cat is giving milk to that man) :-)

The fundamental problem lies in Unicode's practice of considering only representational forms without checking linguistic correctness. Most Indic languages, unlike Latin-script languages, base their collation on linguistic rules. If this is not taken into account, the standard will become a playground for people with vested interests.

2. All these arguments were once considered and rejected by the UTC, and the only new argument in support of atomic chillus is the issue of missing domain names in IDN. The examples given in 1) cannot be considered real, as they are contrived just to make a case for atomic chillus. Even if they were real, the situation is similar to case folding in Latin (you cannot register PenIsland.com and PenisLand.com as two separate sites). How can an already rejected proposal be accepted when the new argument in its support is not only not proved to be real, but also creates a lot of new chaos and security problems?

3. This will create dual encoding and make URL spoofing very easy. The following two URLs look alike but have different Punycode:

http://റാല്‍മിനോവ്.blogspot.com (using chillu joiner sequence)
http://റാൽമിനോവ്.blogspot.com (using atomic chillu)

The existing chillu encoding with joiners is the best solution because all combinations of joiners and non-joiners give exactly the same Punycode.
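The comparison in point 3 can be checked with a short Python sketch. It assumes that the IDNA2003 Nameprep step removes joiners before Punycode encoding (the standard-library `stringprep.in_table_b1` covers the characters "mapped to nothing", ZWJ and ZWNJ among them). The helper names are made up for this illustration, and only the bare label is encoded, without the `xn--` prefix.

```python
import stringprep

ZWJ = "\u200D"

def strip_mapped_to_nothing(label):
    """Drop characters that Nameprep table B.1 maps to nothing (includes ZWJ/ZWNJ)."""
    return "".join(c for c in label if not stringprep.in_table_b1(c))

def punycode(label):
    """Raw Punycode of a label (no 'xn--' prefix, no other IDNA steps)."""
    return label.encode("punycode").decode("ascii")

# The blog label from the two URLs, spelled out as code points.
joiner_form = "\u0D31\u0D3E\u0D32\u0D4D" + ZWJ + "\u0D2E\u0D3F\u0D28\u0D4B\u0D35\u0D4D"
plain_form = joiner_form.replace(ZWJ, "")  # same letters, joiner left out
atomic_form = "\u0D31\u0D3E\u0D7D\u0D2E\u0D3F\u0D28\u0D4B\u0D35\u0D4D"  # U+0D7D CHILLU L

# With and without the joiner: identical after the joiner is mapped to nothing.
joiner_forms_collapse = (
    punycode(strip_mapped_to_nothing(joiner_form))
    == punycode(strip_mapped_to_nothing(plain_form))
)

# Atomic chillu vs. joiner sequence: two distinct labels.
atomic_differs = (
    punycode(strip_mapped_to_nothing(atomic_form))
    != punycode(strip_mapped_to_nothing(joiner_form))
)

print(joiner_forms_collapse, atomic_differs)
```

Under these assumptions the joiner and non-joiner spellings collapse to one label, while the atomic chillu spelling yields a second, visually similar label.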

4. Since the joiners have to be supported for backward compatibility, atomic chillus add unnecessary complexity to all text-processing applications (sorting, searching), which makes them redundant and useless.

5. Why is canonical equivalence to the old sequences not provided?

6. Even after atomic chillus are made part of the standard, many words cannot be written without joiners, which will only increase the chaos. Examples: കൊയ്‌രാള (koirala), സദ്‌വാരം (sadvaram)
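To give an idea of the extra work point 4 refers to, here is a minimal sketch of the folding step a search routine would need once both encodings exist in real text. The table and the function names `fold_chillus` and `contains` are hypothetical, and only five chillus are covered; a real application would also have to apply the same folding in sorting and indexing.

```python
ZWJ = "\u200D"
VIRAMA = "\u0D4D"

# Base consonant -> atomic chillu (illustrative table, five chillus only).
CHILLU_OF = {
    "\u0D23": "\u0D7A",  # NNA -> CHILLU NN
    "\u0D28": "\u0D7B",  # NA  -> CHILLU N
    "\u0D30": "\u0D7C",  # RA  -> CHILLU RR
    "\u0D32": "\u0D7D",  # LA  -> CHILLU L
    "\u0D33": "\u0D7E",  # LLA -> CHILLU LL
}

def fold_chillus(text):
    """Rewrite every consonant + virama + ZWJ sequence as its atomic chillu."""
    for base, atomic in CHILLU_OF.items():
        text = text.replace(base + VIRAMA + ZWJ, atomic)
    return text

def contains(haystack, needle):
    """Substring search that treats both chillu encodings as the same text."""
    return fold_chillus(needle) in fold_chillus(haystack)

word_joiner = "\u0D2A\u0D3E\u0D32\u0D4D" + ZWJ  # "milk", joiner sequence
word_atomic = "\u0D2A\u0D3E\u0D7D"              # "milk", atomic chillu

print(word_atomic in word_joiner)          # naive search misses the word
print(contains(word_joiner, word_atomic))  # folded search finds it
```

Without such a step, a plain substring search treats the two spellings of the same word as unrelated strings.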
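What the missing equivalence in point 5 means in practice can be illustrated with Python's `unicodedata` module, assuming the Unicode data bundled with the interpreter already includes the atomic chillus. The variable and function names are chosen for this sketch only.

```python
import unicodedata

joiner_chillu = "\u0D32\u0D4D\u200D"  # LA + VIRAMA + ZWJ
atomic_chillu = "\u0D7D"              # MALAYALAM LETTER CHILLU L

def equivalent_under(form):
    """True if both spellings normalize to the same string under `form`."""
    return (unicodedata.normalize(form, joiner_chillu)
            == unicodedata.normalize(form, atomic_chillu))

# Check every normalization form offered by the standard library.
results = {form: equivalent_under(form) for form in ("NFC", "NFD", "NFKC", "NFKD")}
print(results)
```

Because no normalization form maps one spelling to the other, software cannot rely on standard normalization to unify them.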

7. Using the virama with chillus is linguistically incorrect. The function of the virama is to produce a vowel-less consonant, so it cannot be applied to a chillu or a pure consonant, because these are already vowel-less forms of the underlying consonants.

I strongly oppose including these characters in the standard. Not only do they fail to solve all the problems with joiners, they also create many new problems, and the need to provide backward compatibility will produce more chaos in encoding chillus.

Praveen A

Swathanthra Malayalam Computing

http://www.smc.org.in