Find the UnicodeBlock set for a given Locale

I'm currently trying to figure out how to get a set of Character.UnicodeBlock values for a given Locale. Different languages need different characters.

What I'm actually trying to achieve is a String containing every character needed to write in a specific language. I can then use this String to precompute a set of OpenGL textures from a TrueType font file, so I can easily write any text in that language.
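To make the idea concrete, here is a minimal sketch of what such a String builder could look like. As far as I know the JDK has no Locale-to-UnicodeBlock lookup, so the `BLOCKS_BY_LANGUAGE` table below is a hypothetical, hand-maintained assumption, as are the class and method names. It only scans the Basic Multilingual Plane and skips control characters, so newline and tab would have to be added separately.

```java
import java.lang.Character.UnicodeBlock;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class GlyphSets {
    // Hypothetical hand-written table: language code -> blocks to pre-render.
    // This mapping is an assumption for illustration, not something the JDK provides.
    private static final Map<String, Set<UnicodeBlock>> BLOCKS_BY_LANGUAGE = new HashMap<>();
    static {
        BLOCKS_BY_LANGUAGE.put("en", blocks(UnicodeBlock.BASIC_LATIN));
        BLOCKS_BY_LANGUAGE.put("fr", blocks(UnicodeBlock.BASIC_LATIN,
                UnicodeBlock.LATIN_1_SUPPLEMENT, UnicodeBlock.LATIN_EXTENDED_A));
        BLOCKS_BY_LANGUAGE.put("ru", blocks(UnicodeBlock.BASIC_LATIN, UnicodeBlock.CYRILLIC));
    }

    private static Set<UnicodeBlock> blocks(UnicodeBlock... bs) {
        return new HashSet<>(Arrays.asList(bs));
    }

    // Returns every defined, non-control BMP character whose block is listed for the locale.
    // Unknown languages fall back to Basic Latin.
    static String charsFor(Locale locale) {
        Set<UnicodeBlock> wanted = BLOCKS_BY_LANGUAGE.getOrDefault(
                locale.getLanguage(), blocks(UnicodeBlock.BASIC_LATIN));
        StringBuilder sb = new StringBuilder();
        for (int cp = 0; cp <= Character.MAX_VALUE; cp++) {
            if (!Character.isDefined(cp) || Character.isISOControl(cp)) {
                continue; // skip unassigned code points and control characters
            }
            if (wanted.contains(UnicodeBlock.of(cp))) {
                sb.appendCodePoint(cp);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String english = charsFor(Locale.ENGLISH);
        String russian = charsFor(Locale.forLanguageTag("ru"));
        System.out.println("en: " + english.length() + " characters");
        System.out.println("ru: " + russian.length() + " characters");
    }
}
```

A block is a coarse unit: it usually contains many characters a given language never uses, and one language may draw on several blocks, so the table would need tuning per language.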

Precaching every single character and having around 1000000 textures is of course not an option.

Does anyone have an idea? Or does anyone see a flaw in this approach?

asked Aug 27, 2011 at 19:08

The problem I see is that text written by humans might contain not only characters that occur in the language in question, but also various symbols from common areas of Unicode (like ℕ, ℝ or other mathematical symbols), or foreign words. -

Some characters are found in every language, like newline, tab, whitespace, etc. Those are, of course, part of the UnicodeBlock set I want to have. If they aren't, I can still add them easily. -

1 Answer

It's not as simple as that. Text in most European languages can often be written with a simple set of precomposed Unicode characters, but for many more complex scripts you need to handle composing characters. This starts fairly easily with combining accents for Western alphabets, progresses through Arabic letters that are context-sensitive (they have different shapes depending on whether they are first, last, or in the middle of a word), and ends with the utter madness that is found in many Indic scripts.
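A small sketch of why a one-texture-per-character cache falls short, using `java.text.Normalizer` from the standard library. The class and helper names are made up for illustration. It shows three things: a precomposed letter can also arrive as base letter plus combining accent, some accented combinations have no precomposed form at all, and the four positional shapes of one Arabic letter all correspond to a single letter in the text.

```java
import java.text.Normalizer;

final class CombiningDemo {
    // Canonical decomposition: splits precomposed letters into base + combining marks.
    static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Canonical composition: recombines where a precomposed character exists.
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Compatibility composition: folds presentation forms back to the plain letter.
    static String baseLetter(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute) decomposes into 'e' followed by U+0301.
        System.out.println(decompose("\u00E9").length());

        // 'q' + U+0301 has no precomposed form, so composing it changes nothing;
        // a renderer has to position the accent over the base glyph itself.
        System.out.println(compose("q\u0301").length());

        // Isolated, final, initial and medial shapes of Arabic BEH (U+FE8F..U+FE92)
        // all fold back to the single letter U+0628 that the text actually stores.
        for (char form : new char[] {'\uFE8F', '\uFE90', '\uFE91', '\uFE92'}) {
            char base = baseLetter(String.valueOf(form)).charAt(0);
            System.out.printf("U+%04X -> U+%04X%n", (int) form, (int) base);
        }
    }
}
```

Normalization only tells you which code points are in the text. Choosing the right shape and placing the marks is the job of a text shaping engine, which is the specialized part.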

The Unicode Standard has chapters about the intricacies involved in rendering the various scripts it can encode. Just sample, for example, the description of Tibetan early in chapter 10, and if that doesn't scare you away, flip back to Devanagari in chapter 9. You will quickly drop your ambition of being able to "write text in any language". Doing so correctly requires specialized rendering software, written by experts deeply familiar with the scripts in question.

answered Aug 28, 2011 at 00:08

That's actually what I was afraid of. I guess 256 chars will be more than enough then. Thanks! - Klems

I'm still trying to find a way to do it, though. Not "every language", but at least the most-used locales. I'm obviously not going to use Hebrew or Braille. - Klems

It's still not clear to me what "do it" is, actually. Is there some reason you cannot use your OS-supplied ways of displaying text? Those would be the most likely to have an acceptable renderer for the user's language. - hmakholm left over Monica

I'm creating a game using OpenGL and I want to display text on the viewport. In order to display text in-game, I have to render every potential character to a texture beforehand (unless I can find a library which doesn't use bitmap fonts). If the user is English, the game is obviously not going to use the Cyrillic alphabet. So why bother rendering Cyrillic characters beforehand? - Klems
