Calculate the real length of a string, as we do with the caret

What I want is to calculate how many times the caret will move from the beginning to the end of the string.

Explanation:
Look at the string "󠀁" in this fiddle: http://jsfiddle.net/RFuQ3/
If you put the caret before the first quote and then press the right arrow, you have to press it 3 times to arrive after the second quote (instead of 2 times for an empty string).

The first and easiest way to calculate the length of a string is <string>.length.
But here, it returns 2.

The second way, from "JavaScript Get real length of a string (without entities)", gives 2 too.

How can I get 1?


1 - I thought of putting the string in a text input and then running a while loop with a try{setCaret}catch(){}
2 - It's just for fun

asked Jun 30 '12 at 19:06

Just add one to the length returned? -

What are you actually trying to accomplish? That would probably be useful to know. Are you wanting to track cursor crawling left-to-right to detect or manipulate something? -

Also, can you demonstrate that effect somewhere other than jsFiddle, which is something of a unique editing environment? In other words, regular input and textarea elements don't have this effect. jsFiddle's text manipulation scripts aren't perfectly tuned. -

@JaredFarrish If a user presses the arrow key to count the length in a textbox, he will get x. In most cases ("foo",123,ಠ_ಠ) length will give x. But not in my example. -

See my second comment. Do you have another place to demonstrate that's not jsFiddle? What kind of "cursor-based" environment do you have to work with? -

3 Answers

The character in your question "󠀁" is the Unicode Character 'LANGUAGE TAG' (U+E0001).

From the following Stack Overflow questions,

we learn that

JavaScript strings are UCS-2 encoded but can represent Unicode code points outside the Basic Multilingual Plane (U+0000-U+D7FF and U+E000-U+FFFF) using two 16-bit numbers (a UTF-16 surrogate pair), the first of which must be in the range U+D800-U+DBFF and the second in the range U+DC00-U+DFFF.

The UTF-16 surrogate pair representing "󠀁" is U+DB40 and U+DC01. In decimal, U+DB40 is 56128 and U+DC01 is 56321.

console.log("󠀁".length); // 2
console.log("󠀁".charCodeAt(0)); // 56128
console.log("󠀁".charCodeAt(1)); // 56321
console.log("\uDB40\uDC01" === "󠀁"); // true
console.log(String.fromCharCode(0xDB40, 0xDC01) === "󠀁"); // true
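The two code units above follow from the standard UTF-16 arithmetic. As a minimal sketch (the helper name toSurrogatePair is hypothetical and not part of the original answer), the pair for U+E0001 can be derived like this:

```javascript
// Hypothetical helper, for illustration only: split a code point above
// U+FFFF into its UTF-16 high and low surrogate code units.
function toSurrogatePair(codePoint) {
    if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
        throw new RangeError("code point is not in a supplementary plane");
    }
    var offset = codePoint - 0x10000;     // 20-bit value
    var high = 0xD800 + (offset >> 10);   // top 10 bits
    var low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
    return [high, low];
}

var pair = toSurrogatePair(0xE0001);
console.log(pair[0].toString(16), pair[1].toString(16)); // db40 dc01
console.log(pair[0], pair[1]); // 56128 56321
```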

Adapting the code from https://stackoverflow.com/a/4885062/788324, we just need to count the number of code points to arrive at the correct answer:

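var getNumCodePoints = function(str) {
    var numCodePoints = 0;
    for (var i = 0; i < str.length; i++) {
        var charCode = str.charCodeAt(i);
        // Any surrogate code unit (0xD800-0xDFFF) is treated as the start
        // of a pair, so the next code unit is skipped. This assumes the
        // string is well-formed UTF-16.
        if ((charCode & 0xF800) === 0xD800) {
            i++;
        }
        numCodePoints++;
    }
    return numCodePoints;
};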

console.log(getNumCodePoints("󠀁")); // 1

jsFiddle demo
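For comparison, here is a shorter sketch that relies on an assumption not made in the answer: an ES2015 or later engine, where iterating a string yields whole code points rather than 16-bit code units. The function name countCodePoints is hypothetical.

```javascript
// Assumes ES2015+: Array.from uses the string iterator, which walks the
// string code point by code point, so a surrogate pair becomes one element.
function countCodePoints(str) {
    return Array.from(str).length;
}

console.log(countCodePoints("\uDB40\uDC01"));   // 1
console.log(countCodePoints("a\uDB40\uDC01b")); // 3
console.log("a\uDB40\uDC01b".length);           // 4
```

Note that counting code points is not guaranteed to match caret stops in every case; for example, combining marks are separate code points but usually share a caret position with their base character.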

answered May 23 '17 at 12:05

function realLength(str) {
    var i = 1;
    while (str.substring(i,i+1) != "") i++;
    return (i-1);
}

Didn't try the code, but it should work I think.

answered Jun 30 '12 at 19:06

Accepting the premise of the ghost halt on the cursor, "2" still has the issue after the first " (in Firefox). - Jared Farrish

@user1493235 Sorry, but it doesn't work; it just returns <string>.length-1. Look at this: jsfiddle.net/zDwPu (and test the real length with your caret) - Mago

This solution is slow (.substring(i,i+1) instead of [i]), wrong (var i = 1;) and useless (it isn't the answer to the question). - maxarte

No down votes for "Didn't try the code, but it should work I think." ? - user1566694

JavaScript doesn't really support Unicode. You can try

yourstring.replace(/[\uD800-\uDFFF]{2}/g, "0").length

For what it's worth
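As a rough sketch of how this replacement behaves, the snippet below compares the pattern above with a stricter one that requires a high surrogate followed by a low surrogate. The helper name lengthWith and the test strings are assumptions made for illustration.

```javascript
// Hypothetical helper: replace every match of the pattern with a single
// placeholder character, then measure the resulting length.
function lengthWith(str, pattern) {
    return str.replace(pattern, "0").length;
}

var loose = /[\uD800-\uDFFF]{2}/g;               // any two surrogates in a row
var strict = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;  // high surrogate, then low

var wellFormed = "a\uDB40\uDC01b"; // one valid pair between two letters
var reversed = "\uDC01\uDB40";     // low surrogate first: not a valid pair

console.log(lengthWith(wellFormed, loose));  // 3
console.log(lengthWith(wellFormed, strict)); // 3
console.log(lengthWith(reversed, loose));    // 1
console.log(lengthWith(reversed, strict));   // 2
```

On well-formed strings both patterns agree; they differ only when surrogates appear out of order.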

answered Jun 30 '12 at 19:06

Are you sure that uDFFF is the limit? - Mago

I did not understand very well. The limit is \uFFFF or \uDFFF? What should I use? And why? Thank you. - Mago

What exactly do you not understand? It's written uDFFF, what other limit could be there? Probably, more correct expression would be yourstring.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, "0").length, but who cares, really... - panda-34

panda-34, did somebody step on your birthday cake today? Lil' pessimism much? - Jared Farrish
