Set a MySQL unique key or check for duplicates in the application?

Which approach is more reliable and performs better: setting a MySQL unique key and using INSERT IGNORE, or first checking whether the data exists in the database and acting on the result?

If the answer is the second one, is there any way to make a single SQL query instead of two?

UPDATE: I ask because my colleagues at the company where I work believe that such issues should be dealt with in the application, which according to them is more reliable.

asked Jan 31 '12 at 08:01

This is obvious. See item 22 on Top 1000 SQL Performance tips.

Why haven't you tested them? It seems trivial to test both methods. Why not use empirical data?

3 Answers

Your application will not catch duplicates.

Two concurrent calls can insert the same data, because each process doesn't see the other while your application checks for uniqueness. Each process thinks it's OK to INSERT.
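The race described above can be reproduced in a small sketch. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for MySQL so the example runs anywhere, the `users` table and `email` column are hypothetical, and the two "requests" are interleaved by hand rather than run as real concurrent processes.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; `users` and `email` are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")  # note: no unique key on email


def email_exists(email):
    """Application-side duplicate check: True if the email is already stored."""
    row = conn.execute("SELECT 1 FROM users WHERE email = ?", (email,)).fetchone()
    return row is not None


# Two requests, A and B, interleaved by hand to mimic concurrent execution.
a_sees_duplicate = email_exists("bob@example.com")  # A checks: nothing there yet
b_sees_duplicate = email_exists("bob@example.com")  # B checks before A has inserted

# Each request trusts its own earlier check and inserts.
if not a_sees_duplicate:
    conn.execute("INSERT INTO users (email) VALUES (?)", ("bob@example.com",))
if not b_sees_duplicate:
    conn.execute("INSERT INTO users (email) VALUES (?)", ("bob@example.com",))

duplicate_count = conn.execute(
    "SELECT COUNT(*) FROM users WHERE email = ?", ("bob@example.com",)
).fetchone()[0]
print(duplicate_count)  # both inserts succeeded
```

Both checks pass because neither request can see the other's pending insert, so the table ends up with two identical rows.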

You can force some kind of serialisation, but then you have a bottleneck and a performance limit. And you will have other clients writing to the database, even if it is just a release script.

That is why there are such things as unique indexes and constraints generally. Foreign keys, triggers, check constraints, NULL/NOT NULL, and datatype constraints are all there to enforce data integrity.
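As a rough illustration of the database enforcing uniqueness regardless of what the client checked, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL and a hypothetical `users` table; in MySQL the failure would surface as a duplicate-key error from the driver rather than `sqlite3.IntegrityError`.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; `users` and `email` are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")

# The first writer succeeds.
conn.execute("INSERT INTO users (email) VALUES (?)", ("bob@example.com",))

# A second writer, which never ran any application-side check, is still rejected.
try:
    conn.execute("INSERT INTO users (email) VALUES (?)", ("bob@example.com",))
    second_insert_rejected = False
except sqlite3.IntegrityError:
    second_insert_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(second_insert_rejected, row_count)
```

The constraint holds for every client that writes to the table, including ones that bypass the application entirely.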

There is also the arrogance of some code monkey thinking they can do better.

See programmers.se: Constraints in a relational databases - Why not remove them completely? and this one: Enforcing database constraints in application code (SO)

answered May 23 '17 at 13:05

Setting a unique key is better. It reduces the number of round-trips to MySQL needed for a single operation, and item uniqueness is guaranteed, which reduces errors caused by your own logic.
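A single-statement version might look like the sketch below. It is written against Python's built-in sqlite3 module so it can be run without a server, which is an assumption of the example: SQLite spells the statement `INSERT OR IGNORE`, while MySQL spells it `INSERT IGNORE`. The `users` table and the `insert_if_absent` helper are hypothetical names.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; `users` and `email` are hypothetical names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")


def insert_if_absent(email):
    """Insert the row in one statement; return True if it was new, False if it was a duplicate."""
    # MySQL equivalent: INSERT IGNORE INTO users (email) VALUES (...)
    cur = conn.execute("INSERT OR IGNORE INTO users (email) VALUES (?)", (email,))
    # rowcount is 1 when a row was written and 0 when the unique key caused it to be skipped.
    return cur.rowcount == 1


first = insert_if_absent("bob@example.com")
second = insert_if_absent("bob@example.com")
print(first, second)
```

The affected-row count tells the application whether the row was new, so no separate SELECT is needed. One caveat worth keeping in mind: in MySQL, `INSERT IGNORE` also downgrades some other errors to warnings, so it is broader than a pure duplicate check.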

answered Jan 31 '12 at 12:01

You definitely should set a unique key in your MySQL table, no matter what you decide.

As for the other part of your question, definitely use INSERT IGNORE or INSERT ... ON DUPLICATE KEY UPDATE if that is what you intend for your application.

I.e. if you're going to load a bunch of data and you don't care what the old data was, you just want the new data, that is the way to go.

On the other hand, if there is some sort of decision branch that is based on whether the change is an update or a new value, I think you would have to go with option 2.

I.e. If changes to the table are recorded in some other table (e.g. table: change_log with columns: id,table,column,old_val,new_val), then you couldn't just use INSERT IGNORE because you would never be able to tell which values were changed vs. which were newly inserted.
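A sketch of that second case, where the application needs to know whether a write was an insert or an update, could look like this. Several things here are assumptions for illustration only: Python's sqlite3 module stands in for MySQL, the `settings` table and `set_value` helper are hypothetical, and `change_log` is reduced to three columns. The unique key stays in place as a safety net; in MySQL one would typically also lock the row inside the transaction (for example with `SELECT ... FOR UPDATE`), which this sketch does not show.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (name TEXT PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE change_log (name TEXT, old_val TEXT, new_val TEXT)")


def set_value(name, new_val):
    """Insert or update a setting and record the old and new values in change_log."""
    with conn:  # one transaction: commits on success, rolls back on error
        row = conn.execute(
            "SELECT value FROM settings WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            old_val = None  # newly inserted value
            conn.execute(
                "INSERT INTO settings (name, value) VALUES (?, ?)", (name, new_val)
            )
        else:
            old_val = row[0]  # existing value being changed
            conn.execute(
                "UPDATE settings SET value = ? WHERE name = ?", (new_val, name)
            )
        conn.execute(
            "INSERT INTO change_log (name, old_val, new_val) VALUES (?, ?, ?)",
            (name, old_val, new_val),
        )
    return old_val


set_value("theme", "light")
set_value("theme", "dark")
log = conn.execute(
    "SELECT name, old_val, new_val FROM change_log ORDER BY rowid"
).fetchall()
print(log)
```

A NULL `old_val` marks a new row, and a non-NULL one marks an update. If a concurrent writer inserts the same key between the SELECT and the INSERT, the primary key raises an error instead of silently creating a duplicate.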

answered Jan 31 '12 at 12:01
