Difference between two MongoDB queries?

MongoDB:

1: a collection such as {'num':1} {'num':2} {'num':3} {'num':4}

In my program:

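values = [1, 2, 3, 4]
# Approach 1: one query that matches any value in the list
db.collection.find({'num': {'$in': values}})

# Approach 2: one query per value
for i in values:
    db.collection.find({'num': i})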

Is there any performance difference between the two methods?

What if I have this scenario, with two collections such as collection1: {'num':1} {'num':2} {'num':3} {'num':4}

collection2: {'n':1} {'n':2} {'n':3} {'n':4}

nums = db.collection1.find()

1:

for doc in nums:
    db.collection2.find({'n': doc['num']})

2:

values = []
for doc in nums:
    values.append(doc['num'])
db.collection2.find({'n': {'$in': values}})

Is there any performance difference between the two methods?

asked Jul 1 '12 at 14:07

The second one should be slower (more network round trips).

Thanks, but I'd like to know whether the performance difference could be huge. I'd also like to know how the load on the database compares between the two.

The difference will depend on your actual data in the DB, the network, etc.

2 Answers

The first one sends the whole search query to the database at once and searches for [1,2,3,4] in a single round trip.

The second one sends a query for 1, waits for the result, then goes back over the network to search for 2, and so on. This one should be slower.
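To make the difference concrete, here is a minimal sketch of the two approaches as plain functions. The names `find_with_in` and `find_one_by_one` are made up for illustration, and `collection` is assumed to be any object with a pymongo-style `find(query)` method that returns an iterable of documents.

```python
def find_with_in(collection, field, values):
    """One query: every value is matched in a single find() call."""
    return list(collection.find({field: {'$in': list(values)}}))


def find_one_by_one(collection, field, values):
    """One query per value: each find() call is a separate request."""
    results = []
    for value in values:
        # Iterating the returned cursor is what actually fetches the documents.
        results.extend(collection.find({field: value}))
    return results
```

With pymongo you would pass a real collection, for example `find_with_in(db.collection, 'num', [1, 2, 3, 4])`. For the second scenario in the question, the values would first be collected from `collection1` and then passed with `field='n'`.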

answered Jul 1 '12 at 14:07

Yes, in general you would get differences due to a variety of factors:

  • The "one call" approach entails, as Sergio observed, fewer network round trips. If you have a large list, a slow network, and would otherwise scan the whole collection sequentially, this option will run faster.
  • On the other hand, if you have an index on the field you're searching, the single queries will run faster. If you have a small list, a fast network, and slow overall database access, then it is this second option that is likely to run faster.

Depending on what actually happens (e.g., whether the documents in your collections also carry a huge payload, so that accessing them directly instead of through an index is more expensive; how many records there are; and so on), you might see different levels of performance, but you can't say in general which approach will be faster.

Also, the difference is influenced by database size, whether you're sharding or not, and so on.

Frankly, on a big database in the real world, I'd rather run both versions several times under different load conditions and... time them. Too many factors are in play, and network round trips are only one of them.
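A rough way to do that timing is sketched below, using only the standard library. The helper name `time_calls` is made up for illustration, and it assumes each version of the query has been wrapped in a zero-argument callable that fully consumes its cursor.

```python
import statistics
import time


def time_calls(func, repeats=5):
    """Run func() `repeats` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

For example, `time_calls(lambda: list(db.collection.find({'num': {'$in': [1, 2, 3, 4]}})))` would time the single-query version. The median is used here so that one slow run (cold cache, a network hiccup) doesn't dominate the result; that is a design choice, not the only reasonable one.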

If you're designing a system, carefully stake out your assumptions (including growth and scaling). It is very easy to come up with a solution that runs blazingly fast when things are small and turns to molasses when the database grows or is moved to the cloud.

answered Jul 1 '12 at 15:07

This is incorrect. An index will be used in both cases. I haven't been able to find a case where multiple queries are faster. - Asya Kamsky

Thanks. Can you help me with the second scenario? - Wahaha
