How do I make an SQL query operate on a subset of data without repeating myself?

I have a table with app statistics; let's say the columns are os_version and app_id (simplifying). I want to list all OS versions for a certain app, and for each OS version I want to see the number of records that have at least this OS version. Example data:

app1 1.0
app1 2.0
app2 1.0

Now for app1 I want to see:

version | score | comments
--------+-------+---------
1.0     | 2     | there are two records having OS at least 1.0
2.0     | 1     | there is just one record with OS at least 2.0

Now, pardon my ignorant SQL, but I have come up with this query:

SELECT
    os_version,
    -- number_of_records_having_at_least_this_version / number_of_all_records
    (SELECT COUNT(*) FROM stats WHERE app_id='app1' AND os_version >= outer.os_version)/ 
        (SELECT COUNT(*) FROM stats WHERE app_id='app1') AS score
FROM (SELECT * FROM stats WHERE app_id='app1') AS outer
GROUP BY os_version;

This is crazy, as I have to filter by the app ID three times. Is it possible to filter by the app ID first and then use the resulting row set for further operations? Without a temporary table? In SQLite? Something like:

SELECT
    os_version,
    (SELECT COUNT(*) FROM filtered WHERE os_version >= filtered.os_version)/
        (SELECT COUNT(*) FROM filtered) AS score
FROM (SELECT * FROM stats WHERE app_id='app1') AS filtered
GROUP BY os_version;

…which, sadly, does not work.

Asked on January 31, 2012 at 16:01

The real problem is how to write SQL code that identifies that '1.0' is an earlier version than '2.0'.

1 Answer

I think, if I've understood the question properly, that something like this should do the trick:

SELECT
    s1.app_id,
    s1.os_version,
    count(*)
FROM stats s1 INNER JOIN stats s2 ON s1.app_id = s2.app_id 
                                  AND s2.os_version >= s1.os_version
GROUP BY s1.app_id, s1.os_version

EDIT: This returns results filtered by app_id (as with the example query in the question):

SELECT
    s1.app_id,
    s1.os_version,
    count(*)
FROM stats s1 INNER JOIN stats s2 ON s1.app_id = s2.app_id 
                                  AND s2.os_version >= s1.os_version
WHERE s1.app_id = 'app1'
GROUP BY s1.app_id, s1.os_version
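
As a quick check, the filtered query above can be run against the example data from the question. This is a minimal sketch using Python's built-in sqlite3 module and an in-memory database; the function name count_at_least is only illustrative, and the ORDER BY is an addition to make the output order deterministic.

```python
import sqlite3


def count_at_least(rows, app_id):
    """Run the filtered self-join and return (os_version, count) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stats (app_id TEXT, os_version TEXT)")
    conn.executemany("INSERT INTO stats VALUES (?, ?)", rows)
    query = """
        SELECT s1.os_version, COUNT(*)
        FROM stats s1 INNER JOIN stats s2 ON s1.app_id = s2.app_id
                                          AND s2.os_version >= s1.os_version
        WHERE s1.app_id = ?
        GROUP BY s1.app_id, s1.os_version
        ORDER BY s1.os_version  -- added only for a stable output order
    """
    result = conn.execute(query, (app_id,)).fetchall()
    conn.close()
    return result


# The three example rows from the question.
EXAMPLE_ROWS = [("app1", "1.0"), ("app1", "2.0"), ("app2", "1.0")]

print(count_at_least(EXAMPLE_ROWS, "app1"))  # [('1.0', 2), ('2.0', 1)]
```

The counts match the table in the question: two records at 1.0 or later, one at 2.0 or later.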

EDIT2: The same filter, moved into the join condition:

SELECT
    s1.app_id,
    s1.os_version,
    count(*)
FROM stats s1 INNER JOIN stats s2 ON s1.app_id = s2.app_id 
                                  AND s2.os_version >= s1.os_version
                                  AND s1.app_id = 'app1'
GROUP BY s1.app_id, s1.os_version
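
For the original wish of filtering by app ID only once, a common table expression (WITH clause) is one option. This is a different technique from the self-join above, and it assumes SQLite 3.8.3 or later, which is where WITH support was added. The sketch below returns the raw count per version rather than the ratio from the question; the names filtered, v, f and score_by_version are illustrative. It also counts over distinct versions, because the self-join counts every pair of matching rows, so a version that appears in several rows for the same app would have its count multiplied.

```python
import sqlite3

# The app_id filter appears exactly once, inside the CTE.
QUERY = """
    WITH filtered AS (
        SELECT os_version FROM stats WHERE app_id = ?
    )
    SELECT v.os_version,
           (SELECT COUNT(*) FROM filtered f
            WHERE f.os_version >= v.os_version) AS score
    FROM (SELECT DISTINCT os_version FROM filtered) AS v
    ORDER BY v.os_version
"""


def score_by_version(rows, app_id):
    """Return (os_version, number of records with at least that version)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stats (app_id TEXT, os_version TEXT)")
    conn.executemany("INSERT INTO stats VALUES (?, ?)", rows)
    result = conn.execute(QUERY, (app_id,)).fetchall()
    conn.close()
    return result


# Example data with a duplicated (app1, 1.0) record.
rows = [("app1", "1.0"), ("app1", "1.0"), ("app1", "2.0"), ("app2", "1.0")]
print(score_by_version(rows, "app1"))  # [('1.0', 3), ('2.0', 1)]
```

The version comparison here is still a plain text comparison, so it inherits the '2.11' versus '2.7' ordering problem.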

EDIT3: To get round the 2.11 < 2.7 problem

SELECT
    s1.app_id,
    s1.os_version,
    count(*)
FROM stats s1 INNER JOIN stats s2 ON s1.app_id = s2.app_id 
              AND CAST(REPLACE(s2.os_version, '.', '') AS INTEGER) >= 
                  CAST(REPLACE(s1.os_version, '.', '') AS INTEGER)
              AND s1.app_id = 'app1'
GROUP BY s1.app_id, s1.os_version
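
One caveat with stripping the dots: it fixes 2.11 versus 2.7 (211 versus 27), but it can still misorder versions whose parts differ in length, for example '10.0' becomes 100 and '2.11' becomes 211. A possible alternative, sketched below, compares the major and minor parts as separate integers. It assumes every version has exactly the form major.minor with a minor part below 1000; version_key and the other function names are hypothetical helpers, not part of the answer.

```python
import sqlite3


def version_key(col):
    """SQL expression mapping 'major.minor' text to major * 1000 + minor.

    Assumption: exactly one dot, and the minor part is below 1000.
    """
    return (
        f"(CAST(substr({col}, 1, instr({col}, '.') - 1) AS INTEGER) * 1000"
        f" + CAST(substr({col}, instr({col}, '.') + 1) AS INTEGER))"
    )


def dot_stripping_at_least(a, b):
    """Evaluate the dot-stripping comparison: is version a >= version b?"""
    conn = sqlite3.connect(":memory:")
    sql = ("SELECT CAST(REPLACE(?, '.', '') AS INTEGER)"
           " >= CAST(REPLACE(?, '.', '') AS INTEGER)")
    value = conn.execute(sql, (a, b)).fetchone()[0]
    conn.close()
    return bool(value)


def count_at_least_numeric(rows, app_id):
    """Self-join as in the answer, but comparing the numeric version keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stats (app_id TEXT, os_version TEXT)")
    conn.executemany("INSERT INTO stats VALUES (?, ?)", rows)
    query = f"""
        SELECT s1.os_version, COUNT(*)
        FROM stats s1 INNER JOIN stats s2 ON s1.app_id = s2.app_id
             AND {version_key('s2.os_version')} >= {version_key('s1.os_version')}
        WHERE s1.app_id = ?
        GROUP BY s1.app_id, s1.os_version
        ORDER BY {version_key('s1.os_version')}
    """
    result = conn.execute(query, (app_id,)).fetchall()
    conn.close()
    return result


print(dot_stripping_at_least("10.0", "2.11"))  # False, although 10.0 is newer
sample = [("app1", "2.7"), ("app1", "2.11"), ("app1", "10.0")]
print(count_at_least_numeric(sample, "app1"))
# [('2.7', 3), ('2.11', 2), ('10.0', 1)]
```

Versions with more than two parts (such as 4.0.3) would need a different key, so this is only a starting point.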

Answered on January 31, 2012 at 21:01

This looks promising, but does not yield any results in a reasonable timeframe on my machine :) The query mentioned in the question finishes almost immediately. - zoul

Strange. Assuming you have indexes on your id and version columns, this is about as performant as you can get. I've tested on a table with ~40k records and it returns almost immediately for me. It should be quicker if you filter on the app_id, but I'm not sure if that's what you want? - StevieG

I’m an SQL noob, so it’s quite possible that I am doing something incredibly stupid. There’s around 800k rows in the table. I’ll check the indexes. - zoul

looks good to me. Adding 'app1' in a where clause will filter by app_id too - CD Jorgensen

How would s2.os_version >= s1.os_version work with '2.11' >= '2.7'? - ypercubeᵀᴹ
