Retrieve detail rows of a group based on the grand total

I have a table that looks like this:

+------+------+------------------+
| item | val  |    timestamp     |
+------+------+------------------+
|  1   | 3.66 | 16-05-2011 09:17 | 
|  1   | 2.56 | 16-05-2011 09:47 | 
|  2   | 4.23 | 16-05-2011 09:37 | 
|  3   | 6.89 | 16-05-2011 11:26 | 
|  3   | 1.12 | 16-05-2011 12:11 |
|  3   | 4.56 | 16-05-2011 13:23 |
|  4   | 1.10 | 16-05-2011 14:11 |
|  4   | 9.79 | 16-05-2011 14:23 |
|  5   | 1.58 | 16-05-2011 15:27 |
|  5   | 0.80 | 16-05-2011 15:29 |
|  6   | 3.80 | 16-05-2011 15:29 |
+------+------+------------------+

So the grand total of all items for the day 16 May 2011 is 40.09.

Now I want to retrieve which items of this list make up 80% of the grand total. Let me give an example:

  • Grand Total : 40.09
  • 80% of the Grand Total : 32.07

Starting from the item with the largest share of the total amount, I want to retrieve the grouped list of items that make up 80% of the grand total:

+------+------+
| item | val  |
+------+------+
|  3   | 12.57|
|  4   | 10.89|
|  1   |  6.22|
+------+------+

As you can see, the result set contains the values grouped by item code and ordered by descending share of the grand total, stopping before the 80% threshold is exceeded.

From item 2 onward, the items are discarded from the result set because including them would exceed the 80% threshold:

12.57 + 10.89 + 6.22 + 4.23 = 33.91 > 32.07 (80% of the grand total)

This is not homework; it is a real situation where I am stuck, and I need to achieve the result with a single query.

The query should run unmodified or with few changes on MySQL, SQL Server, and PostgreSQL.

asked Nov 8 '11 at 18:11

So do you really just need to sum the values by item, and then return the largest sums for as long as their running total is < 80% of the grand total?

I need to retrieve a result set like the one in the post: the list of elements whose sum is nearest to 80% of the grand total without exceeding it.

2 Answers

You can do this with a single query:

-- Grand total of all values
WITH Total_Sum(overallTotal) as (SELECT SUM(val) 
                                 FROM dataTable), 

     -- Total per item (id stands for the item code)
     Summed_Items(id, total) as (SELECT id, SUM(val)
                                 FROM dataTable
                                 GROUP BY id),

     -- Rank the per-item totals, largest first
     Ordered_Sums(id, total, ord) as (SELECT id, total, 
                                         ROW_NUMBER() OVER(ORDER BY total DESC)
                                      FROM Summed_Items),

     -- Recursive step: take the next-ranked item while the running total
     -- (the last column, named overallTotal here) stays below 80%
     Percent_List(id, itemTotal, ord, overallTotal) as (
                  SELECT id, total, ord, total
                  FROM Ordered_Sums
                  WHERE ord = 1
                  UNION ALL
                  SELECT b.id, b.total, b.ord, b.total + a.overallTotal
                  FROM Percent_List as a
                  JOIN Ordered_Sums as b
                  ON b.ord = a.ord + 1
                  JOIN Total_Sum as c
                  ON (c.overallTotal * .8) > (a.overallTotal + b.total))

SELECT id, itemTotal
FROM Percent_List

Which will yield the following:

id      itemTotal
3       12.57  
4       10.89  
1       6.22  

Please note that this will not work in MySQL (no CTEs), and will require a more recent version of PostgreSQL to work (otherwise OLAP functions are not supported). SQL Server should be able to run the statement as-is (I think; this was written and tested on DB2). Otherwise, you may attempt to translate this into correlated table joins, etc., but it will not be pretty, if it's even possible (a stored procedure or re-assembly in a higher-level language may then be your only option).
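For what it's worth, here is a rough sketch of what such a correlated-subquery translation might look like, avoiding both CTEs and window functions. It is wrapped in Python's sqlite3 module only so that it is self-contained and runnable. The table name dataTable comes from the query above, the item column comes from the question, and top_items is a made-up helper name. It has not been checked against MySQL, SQL Server or PostgreSQL, and items with exactly equal totals are counted together in the running sum, so treat it as a starting point rather than a finished solution.

```python
import sqlite3

# Sample data from the question as (item, val); timestamps are left out.
ROWS = [
    (1, 3.66), (1, 2.56), (2, 4.23), (3, 6.89), (3, 1.12), (3, 4.56),
    (4, 1.10), (4, 9.79), (5, 1.58), (5, 0.80), (6, 3.80),
]

# For each per-item total, the correlated subquery sums every per-item total
# that is at least as large (a running total in descending order) and keeps
# the item only while that running total is within 80% of the grand total.
QUERY = """
SELECT s.item, s.total
FROM (SELECT item, SUM(val) AS total FROM dataTable GROUP BY item) AS s
WHERE (SELECT SUM(s2.total)
       FROM (SELECT item, SUM(val) AS total
             FROM dataTable GROUP BY item) AS s2
       WHERE s2.total >= s.total)
      <= 0.8 * (SELECT SUM(val) FROM dataTable)
ORDER BY s.total DESC
"""


def top_items():
    """Load the sample rows into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dataTable (item INTEGER, val REAL)")
    conn.executemany("INSERT INTO dataTable VALUES (?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    # Round only for display; the comparison above uses the raw sums.
    return [(item, round(total, 2)) for item, total in rows]


if __name__ == "__main__":
    print(top_items())  # [(3, 12.57), (4, 10.89), (1, 6.22)]
```

Because the running total only grows as the per-item totals get smaller, the filter keeps a prefix of the ordered list and never skips ahead to a smaller item that might still fit.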

answered 09 Nov, 11:01

It worked on PostgreSQL after changing WITH to WITH RECURSIVE. The only problem is MySQL; great technique, however. - aleroot

Please note that this version will not consider 'future' rows that may still add up to less than the cutoff amount - that is, it doesn't skip rows. I was assuming this was the desired behaviour, because otherwise different rows from the initial set should have been included. - Musa Mecánica

This is the intended behaviour, thanks. - aleroot

I don't know of any way this can be done with a single query; you'll probably have to create a stored procedure. The steps of the proc would be something like this:

  1. Calculate the grand total for that day by using a SUM
  2. Get the individual records for that day ordered by val DESC
  3. Keep a running total as you loop through the individual records; as long as the running total is < 0.8 * grandtotal, add the current record to your list
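The three steps above can be sketched outside the database as well. The following is only an illustration in Python, not stored-procedure code. It assumes the rows are first summed per item (as in the question's expected result), that an item is kept only if the running total including it stays below the cutoff, and that the loop stops at the first item that does not fit. The function name items_up_to_share and its share parameter are invented for this example.

```python
from collections import defaultdict

# Sample data from the question as (item, val); timestamps are left out.
ROWS = [
    (1, 3.66), (1, 2.56), (2, 4.23), (3, 6.89), (3, 1.12), (3, 4.56),
    (4, 1.10), (4, 9.79), (5, 1.58), (5, 0.80), (6, 3.80),
]


def items_up_to_share(rows, share=0.8):
    """Return (item, total) pairs, largest first, whose running total
    stays below `share` of the grand total."""
    # Assumption: group by item first, as in the question's expected output.
    totals = defaultdict(float)
    for item, val in rows:
        totals[item] += val

    # Step 1: grand total.
    grand_total = sum(totals.values())

    # Step 2: per-item totals ordered from largest to smallest.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    # Step 3: keep a running total and stop at the first item that would
    # reach or pass the cutoff (no skipping ahead to smaller items).
    picked = []
    running = 0.0
    for item, total in ranked:
        if running + total >= share * grand_total:
            break
        running += total
        picked.append((item, round(total, 2)))
    return picked


if __name__ == "__main__":
    print(items_up_to_share(ROWS))  # [(3, 12.57), (4, 10.89), (1, 6.22)]
```

Inside a stored procedure the same loop would typically be written with a cursor over the grouped, ordered rows.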

answered 08 Nov, 11:22

@aleroot - You can loop through individual records if you create a stored procedure. Most RDBMSs will allow the procedure code to be written in (almost) any language, including an extension to SQL that contains looping constructs. Check your provider's documentation for specifics. - Musa Mecánica
