Aggregate functions perform summary operations on a set of values, such as counting, averaging, or finding minimum or maximum values. Aggregate functions often are used in conjunction with a GROUP BY clause to arrange values from a result set into groups. In this case, the aggregate function produces a summary value for each group.

Grouping can be based on the values in one or more columns of the selected rows. For example, the Country table indicates which continent each country is part of, so you can group the records by continent and calculate the average population of countries in each continent:

SELECT Continent, AVG(Population) FROM Country GROUP BY Continent;

Functions such as AVG() that calculate summary values for groups are known as "aggregate" functions because they're based on aggregates or groups of values. There are several types of aggregate functions. Those discussed here are as follows:

  • MIN() and MAX() find smallest and largest values.
  • SUM() and AVG() summarize numeric values to produce sums (totals) and averages.
  • COUNT() counts rows, values, or the number of distinct values.
  • GROUP_CONCAT() concatenates a set of strings to produce a single string value.

Aggregate functions may be used with or without a GROUP BY clause that places rows into groups. Without a GROUP BY clause, an aggregate function calculates a summary value based on the entire set of selected rows. (That is, MySQL treats all the rows as a single group.) With a GROUP BY clause, an aggregate function calculates a summary value for each group. For example, if a WHERE clause selects 20 rows and the GROUP BY clause arranges them into four groups of five rows each, a summary function produces a value for each of the four groups.

The MIN() and MAX() Aggregate Functions

MIN() and MAX() are comparison functions. They return smallest or largest numeric values, lexically first or last string values, and earliest or latest temporal values. The following queries determine the smallest and largest country populations and the lexically first and last country names:

mysql> SELECT MIN(Population), MAX(Population) FROM Country;
+-----------------+-----------------+
| MIN(Population) | MAX(Population) |
+-----------------+-----------------+
|               0 |      1277558000 |
+-----------------+-----------------+

mysql> SELECT MIN(Name), MAX(Name) FROM Country;
+-------------+-----------+
| MIN(Name)   | MAX(Name) |
+-------------+-----------+
| Afghanistan | Zimbabwe  |
+-------------+-----------+

For string values, the behavior of MIN() and MAX() depends on whether the strings are non-binary or binary. Consider a table t that contains the following string values:

mysql> SELECT name FROM t;
+--------+
| name   |
+--------+
| Calvin |
| alex   |
+--------+

If the name column has a non-binary string data type such as CHAR or TEXT, MAX(name) determines which value is greatest based on the string collation. For the default case-insensitive collation of latin1_swedish_ci, MAX() returns 'Calvin' because 'c' is greater than 'a':

mysql> SELECT MAX(name) FROM t;
+-----------+
| MAX(name) |
+-----------+
| Calvin    |
+-----------+

If the name column has a binary string data type such as BINARY or BLOB, its values are compared using the numeric values of the bytes in the strings. If 'C' has a smaller numeric value 'a' (as is true if characters are stored using ASCII codes), MAX(name) returns 'alex':

mysql> ALTER TABLE t MODIFY name BINARY(20);

mysql> SELECT MAX(name) FROM t;
+-----------+
| MAX(name) |
+-----------+
| alex      |
+-----------+

It is to be noted that MIN() and MAX() ignore NULL values.

The SUM() and AVG() Aggregate Functions

The SUM() and AVG() functions calculate sums and averages. For example, the Country table in the world database contains a Population column, so you can calculate the total world population and the average population per country like this:

mysql> SELECT SUM(Population), AVG(Population) FROM Country;
+-----------------+-----------------+
| SUM(Population) | AVG(Population) |
+-----------------+-----------------+
|      6078749450 |   25434098.1172 |
+-----------------+-----------------+

SUM() and AVG() are most commonly used with numeric values. If you use them with other types of values, those values are subject to numeric conversion, which might not produce a sensible result.

It is to be noted that SUM() and AVG() ignore NULL values.

The COUNT() Aggregate Function

The COUNT() function can be used in several ways to count either rows or values. To illustrate, the examples here use the following table that has several rows containing various combinations of NULL and non-NULL values:

mysql> SELECT i, j FROM t;
+------+------+
| i    | j    |
+------+------+
|    1 | NULL |
| NULL |    2 |
|    1 |    1 |
|    1 |    1 |
|    1 |    3 |
| NULL | NULL |
|    1 | NULL |
+------+------+

COUNT() may be used as follows:

  • COUNT(*) counts the total number of rows:

    mysql> SELECT COUNT(*) FROM t;
    +----------+
    | COUNT(*) |
    +----------+
    |        7 |
    +----------+
    
  • COUNT(expression) counts the number of non-NULL values of the given expression. It's common for expression to be a column name, in which case COUNT() counts the number of non-NULL values in the column:

    mysql> SELECT COUNT(i), COUNT(j) FROM t;
    +----------+----------+
    | COUNT(i) | COUNT(j) |
    +----------+----------+
    |        5 |        4 |
    +----------+----------+
    
  • COUNT(DISTINCT expression) counts the number of distinct (unique) non-NULL values of the given expression. expression can be a column name to count the number of distinct non-NULL values in the column:

    mysql> SELECT COUNT(DISTINCT i), COUNT(DISTINCT j) FROM t;
    +-------------------+-------------------+
    | COUNT(DISTINCT i) | COUNT(DISTINCT j) |
    +-------------------+-------------------+
    |                 1 |                 3 |
    +-------------------+-------------------+
    

    It's also possible to give a list of expressions separated by commas. In this case, COUNT() returns the number of distinct combinations of values that contain no NULL values. The following query counts the number of distinct rows for which neither i nor j is NULL:

    mysql> SELECT COUNT(DISTINCT i, j) FROM t;
    +----------------------+
    | COUNT(DISTINCT i, j) |
    +----------------------+
    |                    2 |
    +----------------------+
    

The GROUP_CONCAT() Function

The purpose of the GROUP_CONCAT() function is to concatenate column values into a single string. This is useful if you would otherwise perform a lookup of many rows and then concatenate them on the client end. For example, the following query displays the languages spoken in Thailand, one per line:

mysql> SELECT Language FROM CountryLanguage WHERE CountryCode = 'THA';
+----------+
| Language |
+----------+
| Chinese  |
| Khmer    |
| Kuy      |
| Lao      |
| Malay    |
| Thai     |
+----------+

To concatenate the values into a single string, use GROUP_CONCAT():

mysql> SELECT GROUP_CONCAT(Language) AS Languages FROM CountryLanguage WHERE CountryCode = 'THA';
+----------------------------------+
| Languages                        |
+----------------------------------+
| Chinese,Khmer,Kuy,Lao,Malay,Thai |
+----------------------------------+

GROUP_CONCAT() supports several modifiers:

  • The default string separator used by GROUP_CONCAT() is ',' (comma). To change the separator, use a SEPARATOR clause:

    mysql> SELECT GROUP_CONCAT(Language SEPARATOR ' - ') AS Languages
           FROM CountryLanguage WHERE CountryCode = 'THA';
    +--------------------------------------------+
    | Languages                                  |
    +--------------------------------------------+
    | Chinese - Khmer - Kuy - Lao - Malay - Thai |
    +--------------------------------------------+
    
  • GROUP_CONCAT() adds strings to the result in the order in which the database server reads them. To change the concatenation order, add an ORDER BY clause. You can specify ASC or DESC to control the direction of sorting, just as when you use ORDER BY in other contexts:

    mysql> SELECT GROUP_CONCAT(Language ORDER BY Language DESC) AS Languages
    	      FROM CountryLanguage WHERE CountryCode = 'THA';
    +----------------------------------+
    | Languages                        |
    +----------------------------------+
    | Thai,Malay,Lao,Kuy,Khmer,Chinese |
    +----------------------------------+
    
  • DISTINCT removes duplicates from the set of concatenated strings. The following two statements both select the languages spoken in North and South Korea, but the second statement eliminates duplicates:

    mysql> SELECT GROUP_CONCAT(Language) AS Languages
           FROM CountryLanguage WHERE CountryCode IN('PRK','KOR');
    +-------------------------------+
    | Languages                     |
    +-------------------------------+
    | Chinese,Korean,Chinese,Korean |
    +-------------------------------+
    
    mysql> SELECT GROUP_CONCAT(DISTINCT Language) AS Languages
           FROM CountryLanguage WHERE CountryCode IN('PRK','KOR');
    +----------------+
    | Languages      |
    +----------------+
    | Chinese,Korean |
    +----------------+
    

It is to be noted that GROUP_CONCAT() ignores NULL values.

Aggregation for NULL Values or Empty Sets

In general, aggregate functions ignore NULL values. The exception is COUNT(), which behaves as follows:

  • COUNT(*) does not ignore NULL values because it counts rows, even those that contain NULL values.
  • COUNT(expression) and COUNT(DISTINCT) do ignore NULL values.

A SELECT statement might produce an empty result set if the table is empty or the WHERE clause selects no rows from it. If the set of values passed to an aggregate function is empty, the function computes the most sensible value. For COUNT(), the result is zero. But functions such as MIN(), MAX(), SUM(), AVG(), and GROUP_CONCAT() return NULL. They also return NULL if a non-empty result contains only NULL values. These behaviors occur because there is no way for such functions to compute results without at least one non-NULL input value.

Posted in: MySQL

Related FAQ's

Marius Ion ANGEL HOT SOFT LLC (800) 316-7677