Thursday, January 3, 2008

MySQL Stored Functions

What's a Stored Function?

If procedural programming is new to you, you may be wondering what the difference is between a Stored Procedure and a Stored Function. Not too much really. A function always returns a result, and can be called inside an SQL statement just like ordinary SQL functions. Function parameters are all the equivalent of the IN procedure parameter (there are no OUT or INOUT parameters), because a function uses the RETURN keyword to determine what is passed back. Stored functions are also slightly more limited than stored procedures in which SQL statements they can run.
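To make the difference concrete, here is a minimal sketch of the same calculation written both ways. The names double_it and double_it_proc are hypothetical and used only for illustration.

```sql
-- Hypothetical names (double_it, double_it_proc), chosen for illustration only.
DELIMITER |

-- A function returns its value with RETURN.
CREATE FUNCTION double_it (n INT)
  RETURNS INT
  DETERMINISTIC
  RETURN n * 2|

-- A procedure hands its value back through an OUT parameter.
CREATE PROCEDURE double_it_proc (IN n INT, OUT result INT)
  SET result = n * 2|

-- The function is used inside an expression, like a built-in function.
SELECT double_it(21) AS doubled|

-- The procedure is invoked with CALL, and the result is read afterwards.
CALL double_it_proc(21, @result)|
SELECT @result AS doubled|
```

Both SELECTs should show the same value; the function simply saves the extra CALL step and can be embedded in a larger query.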

A Stored Function example

Here is an example of a stored function:

mysql> DELIMITER |
mysql>
 CREATE FUNCTION WEIGHTED_AVERAGE (n1 INT, n2 INT, n3 INT, n4 INT)
  RETURNS INT
   DETERMINISTIC
    BEGIN
     DECLARE avg INT;
     SET avg = (n1+n2+n3*2+n4*4)/8;
     RETURN avg;
    END|
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT WEIGHTED_AVERAGE(70,65,65,60)\G
*************************** 1. row ***************************
WEIGHTED_AVERAGE(70,65,65,60): 63
1 row in set (0.00 sec)

As mentioned in the first stored procedures tutorial, we declare the "|" symbol as a delimiter, so that our function body can use ordinary ";" characters. This function returns a weighted average, such as could be used to determine an overall result for a subject. The third test score is weighted twice as heavily as the first and second scores, while the fourth score counts four times as much, so the weights add up to the 8 we divide by. We also make use of the DECLARE statement (declaring a variable) and the DETERMINISTIC characteristic (telling MySQL that, given the same input, the function will always return the same result), as discussed in earlier tutorials.
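The arithmetic can be checked by hand: 70 + 65 + 65*2 + 60*4 = 505, and 505 / 8 = 63.125, which the INT variable stores as 63. The sketch below runs the raw expression and then defines a variant that keeps two decimal places. The name WEIGHTED_AVERAGE_DEC and the DECIMAL(5,2) return type are assumptions for illustration, not part of the original example.

```sql
DELIMITER |

-- The raw expression: the weights 1, 1, 2 and 4 add up to the divisor 8.
SELECT (70 + 65 + 65*2 + 60*4) / 8 AS raw_average|

-- Hypothetical variant that returns a DECIMAL instead of an INT,
-- so the fractional part of the average is not lost.
CREATE FUNCTION WEIGHTED_AVERAGE_DEC (n1 INT, n2 INT, n3 INT, n4 INT)
  RETURNS DECIMAL(5,2)
  DETERMINISTIC
  RETURN (n1 + n2 + n3*2 + n4*4) / 8|

SELECT WEIGHTED_AVERAGE_DEC(70,65,65,60) AS average|
```

Whether a whole-number result or a fractional one is wanted depends on how the marks are reported; the INT version in the tutorial is fine when only whole marks matter.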

Accessing tables in stored functions

Stored functions in early versions of MySQL 5.0 (< 5.0.10) could not reference tables except in a very limited capacity. That limited their usefulness to a large degree. Newer versions can now do so, but still cannot make use of statements that return a result set. So, no SELECT queries returning result sets from a table. However, you can get around this by using SELECT INTO. For the next example, we create a table allowing us to store 4 marks, and a name. Then we will define a new WEIGHTED_AVERAGE function to make use of the dynamic data from the table.

mysql> CREATE TABLE sfdata(mark1 INT,mark2 INT,mark3 INT,mark4 INT,name VARCHAR(50))|
mysql> INSERT INTO sfdata VALUES(70,65,65,60,'Mark')|
mysql> INSERT INTO sfdata VALUES(95,94,75,50,'Pavlov')|
mysql>
 CREATE FUNCTION WEIGHTED_AVERAGE2 (v1 VARCHAR(50))
  RETURNS INT
  DETERMINISTIC
   BEGIN
    DECLARE i1,i2,i3,i4,avg INT;
    SELECT mark1,mark2,mark3,mark4 INTO i1,i2,i3,i4 FROM sfdata WHERE name=v1;
    SET avg = (i1+i2+i3*2+i4*4)/8; 
    RETURN avg;
   END|
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT WEIGHTED_AVERAGE2('Pavlov') AS Pavlov, WEIGHTED_AVERAGE2('Mark') AS Mark\G
*************************** 1. row ***************************
Pavlov: 67
  Mark: 63
1 row in set (0.00 sec)

By SELECTing the contents of the mark1 to mark4 columns INTO the variables we have just declared, there is no need to return a result set, and we can happily use the results inside the function.
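One point worth noting: the function above is declared DETERMINISTIC even though its result depends on what is stored in the table. A possible variant, sketched below under the assumption that you would rather describe the function as reading data, uses the NOT DETERMINISTIC and READS SQL DATA characteristics. The name WEIGHTED_AVERAGE2_READS and the LIMIT 1 guard are additions for illustration.

```sql
DELIMITER |

-- Same table as in the article; created here only so the sketch stands alone.
CREATE TABLE IF NOT EXISTS sfdata
  (mark1 INT, mark2 INT, mark3 INT, mark4 INT, name VARCHAR(50))|

-- Hypothetical name. A bare "SELECT ... FROM sfdata" (without INTO) in this
-- body would be rejected, because a function may not return a result set.
CREATE FUNCTION WEIGHTED_AVERAGE2_READS (v1 VARCHAR(50))
  RETURNS INT
  NOT DETERMINISTIC
  READS SQL DATA
  BEGIN
    DECLARE i1, i2, i3, i4 INT;
    -- SELECT ... INTO expects at most one row, hence the LIMIT 1 guard
    -- in case two students share a name.
    SELECT mark1, mark2, mark3, mark4 INTO i1, i2, i3, i4
      FROM sfdata WHERE name = v1 LIMIT 1;
    RETURN (i1 + i2 + i3*2 + i4*4) / 8;
  END|

SELECT WEIGHTED_AVERAGE2_READS('Pavlov') AS Pavlov|
```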
All the usual behaviors and conditions apply inside the function. Here is what happens if one of the records is missing a field.

mysql> INSERT INTO sfdata VALUES(90,NULL,70,60,'Isabelle')|
Query OK, 1 row affected (0.18 sec)
mysql> SELECT WEIGHTED_AVERAGE2('Isabelle') AS Isabelle\G
*************************** 1. row ***************************
Isabelle: NULL
1 row in set (0.16 sec)

As expected, the NULL (and NULLs are always a bad idea to use) contaminates the entire result, and MySQL, not knowing what else to do, can do nothing other than return a NULL.
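If a NULL result is not what you want, one option is to substitute a default inside the function. The sketch below uses COALESCE to count a missing mark as 0; that policy, and the name WEIGHTED_AVERAGE2_SAFE, are assumptions made for illustration rather than a recommendation.

```sql
DELIMITER |

-- Same table as in the article; created here only so the sketch stands alone.
CREATE TABLE IF NOT EXISTS sfdata
  (mark1 INT, mark2 INT, mark3 INT, mark4 INT, name VARCHAR(50))|

-- NULL propagates through arithmetic, so this expression yields NULL.
SELECT (90 + NULL + 70*2 + 60*4) / 8 AS average|

-- Hypothetical variant: treating a missing mark as 0 is an assumed policy.
CREATE FUNCTION WEIGHTED_AVERAGE2_SAFE (v1 VARCHAR(50))
  RETURNS INT
  NOT DETERMINISTIC
  READS SQL DATA
  BEGIN
    DECLARE i1, i2, i3, i4 INT;
    SELECT mark1, mark2, mark3, mark4 INTO i1, i2, i3, i4
      FROM sfdata WHERE name = v1 LIMIT 1;
    -- COALESCE returns its first non-NULL argument.
    RETURN (COALESCE(i1,0) + COALESCE(i2,0)
            + COALESCE(i3,0)*2 + COALESCE(i4,0)*4) / 8;
  END|

SELECT WEIGHTED_AVERAGE2_SAFE('Isabelle') AS Isabelle|
```

Note that a name with no matching row also ends up with all four variables NULL, so this variant would report 0 for an unknown student; whether that is acceptable depends on the application.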
Here is the full syntax for stored functions:

CREATE FUNCTION sf_name ([parameter1 [...]])
    RETURNS type
    [
     LANGUAGE SQL
     | [NOT] DETERMINISTIC
     | { CONTAINS SQL | NO SQL | READS SQL DATA | MODIFIES SQL DATA }
     | SQL SECURITY { DEFINER | INVOKER }
     | COMMENT 'string'
    ] 
    SQL statements
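As a rough illustration of how the optional characteristics fit together, here is a small function that uses several of them at once. The function name MARK_COUNT and its comment text are hypothetical.

```sql
DELIMITER |

-- Same table as in the article; created here only so the sketch stands alone.
CREATE TABLE IF NOT EXISTS sfdata
  (mark1 INT, mark2 INT, mark3 INT, mark4 INT, name VARCHAR(50))|

-- Hypothetical function showing several characteristics together.
CREATE FUNCTION MARK_COUNT ()
  RETURNS INT
  NOT DETERMINISTIC              -- the result changes as rows are added
  READS SQL DATA                 -- reads a table but does not modify it
  SQL SECURITY INVOKER           -- runs with the caller's privileges
  COMMENT 'Number of rows in sfdata'
  BEGIN
    DECLARE n INT;
    SELECT COUNT(*) INTO n FROM sfdata;
    RETURN n;
  END|

SELECT MARK_COUNT() AS total_rows|
```

The COMMENT text then shows up in the Comment column of SHOW FUNCTION STATUS, which makes it a handy place for a one-line description.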

Manipulating tables

With the early restrictions on accessing tables inside a function lifted, you can use a function to make changes to a table as well. The next two examples are not an ideal use of functions (in their current form they would be better written as stored procedures), as we are not interested in the result being returned and only want to manipulate the data, but they show you some of the potential power of functions. A function is best used when you want to return a result. Building upon these examples, you can create your own where complex INSERTs, SELECTs and UPDATEs are performed, with a single result being returned at the end of it all. First, we INSERT a record into the sfdata table.

mysql>
 CREATE FUNCTION WEIGHTED_AVERAGE3 (n1 INT,n2 INT,n3 INT,n4 INT,v1 VARCHAR(50))
  RETURNS INT
  DETERMINISTIC
   BEGIN
    DECLARE i1,i2,i3,i4,avg INT;
    INSERT INTO sfdata VALUES(n1,n2,n3,n4,v1);
    RETURN 1;
   END|
Query OK, 0 rows affected (0.08 sec)
mysql> SELECT WEIGHTED_AVERAGE3(50,60,60,50,'Thoko')\G
*************************** 1. row ***************************
WEIGHTED_AVERAGE3(50,60,60,50,'Thoko'): 1
1 row in set (0.00 sec)
mysql> SELECT * FROM sfdata\G
*************************** 1. row ***************************
mark1: 70
mark2: 65
mark3: 65
mark4: 60
 name: Mark
*************************** 2. row ***************************
mark1: 95
mark2: 94
mark3: 75
mark4: 50
 name: Pavlov
*************************** 3. row ***************************
mark1: 90
mark2: NULL
mark3: 70
mark4: 60
 name: Isabelle
*************************** 4. row ***************************
mark1: 50
mark2: 60
mark3: 60
mark4: 50
 name: Thoko
4 rows in set (0.01 sec)

Similarly, the next example UPDATEs a record based upon the parameters passed to it:

mysql>  
 CREATE FUNCTION WEIGHTED_AVERAGE_UPDATE (n1 INT,n2 INT,n3 INT,n4 INT,v1 VARCHAR(50))
  RETURNS INT
  DETERMINISTIC
   BEGIN
    DECLARE i1,i2,i3,i4,avg INT;
    UPDATE sfdata SET mark1=n1,mark2=n2,mark3=n3,mark4=n4 WHERE name=v1;
    RETURN 1;
   END|
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT WEIGHTED_AVERAGE_UPDATE(60,60,60,50,'Thoko')\G
*************************** 1. row ***************************
WEIGHTED_AVERAGE_UPDATE(60,60,60,50,'Thoko'): 1
1 row in set (0.00 sec)
mysql> SELECT * FROM sfdata\G
*************************** 1. row ***************************
mark1: 70
mark2: 65
mark3: 65
mark4: 60
 name: Mark
*************************** 2. row ***************************
mark1: 95
mark2: 94
mark3: 75
mark4: 50
 name: Pavlov
*************************** 3. row ***************************
mark1: 90
mark2: NULL
mark3: 70
mark4: 60
 name: Isabelle
*************************** 4. row ***************************
mark1: 60
mark2: 60
mark3: 60
mark4: 50
 name: Thoko
4 rows in set (0.01 sec)
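Since neither of these two functions returns anything useful, the same work could be written as stored procedures instead. The following is only a sketch: the procedure names ADD_MARKS and UPDATE_MARKS and the sample row are hypothetical.

```sql
DELIMITER |

-- Same table as in the article; created here only so the sketch stands alone.
CREATE TABLE IF NOT EXISTS sfdata
  (mark1 INT, mark2 INT, mark3 INT, mark4 INT, name VARCHAR(50))|

-- Hypothetical procedure equivalents: no dummy return value is needed.
CREATE PROCEDURE ADD_MARKS
  (IN n1 INT, IN n2 INT, IN n3 INT, IN n4 INT, IN v1 VARCHAR(50))
  MODIFIES SQL DATA
  INSERT INTO sfdata VALUES (n1, n2, n3, n4, v1)|

CREATE PROCEDURE UPDATE_MARKS
  (IN n1 INT, IN n2 INT, IN n3 INT, IN n4 INT, IN v1 VARCHAR(50))
  MODIFIES SQL DATA
  UPDATE sfdata SET mark1=n1, mark2=n2, mark3=n3, mark4=n4 WHERE name=v1|

-- Procedures are invoked with CALL rather than inside a SELECT.
CALL ADD_MARKS(50, 60, 60, 50, 'Example')|
CALL UPDATE_MARKS(60, 60, 60, 50, 'Example')|
```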

Information about existing stored functions

As with stored procedures, there are various ways to get at the metadata about existing stored functions. There are SHOW CREATE FUNCTION and SHOW FUNCTION STATUS. The former returns the CREATE statement for the named function, while the latter returns metadata about all existing functions, as follows.

mysql> SHOW CREATE FUNCTION WEIGHTED_AVERAGE\G
*************************** 1. row ***************************
       Function: WEIGHTED_AVERAGE
       sql_mode:
Create Function: CREATE FUNCTION `WEIGHTED_AVERAGE`(n1 INT, n2 INT, n3 INT, n4 INT) 
RETURNS int(11)
    DETERMINISTIC
BEGIN
   DECLARE avg INT;
   SET avg = (n1+n2+n3*2+n4*4)/8;
   RETURN avg;
  END
1 row in set (0.01 sec)
mysql> SHOW FUNCTION STATUS\G
*************************** 1. row ***************************
           Db: test
         Name: WEIGHTED_AVERAGE
         Type: FUNCTION
      Definer: root@localhost
     Modified: 2005-12-07 13:21:37
      Created: 2005-12-07 13:21:37
Security_type: DEFINER
      Comment:
*************************** 2. row ***************************
           Db: test
         Name: WEIGHTED_AVERAGE2
         Type: FUNCTION
      Definer: root@localhost
     Modified: 2005-12-07 13:41:07
      Created: 2005-12-07 13:41:07
Security_type: DEFINER
      Comment:
*************************** 3. row ***************************
           Db: test
         Name: WEIGHTED_AVERAGE3
         Type: FUNCTION
      Definer: root@localhost
     Modified: 2005-12-07 15:51:16
      Created: 2005-12-07 15:51:16
Security_type: DEFINER
      Comment:
*************************** 4. row ***************************
           Db: test
         Name: WEIGHTED_AVERAGE_UPDATE
         Type: FUNCTION
      Definer: root@localhost
     Modified: 2005-12-07 16:03:26
      Created: 2005-12-07 16:03:26
Security_type: DEFINER
      Comment:
4 rows in set (0.00 sec)
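On a server with many routines, the full listing gets long. SHOW FUNCTION STATUS accepts a LIKE pattern or a WHERE clause to narrow it down; the pattern and database name below simply match the examples in this tutorial.

```sql
DELIMITER |

-- Only functions whose name starts with WEIGHTED.
SHOW FUNCTION STATUS LIKE 'WEIGHTED%'|

-- Only functions that live in the test database.
SHOW FUNCTION STATUS WHERE Db = 'test'|
```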

Another way to get the same information is to query the mysql.proc table. As you may know, the mysql database stores all sorts of data about permissions, and you can UPDATE the user or db table to change the MySQL privileges. Since MySQL 5, the mysql.proc table also contains metadata about stored procedures and functions.

mysql> SELECT * FROM mysql.proc\G
*************************** 1. row ***************************
              db: test
            name: WEIGHTED_AVERAGE
            type: FUNCTION
   specific_name: WEIGHTED_AVERAGE
        language: SQL
 sql_data_access: CONTAINS_SQL
is_deterministic: YES
   security_type: DEFINER
      param_list: n1 INT, n2 INT, n3 INT, n4 INT
         returns: int(11)
            body: BEGIN
   DECLARE avg INT;
   SET avg = (n1+n2+n3*2+n4*4)/8;
   RETURN avg;
  END
         definer: root@localhost
         created: 2005-12-07 13:21:37
        modified: 2005-12-07 13:21:37
        sql_mode:
         comment:
*************************** 2. row ***************************
              db: test
            name: WEIGHTED_AVERAGE2
            type: FUNCTION
   specific_name: WEIGHTED_AVERAGE2
        language: SQL
 sql_data_access: CONTAINS_SQL
is_deterministic: YES
   security_type: DEFINER
      param_list: v1 VARCHAR(50)
         returns: int(11)
            body: BEGIN     
                    DECLARE i1,i2,i3,i4,avg INT;    
                    SELECT mark1,mark2,mark3,mark4 INTO i1,i2,i3,i4 FROM sfdata WHERE name=v1;    
                    SET avg = (i1+i2+i3*2+i4*4)/8;    
                    RETURN avg;   
                  END
         definer: root@localhost
         created: 2005-12-07 13:41:07
        modified: 2005-12-07 13:41:07
        sql_mode:
         comment:
*************************** 3. row ***************************
              db: test
            name: WEIGHTED_AVERAGE3
            type: FUNCTION
   specific_name: WEIGHTED_AVERAGE3
        language: SQL
 sql_data_access: CONTAINS_SQL
is_deterministic: YES
   security_type: DEFINER
      param_list: n1 INT,n2 INT,n3 INT,n4 INT,v1 VARCHAR(50)
         returns: int(11)
            body: BEGIN
   DECLARE i1,i2,i3,i4,avg INT;
   INSERT INTO sfdata VALUES(n1,n2,n3,n4,v1);
   RETURN 1;
  END
         definer: root@localhost
         created: 2005-12-07 15:51:16
        modified: 2005-12-07 15:51:16
        sql_mode:
         comment:
*************************** 4. row ***************************
              db: test
            name: WEIGHTED_AVERAGE_UPDATE
            type: FUNCTION
   specific_name: WEIGHTED_AVERAGE_UPDATE
        language: SQL
 sql_data_access: CONTAINS_SQL
is_deterministic: YES
   security_type: DEFINER
      param_list: n1 INT,n2 INT,n3 INT,n4 INT,v1 VARCHAR(50)
         returns: int(11)
            body: BEGIN
   DECLARE i1,i2,i3,i4,avg INT;
   UPDATE sfdata SET mark1=n1,mark2=n2,mark3=n3,mark4=n4 WHERE name=v1;
   RETURN 1;
  END
         definer: root@localhost
         created: 2005-12-07 16:03:26
        modified: 2005-12-07 16:03:26
        sql_mode:
         comment:
4 rows in set (0.00 sec)
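Selecting every column of every routine is rarely needed. A narrower query, sketched below, picks out a few columns for the functions in the test database. It assumes a server that still has the mysql.proc table (it is no longer present in MySQL 8.0) and an account allowed to read it.

```sql
DELIMITER |

-- Assumes mysql.proc exists and the current user may SELECT from it.
SELECT name, param_list, returns, is_deterministic, sql_data_access, created
  FROM mysql.proc
 WHERE db = 'test'
   AND type = 'FUNCTION'
 ORDER BY name|
```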

Note that querying the mysql.proc table returns more complete data than either of the first two methods, effectively returning the sum of both of those methods.
However, people coming from other DBMSs, familiar with the ANSI standard, may be uncomfortable with these MySQL-specific methods. The standard way is to query the INFORMATION_SCHEMA. It is a highly flexible way of getting what you want, but can be overkill, hence MySQL's provision of the simpler SHOW methods. I will leave a more complete explanation of INFORMATION_SCHEMA for another day, as it extends well beyond stored procedures and functions. For now, suffice it to say that you can query INFORMATION_SCHEMA.ROUTINES to get metadata similar to the above, as follows:

mysql> SELECT * FROM INFORMATION_SCHEMA.ROUTINES\G
*************************** 1. row ***************************
     SPECIFIC_NAME: WEIGHTED_AVERAGE
   ROUTINE_CATALOG: NULL
    ROUTINE_SCHEMA: test
      ROUTINE_NAME: WEIGHTED_AVERAGE
      ROUTINE_TYPE: FUNCTION
    DTD_IDENTIFIER: int(11)
      ROUTINE_BODY: SQL
ROUTINE_DEFINITION: BEGIN
   DECLARE avg INT;
   SET avg = (n1+n2+n3*2+n4*4)/8;
   RETURN avg;
  END
     EXTERNAL_NAME: NULL
 EXTERNAL_LANGUAGE: NULL
   PARAMETER_STYLE: SQL
  IS_DETERMINISTIC: YES
   SQL_DATA_ACCESS: CONTAINS SQL
          SQL_PATH: NULL
     SECURITY_TYPE: DEFINER
           CREATED: 2005-12-07 13:21:37
      LAST_ALTERED: 2005-12-07 13:21:37
          SQL_MODE:
   ROUTINE_COMMENT:
           DEFINER: root@localhost
*************************** 2. row ***************************
     SPECIFIC_NAME: WEIGHTED_AVERAGE2
   ROUTINE_CATALOG: NULL
    ROUTINE_SCHEMA: test
      ROUTINE_NAME: WEIGHTED_AVERAGE2
      ROUTINE_TYPE: FUNCTION
    DTD_IDENTIFIER: int(11)
      ROUTINE_BODY: SQL
ROUTINE_DEFINITION: BEGIN
                      DECLARE i1,i2,i3,i4,avg INT;    
                      SELECT mark1,mark2,mark3,mark4 INTO i1,i2,i3,i4 FROM sfdata WHERE name=v1;    
                      SET avg = (i1+i2+i3*2+i4*4)/8;    
                      RETURN avg;   
                    END
     EXTERNAL_NAME: NULL
 EXTERNAL_LANGUAGE: NULL
   PARAMETER_STYLE: SQL
  IS_DETERMINISTIC: YES
   SQL_DATA_ACCESS: CONTAINS SQL
          SQL_PATH: NULL
     SECURITY_TYPE: DEFINER
           CREATED: 2005-12-07 13:41:07
      LAST_ALTERED: 2005-12-07 13:41:07
          SQL_MODE:
   ROUTINE_COMMENT:
           DEFINER: root@localhost
*************************** 3. row ***************************
     SPECIFIC_NAME: WEIGHTED_AVERAGE3
   ROUTINE_CATALOG: NULL
    ROUTINE_SCHEMA: test
      ROUTINE_NAME: WEIGHTED_AVERAGE3
      ROUTINE_TYPE: FUNCTION
    DTD_IDENTIFIER: int(11)
      ROUTINE_BODY: SQL
ROUTINE_DEFINITION: BEGIN
   DECLARE i1,i2,i3,i4,avg INT;
   INSERT INTO sfdata VALUES(n1,n2,n3,n4,v1);
   RETURN 1;
  END
     EXTERNAL_NAME: NULL
 EXTERNAL_LANGUAGE: NULL
   PARAMETER_STYLE: SQL
  IS_DETERMINISTIC: YES
   SQL_DATA_ACCESS: CONTAINS SQL
          SQL_PATH: NULL
     SECURITY_TYPE: DEFINER
           CREATED: 2005-12-07 15:51:16
      LAST_ALTERED: 2005-12-07 15:51:16
          SQL_MODE:
   ROUTINE_COMMENT:
           DEFINER: root@localhost
*************************** 4. row ***************************
     SPECIFIC_NAME: WEIGHTED_AVERAGE_UPDATE
   ROUTINE_CATALOG: NULL
    ROUTINE_SCHEMA: test
      ROUTINE_NAME: WEIGHTED_AVERAGE_UPDATE
      ROUTINE_TYPE: FUNCTION
    DTD_IDENTIFIER: int(11)
      ROUTINE_BODY: SQL
ROUTINE_DEFINITION: BEGIN
   DECLARE i1,i2,i3,i4,avg INT;
   UPDATE sfdata SET mark1=n1,mark2=n2,mark3=n3,mark4=n4 WHERE name=v1;
   RETURN 1;
  END
     EXTERNAL_NAME: NULL
 EXTERNAL_LANGUAGE: NULL
   PARAMETER_STYLE: SQL
  IS_DETERMINISTIC: YES
   SQL_DATA_ACCESS: CONTAINS SQL
          SQL_PATH: NULL
     SECURITY_TYPE: DEFINER
           CREATED: 2005-12-07 16:03:26
      LAST_ALTERED: 2005-12-07 16:03:26
          SQL_MODE:
   ROUTINE_COMMENT:
           DEFINER: root@localhost
4 rows in set (0.00 sec)
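Because INFORMATION_SCHEMA.ROUTINES is queried like any other table, you can restrict both the rows and the columns. A minimal sketch, assuming the functions live in the test database as in the output above:

```sql
DELIMITER |

-- Only the stored functions in the test database, with a handful of columns.
SELECT ROUTINE_NAME, DTD_IDENTIFIER, IS_DETERMINISTIC, SQL_DATA_ACCESS, CREATED
  FROM INFORMATION_SCHEMA.ROUTINES
 WHERE ROUTINE_SCHEMA = 'test'
   AND ROUTINE_TYPE = 'FUNCTION'
 ORDER BY ROUTINE_NAME|
```

Changing ROUTINE_TYPE to 'PROCEDURE' lists stored procedures in the same way.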

The MySQL documentation gives a complete overview about the INFORMATION_SCHEMA structure, and some MySQL oddities, if you want to pursue that further.

Conclusion

Stored procedures and stored functions open a whole new world to MySQL developers, and mean that MySQL is starting to attract attention from developers of entirely new types of applications. While the implementation in MySQL 5.0 is still raw (MySQL 5.1 is out in alpha now, and develops things further, as will MySQL 6.0), most of what is needed is already there. Not all of my applications are running on MySQL 5 yet, but the itch to move everything gets stronger every time I do further development and have to make do with more unwanted logic in the application. I came across a query recently from one poor developer bemoaning the lack of features in MySQL 3.23. Upgrading legacy systems is not fun, but for those of you lucky enough to start with a clean slate, enjoy the new world of MySQL 5, stored procedures and stored functions!
