Tricky sorting technique in MySQL

Recently, I faced an interesting challenge and I want to share my solution with the readers using a simple example. Suppose we have a list of TODO items. Each entry in the list may have a due date, the priority and creation date.

Now, suppose that you need to get the records from the database, sorted this way – first, show records with due date in ascending order, second, show records with no due date ordered by priority from higher to lower, and finally, show records that do not have due date and priority and sort them by date created. The first thing that comes to mind is to write this query:

SELECT * FROM todo WHERE ... ORDER BY due_date asc, priority desc, create_date asc

It would seem that’s all, the problem is solved, but there’s a catch – the first field (due_date) is not compulsory and that’s what we get as a result of this query:

As you can see we have first records with no due date, but we wanted to show records with most recent due dates first. Obviously reversing sorting order from ASC to DESC will not help. Usually in such cases, a UNION of two SELECT queries is used. In the first query we select only the records with the due date, in the second – on the contrary, only the records with no due date. Something like this:

SELECT * FROM todo WHERE due_date IS NOT NULL ... ORDER BY due_date asc
SELECT * FROM todo WHERE due_date IS NULL ... ORDER BY priority desc, create_date asc

Does everyone agree with this solution? Don’t you want to look at it one more time? Okay, I’ll put you out of your misery; the query does not work for the simple reason that when you use ORDER BY with UNION it is applied to all records in a result set. In this case, you might say, UNION does not solve our problem, because the result would be no different from the first query and you will be right. However, UNION does gives us the ability to change the result in a simple way. What didn’t we like in the first query? The fact that the records without a due date were at the beginning. Besides, why did they go to the top? Because we sorted in ascending order by column allowing NULLs, and according to documentation NULL values are presented first if you do ORDER BY. What should you do to move these records to the bottom? That’s right, to make sure that we use some other very big value instead of NULL. According to the manual the maximum value of the datetime field is ’9999-12-31 23:59:59′, which we will use. We now have the following query

SELECT priority, description, due_date, due_date AS sort_due_date, create_date FROM todo
WHERE due_date IS NOT NULL ...
SELECT priority, description, due_date, '9999-12-31 23:59:59' AS sort_due_date, create_date FROM todo
WHERE due_date IS NULL ...
ORDER BY sort_due_date ASC, priority DESC, create_date ASC

and the result looks like this:

This query keeps the original due_date column intact, and sorting is performed by using the synthetic column ‘sort_due_date’. We need this in order not to distort the real data, we do not want to confuse ourselves and show the user the date of the end of the (MySQL) world? The problem is solved. I don’t know about you, but I love the elegant solutions, and this solution is definitely not elegant. Do you want to know what the elegant version of the solution is? Then read on.

In the last few months, we have been working with CodeIgniter and previously we worked mostly with ZendFramework. So, in CI and ZF, and in many other frameworks the ActiveRecord pattern is widely used. This means that the first SQL query we wrote is created this way:

$this->db_app->select('priority, description, due_date, create_date');
$this->db_app->where( ... );
$this->db_app->order_by('due_date asc, priority desc, create_date asc');

And yet, I have not found the capabilities in CI to build UNION queries using this pattern. If you know how, I would appreciate your input. If a WHERE clause in the query is simple, in principle, we can form the first request and compile it. Then we can create the second request using the same method and compile it, and finally connect them with UNION manually and run it. Often, however, the conditions in the query are formed depending on the parameters passed, and the logic can be quite daunting. As a result, we get a strong code duplication. However, there are no hopeless situations, and here a very handy feature of MySQL helps us – the use of conditional logic in the ORDER BY construction. It works as follows:

SELECT priority, description, due_date, create_date
FROM todo WHERE ...
ORDER BY IFNULL(due_date, '9999-12-31 23:59:59') ASC, priority DESC, create_date ASC

That’s what I call an elegant solution! Those who disagree with me please write in the comments, I will remove them anyway :) . We do not duplicate the code because we have one request. We just need the right way to create ORDER BY conditions. Unfortunately CI has some problems with it. If you try (and I have tried, no doubt) to set sorting using Active Record this way:

$this->db_app->order_by("IFNULL(due_date, '9999-12-31 23:59:59') ASC, priority DESC, create_date ASC");

Then you, like me, will be disappointed. The super smart framework will split the passed string and will surround the first word with the quotation marks. As a result, a request will be completed with an error. By the way, if anyone knows how to ask CI not to do it, please let me know. Finally, we are left with the only workaround – to add an order condition to a query without using the Active Record, ie manually. Yes, it violates our beautiful code a little, but in my opinion, it is a lesser evil. I did it like this:

$this->db_app->select('priority, description, due_date, create_date');
$this->db_app->where( ... );
$sql = $this->db_app->_compile_select();
$sql .= " ORDER BY IFNULL(due_date, '9999-12-31 23:59:59') ASC, priority DESC, create_date ASC";
$result = $this->db_app->query($sql)->result_array();

The attentive reader will notice one inconsistency here – the method is called _compile_select(), defined in the class CI_DB_active_record as protected. I confess, though I do not like to change the code of any good framework, but I saw no other way to generate a query without executing it but to make this method public. Again – let us know how to do it better. That’s it – we shared what we learned and are looking forward to your comments.

Finally, a little note. Column’s due_date default values is NULL, in our example. However this is not always the case and sometimes you can find, for example, ’0000-00-00′ as a default value. How to form ORDER BY statement in this case? You can use CASE expression this way:

SELECT priority, description, due_date, create_date
FROM todo WHERE ...
  CASE WHEN due_date ='0000-00-00' THEN '9999-12-31 23:59:59'
    ELSE due_date
  END ASC, priority DESC, create_date ASC
Print this post | Home

One comment

  1. Анастася says:

    Спасибо вам, жизнь просто спасли!