Hierarchical queries in MySQL, Managing tree like structures
I was wondering how to implement a hierarchical query in MySQL (using the ancestry chains version) for a single row, such that it picks up the parents (if any) and any children (if any).The idea is, I want to be able to jump in at any point, provide an Id of some sort, and be able to draw out the entire hierarchy for that Id, both upwards and downwards.
We need to combine two queries here:
- Original hierarchical query that returns all descendants of a given
id
(a descendancy chain) - A query that would return all ancestors of a given
id
(an ancestry chain)
An
id
can have only one parent
, that’s why we can employ a linked list technique to build an ancestry chain, like shown in this article:
Here’s the query to to this (no functions required):
SELECT @r AS _id, ( SELECT @r := parent FROM t_hierarchy WHERE id = _id ) AS parent, @l := @l + 1 AS lvl FROM ( SELECT @r := 1218, @l := 0, @cl := 0 ) vars, t_hierarchy h WHERE @r <> 0
To combine two queries, we can employ a simple
UNION ALL
.
The only problem that is left to preserve the correct
level
, since the ancestry chain query conts level
backwards, and the hierarchical query will count it starting from zero.Let’s create a sample table and see what we get:
CREATE
TABLE
t_hierarchy (
02.
id
int
(10) unsigned
NOT
NULL
AUTO_INCREMENT,
03.
parent
int
(10) unsigned
NOT
NULL
,
04.
PRIMARY
KEY
(id),
05.
KEY
ix_hierarchy_parent (parent, id)
06.
) ENGINE=InnoDB
DEFAULT
CHARSET=utf8;
07.
08.
DELIMITER $$
09.
CREATE
PROCEDURE
prc_fill_hierarchy (
level
INT
, fill
INT
)
10.
BEGIN
11.
DECLARE
_level
INT
;
12.
DECLARE
_fill
INT
;
13.
INSERT
14.
INTO
t_hierarchy (id, parent)
15.
VALUES
(1, 0);
16.
SET
_fill = 0;
17.
WHILE _fill < fill DO
18.
INSERT
19.
INTO
t_hierarchy (parent)
20.
VALUES
(1);
21.
SET
_fill = _fill + 1;
22.
END
WHILE;
23.
SET
_fill = 1;
24.
SET
_level = 0;
25.
WHILE _level <
level
DO
26.
INSERT
27.
INTO
t_hierarchy (parent)
28.
SELECT
hn.id
29.
FROM
t_hierarchy ho, t_hierarchy hn
30.
WHERE
ho.parent = 1
31.
AND
hn.id > _fill;
32.
SET
_level = _level + 1;
33.
SET
_fill = _fill + POWER(fill, _level);
34.
END
WHILE;
35.
END
36.
$$
37.
DELIMITER ;
38.
39.
DROP
FUNCTION
IF EXISTS hierarchy_connect_by_parent_eq_prior_id;
40.
41.
DELIMITER $$
42.
43.
CREATE
FUNCTION
hierarchy_connect_by_parent_eq_prior_id(value
INT
)
RETURNS
INT
44.
NOT
DETERMINISTIC
45.
READS SQL DATA
46.
BEGIN
47.
DECLARE
_id
INT
;
48.
DECLARE
_parent
INT
;
49.
DECLARE
_next
INT
;
50.
DECLARE
CONTINUE
HANDLER
FOR
NOT
FOUND
SET
@id =
NULL
;
51.
52.
SET
_parent = @id;
53.
SET
_id = -1;
54.
55.
IF @id
IS
NULL
THEN
56.
RETURN
NULL
;
57.
END
IF;
58.
59.
LOOP
60.
SELECT
MIN
(id)
61.
INTO
@id
62.
FROM
t_hierarchy
63.
WHERE
parent = _parent
64.
AND
id > _id;
65.
IF @id
IS
NOT
NULL
OR
_parent = @start_with
THEN
66.
SET
@
level
= @
level
+ 1;
67.
RETURN
@id;
68.
END
IF;
69.
SET
@
level
:= @
level
- 1;
70.
SELECT
id, parent
71.
INTO
_id, _parent
72.
FROM
t_hierarchy
73.
WHERE
id = _parent;
74.
END
LOOP;
75.
END
76.
$$
77.
78.
DELIMITER ;
79.
80.
START
TRANSACTION
;
81.
CALL prc_fill_hierarchy(6, 5);
82.
COMMIT
;
Now, let’s try to
UNION ALL
the queries as is:SELECT CONCAT(REPEAT(' ', lvl - 1), _id) AS treeitem, parent, lvl AS level FROM ( SELECT @r AS _id, ( SELECT @r := parent FROM t_hierarchy WHERE id = _id ) AS parent, @l := @l + 1 AS lvl FROM ( SELECT @r := 1218, @l := 0, @cl := 0 ) vars, t_hierarchy h WHERE @r <> 0 ORDER BY lvl DESC ) qi UNION ALL SELECT CONCAT(REPEAT(' ', level - 1), CAST(hi.id AS CHAR)), parent, level FROM ( SELECT hierarchy_connect_by_parent_eq_prior_id(id) AS id, @level AS level FROM ( SELECT @start_with := 1218, @id := @start_with, @level := 0 ) vars, t_hierarchy WHERE @id IS NOT NULL ) ho JOIN t_hierarchy hi ON hi.id = ho.id
We see that the hierarchical order is mangled: the first resultset is upside down, the second one is starting from
level = 1
.
To fix it, we need to change the code that calculates level a little.
First, we need to reverse the ancestry part.
This can be easily done by sorting it on
lvl DESC
:SELECT CONCAT(REPEAT(' ', level - 1), _id) AS treeitem, parent, level FROM ( SELECT @r AS _id, ( SELECT @r := parent FROM t_hierarchy WHERE id = _id ) AS parent, @l := @l + 1 AS level FROM ( SELECT @r := 1218, @l := 0, @cl := 0 ) vars, t_hierarchy h WHERE @r <> 0 ORDER BY level DESC ) qi
We now have it in correct order but with wrong
level
values.
Since a
level
is essentially a rownum
here, we can just calculate it as a rownum instead:SELECT CONCAT(REPEAT(' ', level - 1), id) AS treeitem, parent, level FROM ( SELECT _id AS id, parent, @cl := @cl + 1 AS level FROM ( SELECT @r AS _id, ( SELECT @r := parent FROM t_hierarchy WHERE id = _id ) AS parent, @l := @l + 1 AS level FROM ( SELECT @r := 1218, @l := 0, @cl := 0 ) vars, t_hierarchy h WHERE @r <> 0 ORDER BY level DESC ) qi ) qo
We disregard the previously selected
level
at all, leaving it only inside the inline view qi
for ordering purposes.
Now, we need to merge the descendancy tree query, but with
level
s starting from 6
, not from 1
.
Since this query will come after the ancestry one, we can use accumulated value of
@cl
(which we used to calculate level in the previous query) as a seed.
To do that, we can take the
level
returned by the descendancy tree query, and just add @cl
to it.
The only problem it to determine when we should increment
@cl
and when we should add it to level
.
We will just use a boolean field which will help us to tell the sets apart.
Here’s the query to do it:
SELECT CONCAT(REPEAT(' ', level - 1), CAST(id AS CHAR)), parent, level FROM ( SELECT id, parent, IF(ancestry, @cl := @cl + 1, level + @cl) AS level FROM ( SELECT TRUE AS ancestry, _id AS id, parent, level FROM ( SELECT @r AS _id, ( SELECT @r := parent FROM t_hierarchy WHERE id = _id ) AS parent, @l := @l + 1 AS level FROM ( SELECT @r := 1218, @l := 0, @cl := 0 ) vars, t_hierarchy h WHERE @r <> 0 ORDER BY level DESC ) qi UNION ALL SELECT FALSE, hi.id, parent, level FROM ( SELECT hierarchy_connect_by_parent_eq_prior_id(id) AS id, @level AS level FROM ( SELECT @start_with := 1218, @id := @start_with, @level := 0 ) vars, t_hierarchy WHERE @id IS NOT NULL ) ho JOIN t_hierarchy hi ON hi.id = ho.id ) q ) q2
One more thing: use this if extremely necessary, and for small trees,, some times it crashes the Mysql server,
im still working on some more efficient solution.
Till Happy coding :)
Till Happy coding :)
Hi There
ReplyDeleteThank you for an excellent article, it is more or less what I was looking for, however I only needed the very first query you posted on Sorting Lists for use in my Application.
But I have come across a problem and hope you may help.
I need to query a DB that has a Hierachical Structure, See Below: Table: xmlgames1
id | GameID | ParentGameID
11 | 570 | 1
13 | 5157 | 570
20 | 7878 | 5175
67 | 8252 | 7878
When I try out your query above as follows:
SELECT @r AS _id,
(
SELECT @r := ParentGameID
FROM xmlgames1
WHERE GameID = _id
) AS parent,
@l := @l + 1 AS lvl
FROM (
SELECT @r := 8252,
@l := 0,
@cl := 0) vars, xmlgames1 h
WHERE @r <> 0
The query returns an Error: [Err] 1242 - Subquery returns more than 1 row
Can you explain what I am doing wrong?