转载:generate_series: an Oracle implementation in light of SQL Design Patterns - The Spatial Database Advisor
# GENERATE_SERIES
There is a very useful function in PostgreSQL called generate_series that can be used to generate a series of integer numbers from some start value to an end value with an optional step value.
Here is the function and its description from the PostgreSQL help.
Function | Argument Type | Return Type | Description |
---|---|---|---|
generate_series(start, stop) | int or bigint | setof int or setof bigint (same as argument type) | Generate a series of values, from start to stop with a step size of one |
generate_series(start, stop, step) | int or bigint | setof int or setof bigint (same as argument type) | Generate a series of values, from start to stop with a step size of step |
There are a number of ways we can code this function in Oracle. Since the original function is a “set returning function”, we need to code generate_series so that it returns a table of numbers:
1 | TYPE t_numbers IS TABLE OF number; |
The most efficient way to do this is via a PIPELINED function, so that is what I will code.
1 | create or replace |
Now, to run the tests on the PostgreSQL help page:
Let’s start with a simple, additional, example not on the page.
1 | select g.column_value as generate_series |
# Alternative Table Function
Now all this is very good, but there is some debate as to implementing a series of integers in this way.
Vadim Tropashko, in his excellent book, “SQL Design Patterns, The Expert Guide to SQL Programming”, Rampart Press has a whole chapter (2) devoted to “Integer Generators in SQL”. In this chapter, Vadmin presents an coding of a simple Integer generating table function called “Integers”. His coding is as follows.
1 | create or replace function Integers |
We will now use this function to implement the PostgreSQL help examples above.
Firstly, generating numbers between 1 and 5.
1 | select rownum as rin |
I won’t code the invalid series. So let’s move on to the negative series.
1 | select rin |
# Hierarchical Queries
Vadim goes on in his book to describe the use of hierarchical queries for generating integer series. Here is an example of how to generate all even numbers between 5 and 8.
1 | elect level |
I am a big fan of hierachical queries and have used them a lot in my work but mainly in the area of generating sample data (see other articles in this blog for examples). My personal view is that coding integer ranges with the “connect by level” hierarchical query is simpler to use and code than the Integers() function. However, the “connect by level” usage above does not work in versions of Oracle before 10g. So all you 9i users will have to resort to coding a suitable table function.
# Application to Oracle Spatial Data Processing
Since 11g, Oracle has included a function called SDO_UTIL.EXTRACT that can be used to extract the elements of an SDO_GEOMETRY object. For example, it can be used to extract the linestrings in a multilinestring object or the polygons in a multipolygon object.
In addition, Oracle has a function called SDO_UTIL.GetNumElem which can return the number of polygons/linestrings in a multipolygon/multilinestring.
I will show how to use both the generate_series and CONNECT BY LEVEL approaches to extracting single geometries from a multigeometry.
CONNECT BY LEVEL
1 | WITH mGeom AS ( |
That query, I think, is unnecessarily complicated. Let’s try generate_series.
1 | WITH mGeom AS ( |
Both work nicely, but I rather prefer the cleanliness of the generate_series query.
# Conclusion
Tropashko’s preference (as also Mikito Harakiri) in coding an integer series generation table function is to code one without parameters, such as Integers(), and not like generate_series. The main reason is that he prefers the use of predicates in SQL SELECT statements (such as “rownum < = 1000”) than parameters passed to a function. This is because they are at a “higher abstraction level”, claiming that “programs with predicate expressions are shorter and cleaner”. Because I am a “SQL man”, preferring a single SQL statement to complex PL/SQL, and because I like things to be done “orthogonally”, “theoretically” and via patterns/templates I am tempted to agree with Tropashko. However, when I look at the SQL above, I find the generate_series implementation much, much cleaner and less complicated.
The problem with the Integers() function is that it forces me to “reinvent the wheel” each time I need a series of integers. With simple series the SQL Is not too bad. But more complicated series create much more complicated SQL. Now, many of my uses of SQL result in large SQL statements with can result in many uses of generate_series. To integrate multiple calls to the same series generation can be neatly encapsulated using the WITH construct, but even so it does add, to my way of thinking, unnecessary complicated. And complication usually means lots of errors and longer debugging.
In the end a generate*series function like above allows us to design and algorithm, code and test it and then use it many times without having to remember how to code an integer series every time we need it. This is part of good software engineering: the algorithm is *“encapsulated”_ in a program unit that offers a stable interface. And interfaces are everything.
I hope that this is of use to someone.