**************
  Data Flows
**************

.. contents:: Table of Contents
   :depth: 1
   :local:

A central concept in HTSQL is *data flow*, a sequence of homogeneous
values.

HTSQL is a data flow transformation language.  Every HTSQL expression
operates on flows; that is, it accepts an *input flow* and transforms it
into an *output flow*.

(diagram: input flow -> expression -> output flow)

The initial input flow consists of a single empty record.  HTSQL
processor interprets the given HTSQL query as a sequence (or rather a
directed graph) of flow operations, which it applies one by one.  The
resulting flow is then displayed to the user in a tabular form.

Different operations affect the flow in various ways: multiply it, or
remove elements from it, apply a scalar function to each element, etc.
In the next sections, we discuss different types of flow operations.


Scalar Expressions
==================

A simplest example of a flow operation is an application of some
expression to each value in a flow.  The output flow consists of a
results of the expression.

That kind of expression does not change the number of elements in the
flow; we call such expressions *scalar*.

.. htsql:: /(3+4)*6

In this example, a scalar expression ``(3+4)*6`` is applied to the
initial flow; the value of this expression forms the resulting flow.

(diagram: [] -> (3+4)*6 -> 42)

.. htsql:: /school{code, count(department)}
   :cut: 4

In this example, two scalar expressions ``code`` and
``count(department)`` are applied to the flow consisting of *school*
records.  For each school entity, they extract the value of the
attribute ``code`` and the number of associated departments.

(diagram: [art], [bus], [edu], [eng], ...
    -> {'art',2}, {'bus',3}, {'edu',2}, {'eng',4}, ...)

A scalar expression is an example of a *singular* expression; one which
does not increase the number of elements in the flow, as opposed to a
*plural* expression, which may produce more output elements than in the
input flow.


Navigation
==========

Navigation is an operation of selecting the initial record class or
traversing a link.

When used in the root scope, a class name produces a flow of all records
from the class.

.. htsql:: /school
   :cut: 4

(diagram: [] -> (school) -> [art], ...)

In a class scope, the name of a link produces the flow consisting of
associated records from the target class.

.. htsql:: /school.department
   :cut: 4

(diagram: [] -> (school) -> [art], ... -> (department) -> [arthis], ...)

As in the previous example, ``school`` generates a flow of *school*
records.  Then we traverse a link ``school.department``.  That each, for
each school record in the input flow, we find the associated
*department* records, the output flow consists of all *department*
records combined.

A traversal operation is singular or plural depending on whether the
respective link is singular or plural.


Filtering
=========

A *sieve* expression filters the input flow leaving only those elements
which satisfy the given condition.

A sieve expression takes one argument: a scalar logical expression
called the *filter*.  It applies the filter to each element of the input
flow.  The output flow consists of those elements of the input flow for
which the filter is evaluated to *TRUE* value.

.. htsql:: /school?count(department)>3

(diagram)

In this example, the sieve expression evaluates a filter condition
``count(department)>3`` for each record from the *school* class; those
records for which this condition is valid generate the output of the
query.

A *sort* expression reorders elements in the flow according to a given
argument.

.. htsql:: /school.sort(name+)
   :cut: 4

(diagram)

In this example, the *school* records are ordered in the ascending order
with respect to the value of ``name`` attribute.

A *truncation* operation makes a slice of the input flow.

.. htsql:: /school.limit(3)

(diagram)

In this case, we take the top 3 records from the *school* class.


Aggregates
==========

An aggregate function converts a plural expression into a scalar.

The argument of an aggregate function must be a plural expression.  Then
for each element of the input flow, the aggregate evaluates the
respective sub-flow and applies a set function to the result to generate
a scalar value.

.. htsql:: /count(school)

(diagram)

In this example, ``count()`` aggregates produces the number of elements
in the flow generated by expression ``school``.

.. htsql:: /department{code, max(course.credits)}
   :cut: 4

(diagram)

In this example, ``max(course.credits)`` starts with evaluating the flow
``department.course.credits``.  Then for each *department* record of the
input flow, ``max()`` finds the maximum value in the respective
sub-flow.


Projection
==========

A projection expression takes a scalar argument called the *kernel*.
The output flow of projection consists of all unique values of the
kernel as it runs over the input flow.

.. htsql:: /school^campus

(diagram)

The output of this query consists of all distinct values of
`school.campus` attribute.


