
Criteria

Useful sources:
http://www.hibernate.org/hib_docs/v3/api/org/hibernate/Criteria.html
http://www.hibernate.org/hib_docs/reference/en/html/querycriteria.html
http://propel.phpdb.org/trac/wiki/Development/Criteria

The Criteria API will be built on top of the existing Query API. Internally it will use a Query object to execute queries and return results, so the resulting call chain will be:

Doctrine_Criteria -> Doctrine_Query -> Doctrine_Query_Parser, Doctrine_Hydrator, etc.

The important point is that the dependency must be unidirectional: Doctrine_Criteria API -> Doctrine Core.

The following pages will describe the API related to each new component.

Doctrine_Query API

Doctrine_Criteria API

DQL

Useful source: http://edocs.bea.com/kodo/docs41/full/html/ejb3_langref.html

Main differences between JPQL and DQL are as follows:
* JPQL supports the syntax: select kitten from Cat kitten ; DQL does not (should we?). DQL equivalent: select kitten.* from Cat kitten
* JPQL defaults to an INNER JOIN with the following syntaxes: ... from Foo f join f.bar ; ... from Foo f, f.bar ; DQL defaults to a LEFT [OUTER] JOIN here. [Should we change this? Reasons to do it: 1) Similarity with SQL, so the transition from SQL is more intuitive; 2) Similarity with other OQLs such as JPQL, so the transition from these is also more intuitive.]
* In DQL, all JOINs are "fetch joins", meaning the joined data in the result set is hydrated into objects. This DQL: "from Employee emp join emp.company comp" is equivalent to this JPQL: "from Employee emp join fetch emp.company comp".

A note on Doctrine_Table and the new Doctrine_ClassMetadata:
The Doctrine_Table concept has been replaced by a new metadata concept that is all about classes: their relations, their mapping to the relational schema and all other metadata about a class. That means that at runtime the metadata of each entity class (record class) in use is stored in a ClassMetadata instance. This information includes:
- The name of the class
- The relations of the class
- The names of all subclasses & parent classes (only entities)
- The mapping information of properties (column name <-> fieldname(aka column alias))
- The table name the class is mapped to
- and much more ...

Basically, such an instance has all the information that previously resided on a Doctrine_Table object. Due to the huge scope of this refactoring, the term "table" has not yet been replaced everywhere. For example, the 'table' entry in the queryComponents references a ClassMetadata instance. Other examples are the getTable() methods, which also return instances of ClassMetadata.
[romanb: I think we should leave that as is for now and implement the currently provided functionality into the parser. Once we can merge a first version of the new parser to trunk, we can continue these refactorings. Everything else would simply result in a too large maintenance burden for the parser branch. Opinions?]

SQL Generation:
During the parsing process, the parser needs to produce SQL. The current Doctrine_Expression_* family of classes can be extended to handle DBMS-specific SQL generation in general (instead of only generating SQL for expressions, as it does now). The name of this new family of classes will be: Doctrine_SqlBuilder_XXX.

[guilhermeblanco: NO! Expressions are part of the DBAL and SqlBuilder is part of the ORM. We cannot make this change... SQL abstraction should always be placed in the DBAL and never in the ORM. We need to define where we will keep this stuff. Some agree that it should change, others do not... I am +1 to keep it as Expression]

The parser should recognize the following queries (currently it does not):

FROM User INDEX BY name

SELECT u.name FROM (SELECT u.name, p.phonenumber FROM User u LEFT JOIN u.Phonenumber p) INDEX BY name

FROM User u INDEX BY name LEFT JOIN u.Phonenumber p INDEX BY phonenumber

FROM User u INDEX BY name LEFT JOIN u.Phonenumber p ON p.id > 5 INDEX BY phonenumber

FROM User u INDEX BY name LEFT JOIN u.Phonenumber p WITH p.id > 5 INDEX BY phonenumber

FROM User LIMIT -mathExpression- OFFSET -mathExpression-

FROM User u WHERE u.id IN (FROM IdBag)

Aggregate functions should only be allowed in SELECT and HAVING clauses.
[romanb: This might be a problem, because not all DBMS allow aliases in the GROUP BY clause, e.g. Oracle. See ticket #565. On the other hand, if I understood the BNF correctly, JPQL enforces this restriction, too (http://edocs.bea.com/kodo/docs41/full/html/ejb3_langref.html#ejb3_langref_group). A good solution might be a strategy-based approach to SQL generation. That means we have a "default strategy" that generates SQL that works on most DBMS, and the "oracle sql generator" could just override the creation of the SQL ORDER BY clause to place the full aggregate function there (i.e. GROUP BY COUNT(v2foo)).]

Building the queryComponents array (needed for hydrating result sets):

FROM User u INDEX BY name LEFT JOIN u.Phonenumber p

Hydrator->queryComponents: array('u' => array('table' => Object(Doctrine_ClassMetadata), 'relation' => null, 'parent' => null, 'agg' => null, 'map' => 'name'),

'p' => array('table' => Object(Doctrine_ClassMetadata), 'relation' => Object(Doctrine_Relation), 'parent' => 'u', 'agg' => null, 'map' => null))

For each PathExpression the parsing goes as follows:

1. Explode by dot.
2. If the queryComponents array is empty, save the PathExpression as the root component (in the example its alias is 'u').
3. If it is not empty, start with the first component and retrieve the data associated with it from the queryComponents array.
4. Check whether the table has a relation to the next component (in the example 'Phonenumber'): table->hasRelation('xx').
5. Retrieve the relation object with getRelation() and save the component to the queryComponents array.
6. If the PathExpression has more than two components, e.g. User.Group.Phonenumber, repeat steps 4 and 5.
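The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: SketchMetadata and SketchRelation are hypothetical stand-ins for Doctrine_ClassMetadata and Doctrine_Relation, resolvePath() is an invented helper, and the 'agg' and 'map' entries are left out for brevity.

```php
<?php
// Hypothetical stand-ins, reduced to what the six steps need.
// These are NOT the real Doctrine classes.
class SketchRelation
{
    public $target;
    public function __construct(SketchMetadata $target) { $this->target = $target; }
}

class SketchMetadata
{
    public $name;
    private $relations = array();
    public function __construct($name) { $this->name = $name; }
    public function addRelation($alias, SketchMetadata $target)
    {
        $this->relations[$alias] = new SketchRelation($target);
    }
    public function hasRelation($alias) { return isset($this->relations[$alias]); }
    public function getRelation($alias) { return $this->relations[$alias]; }
}

/**
 * Resolves one path expression (e.g. "u.Phonenumber") and registers its last
 * component under $alias in $queryComponents.
 */
function resolvePath($path, $alias, array &$queryComponents, array $classes)
{
    $parts = explode('.', $path);                          // step 1
    $first = array_shift($parts);
    if (empty($queryComponents)) {                         // step 2: root component
        // A lone root takes the given alias, otherwise its class name.
        $parent = empty($parts) ? $alias : $first;
        $queryComponents[$parent] = array(
            'table' => $classes[$first], 'relation' => null, 'parent' => null,
        );
    } else {                                               // step 3
        if (!isset($queryComponents[$first])) {
            throw new InvalidArgumentException("Unknown component alias '$first'");
        }
        $parent = $first;
    }
    foreach ($parts as $i => $name) {                      // step 6: repeat 4 and 5
        $table = $queryComponents[$parent]['table'];
        if (!$table->hasRelation($name)) {                 // step 4
            throw new InvalidArgumentException("{$table->name} has no relation '$name'");
        }
        $relation = $table->getRelation($name);            // step 5
        // Only the last component receives the alias given in the query.
        $key = ($i === count($parts) - 1) ? $alias : $name;
        $queryComponents[$key] = array(
            'table' => $relation->target, 'relation' => $relation, 'parent' => $parent,
        );
        $parent = $key;
    }
}

// FROM User u LEFT JOIN u.Phonenumber p
$user = new SketchMetadata('User');
$phone = new SketchMetadata('Phonenumber');
$user->addRelation('Phonenumber', $phone);
$classes = array('User' => $user, 'Phonenumber' => $phone);

$queryComponents = array();
resolvePath('User', 'u', $queryComponents, $classes);
resolvePath('u.Phonenumber', 'p', $queryComponents, $classes);
```

After both calls, $queryComponents holds the entries 'u' (root, no parent) and 'p' (parent 'u'), mirroring the example array shown earlier.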

Constructing and storing sql table aliases:

Theory: Not all databases support long aliases (for example, Oracle has a 30-character limit on identifiers), hence we must generate short aliases.

For example, a table named user would get the alias u.

The aliases can be generated using the getSqlTableAlias() method of the old Doctrine_Query_Abstract class.
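As an illustration of what short alias generation could look like, here is a minimal sketch. generateShortAlias() is a hypothetical helper written for this page, not the actual getSqlTableAlias() implementation; the scheme (first letter, then first letter plus a counter on collision) is an assumption.

```php
<?php
/**
 * Hypothetical helper: returns a short SQL alias for $tableName that is not
 * yet a key of $taken, and records it there.
 * Scheme (an assumption): first letter, then first letter + counter.
 */
function generateShortAlias($tableName, array &$taken)
{
    $base = strtolower($tableName[0]);
    $alias = $base;
    // On collision, try u2, u3, ... until a free alias is found.
    for ($n = 2; isset($taken[$alias]); $n++) {
        $alias = $base . $n;
    }
    $taken[$alias] = $tableName;
    return $alias;
}

$taken = array();
$a = generateShortAlias('user', $taken);
$b = generateShortAlias('usergroup', $taken);
$c = generateShortAlias('group', $taken);
```

With this scheme the three tables receive the aliases u, u2 and g, which stay well below any identifier length limit.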

Identifier quoting:

All identifiers must be run through the Doctrine_Connection::quoteIdentifier() method. For example, u.name would become `u`.`name` on MySQL (when the identifier quoting attribute is turned on).

Let's say we have an SQL table called user whose associated class is called User. The following query

SELECT u.name FROM User u

should produce the SQL:

SELECT `u`.`name` AS `uname` FROM `user` `u`

All column aliases should be converted to their associated SQL column names using Doctrine_Table::getColumnName().
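A minimal sketch of MySQL-style quoting applied to the example query. quoteIdentifierSketch() is a hypothetical stand-in written for illustration; it does not reproduce the behavior of Doctrine_Connection::quoteIdentifier(), and the $quotingEnabled flag merely imitates the identifier quoting attribute.

```php
<?php
/**
 * Hypothetical stand-in for MySQL-flavoured identifier quoting: each
 * dot-separated part is wrapped in backticks, embedded backticks are doubled.
 */
function quoteIdentifierSketch($identifier, $quotingEnabled = true)
{
    if (!$quotingEnabled) {
        return $identifier;
    }
    $parts = array();
    foreach (explode('.', $identifier) as $part) {
        $parts[] = '`' . str_replace('`', '``', $part) . '`';
    }
    return implode('.', $parts);
}

// SQL for "SELECT u.name FROM User u", given table "user" and table alias "u".
$sql = 'SELECT ' . quoteIdentifierSketch('u.name')
     . ' AS ' . quoteIdentifierSketch('uname')
     . ' FROM ' . quoteIdentifierSketch('user') . ' ' . quoteIdentifierSketch('u');
```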

Building of SQL joins:

The joins depend greatly on the underlying relation. The relation fields can be retrieved from the Relation object.

User <- one-to-many -> Phonenumber

FROM User u LEFT JOIN u.Phonenumber p

-> SELECT .. FROM user u LEFT JOIN phonenumber p ON u.id = p.user_id

User <- many-to-many -> Group

FROM User.Group

FROM User u LEFT JOIN u.Group g

FROM User LEFT JOIN User.Group

-> SELECT .. FROM user u LEFT JOIN usergroup u2 ON u.id = u2.user_id LEFT JOIN group g ON g.id = u2.group_id
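The two join shapes above could be produced by something like the following sketch. The array keys used to describe a relation ('table', 'local', 'foreign', 'refTable', 'refLocal', 'refForeign') are assumptions made for this example and not the actual Doctrine_Relation interface; buildLeftJoin() is a hypothetical helper.

```php
<?php
/**
 * Builds the LEFT JOIN clause(s) for one relation description.
 * The relation array layout is an assumption made for this sketch.
 */
function buildLeftJoin(array $rel, $parentAlias, $alias, $refAlias = null)
{
    if (!isset($rel['refTable'])) {
        // one-to-many / one-to-one: join the related table directly
        return sprintf('LEFT JOIN %s %s ON %s.%s = %s.%s',
            $rel['table'], $alias,
            $parentAlias, $rel['local'], $alias, $rel['foreign']);
    }
    // many-to-many: join the reference table first, then the related table
    return sprintf('LEFT JOIN %s %s ON %s.%s = %s.%s LEFT JOIN %s %s ON %s.%s = %s.%s',
        $rel['refTable'], $refAlias,
        $parentAlias, $rel['local'], $refAlias, $rel['refLocal'],
        $rel['table'], $alias,
        $alias, $rel['foreign'], $refAlias, $rel['refForeign']);
}

$phone = array('table' => 'phonenumber', 'local' => 'id', 'foreign' => 'user_id');
$group = array('table' => 'group', 'local' => 'id', 'foreign' => 'id',
               'refTable' => 'usergroup', 'refLocal' => 'user_id', 'refForeign' => 'group_id');

$oneToMany  = 'FROM user u ' . buildLeftJoin($phone, 'u', 'p');
$manyToMany = 'FROM user u ' . buildLeftJoin($group, 'u', 'g', 'u2');
```

The point of the sketch is that the join column names come entirely from the relation description, so the parser only has to supply the aliases.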

User <- many-to-many -> User as Friend (equal nest relation using reference table user_reference)

Basic draft of new process:

Avoid circular references as much as possible. Until the GSoC 2007 patch is applied, their memory overhead is painful for any application. KEEP THAT IN MIND!

Useful picture of workflow: http://code-factory.org/doctrine-manual-images/doctrine-query-seq.jpg
[DONE] Design changes to be done NOW: http://pastebin.com/f64194812
[DONE] More design changes to be done NOW: http://pastebin.com/f4e8a9933
Decision about functions and portability mode: http://pastebin.com/f60330c83

[DONE] Doctrine_Query_Abstract

Handles DQL generation. Principal methods for DQL parser:

  • getDql(): Retrieves the DQL in full format
  • setDql(string $dql): Defines a full DQL to be processed
  • getType(): The DQL type. Can be: SELECT, UPDATE or DELETE
  • getState(): The current state of the component. Can be: DIRTY or CLEAN

Possible pitfalls:

  • None

Important Design decisions:

  • Connectors (AND/OR) are stored with each _dqlParts entry for WHERE and HAVING. This introduced new methods for adding a condition with AND, with OR, or with no connector.

TODO:

  • Nothing



[DONE] Doctrine_Query_CacheHandler

New implementation: a factory class that generates instances of Doctrine_Query_AbstractResult subclasses and handles cached items across the different cache drivers and cache types.
Doctrine_Query has two types of cache: the ResultSet cache (Doctrine_Query_QueryResult) and the Query cache (Doctrine_Query_ParserResult). These subclasses store four items: data (array or string), queryComponents, tableAliasMap and enumParams.

Important methods:

  • (static) fromResultSet(array $result, Doctrine_Query_ParserResult $parserResult): Takes queryComponents, tableAliasMap and enumParams from the parserResult object, stores $result alongside them, and generates an instance of Doctrine_Query_QueryResult.
  • (static) fromCachedResult(Doctrine_Query $query, string $cached): Unserializes the string and uses the recovered array to build data (result set), queryComponents, tableAliasMap and enumParams. Generates an instance of Doctrine_Query_QueryResult.
  • (static) fromCachedQuery(Doctrine_Query $query, string $cached): Unserializes the string and generates a Doctrine_Query_ParserResult with the generated SQL, queryComponents, tableAliasMap and enumParams.

Possible pitfalls:

  • None

Important Design decisions:

  • Factory class. The complex cache-processing logic is moved to a single place, so the query is not burdened with a task that does not belong to it. The factory handles both Doctrine_Query cache types.

TODO:

  • Nothing.



[DONE] Doctrine_Query_AbstractResult

Stores the queryComponents, tableAliasMap and enumParams that flow through DQL => SQL generation/handling.

Important methods:

  • setQueryComponents(array $queryComponents): Defines the queryComponents.
  • setQueryComponent($componentAlias, array $queryComponent): Defines a single queryComponent.
  • getQueryComponents(): Retrieves all queryComponents.
  • getQueryComponent(string $componentAlias): Retrieves a single queryComponent.
  • hasQueryComponent($componentAlias): Checks whether a queryComponent with that componentAlias is already defined.
  • setTableAliasMap(array $tableAliasMap): Defines the table aliases.
  • setTableAlias($tableAlias, $componentAlias): Adds an SQL table alias and associates it with a component alias.
  • getTableAliasMap(): Returns all table aliases.
  • getTableAlias($tableAlias): Get component alias associated with given table alias.
  • hasTableAlias($tableAlias): Whether or not this object has given tableAlias.
  • getEnumParams(): Returns the enum parameters.
  • addEnumParam($key, $table = null, $column = null): Sets input parameter as an enumerated parameter
  • toCachedForm(): Returns this object in serialized form, reversible using Doctrine_Query_CacheHandler::fromCached*.

Possible pitfalls:

  • None

Important Design decisions:

  • Ability to serialize/unserialize and keep the same structure in a smooth way: $cachedItem === Doctrine_Query_CacheHandler::fromCachedResult($query, $cachedItem)->toCachedForm()
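The round-trip property can be illustrated with a small sketch. ResultSketch is a hypothetical reduction of Doctrine_Query_AbstractResult and Doctrine_Query_CacheHandler to this single property; the use of PHP's serialize()/unserialize() and the field layout are assumptions, and class names are stored as plain strings to keep the example self-contained.

```php
<?php
// Hypothetical reduction to the round-trip property only.
class ResultSketch
{
    public $data;
    public $queryComponents;
    public $tableAliasMap;
    public $enumParams;

    public function __construct($data, array $queryComponents, array $tableAliasMap, array $enumParams)
    {
        $this->data = $data;
        $this->queryComponents = $queryComponents;
        $this->tableAliasMap = $tableAliasMap;
        $this->enumParams = $enumParams;
    }

    // Serializes the four stored items in a fixed order.
    public function toCachedForm()
    {
        return serialize(array(
            $this->data, $this->queryComponents, $this->tableAliasMap, $this->enumParams,
        ));
    }

    // Inverse of toCachedForm(): rebuilds an equivalent object.
    public static function fromCachedForm($cached)
    {
        list($data, $components, $aliasMap, $enumParams) = unserialize($cached);
        return new self($data, $components, $aliasMap, $enumParams);
    }
}

$result = new ResultSketch(
    array(array('name' => 'jwage')),
    array('u' => array('table' => 'User', 'parent' => null)),
    array('u' => 'u'),
    array()
);
$cached = $result->toCachedForm();
$restored = ResultSketch::fromCachedForm($cached);
```

Serializing the restored object again yields exactly the string it was built from, which is the invariant stated in the design decision above.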

TODO:

  • Make this class Serializable and update Doctrine_Query to follow this change.



[DONE] Doctrine_Query

Handles Connection, Hydrator, ParserResult and CacheDrivers (ResultSet and Query).
Only one method matters to the DQL parser process:

  • getSql(): Calls Doctrine_Query_Parser and retrieves a Doctrine_Query_ParserResult from it. Finally returns the processed SQL: $this->_sql.

Possible pitfalls:

  • None.

Important Design decisions:

  • There is no need to keep an instance of Doctrine_Query_Parser active inside this object, since it is only used inside getQuery(array $params). It is generated on demand and automatically destroyed.
  • The _parserResult returned by Doctrine_Query_Parser::parse() MUST be stored inside Doctrine_Query. It is assigned in getSql() and used in all execute* methods.

TODO:

  • Nothing.



[DONE] Doctrine_Hydrator

Handles hydration from a PDOStatement into an array or a Doctrine_Collection. It needs the queryComponents and tableAliasMap produced by Doctrine_Query_Parser and family.

Possible pitfalls:

  • None

Important Design decisions:

  • setQueryComponents(array $queryComponents) and setTableAliasMap(array $tableAliasMap) allow both values to be assigned without passing them as arguments to hydrateResultSet().

[romanb: We might consider later to just pass the ParserResult as a single parameter to the hydrator. This is a basic refactoring called "Introduce Parameter Object" that is applied when "You have a group of parameters that naturally belong together.". And all this information belongs naturally together, that's why we put it in the ParserResult. Just an idea. This is a small refactoring that can be done at any time in the future. Not urgent.]

TODO:

  • Nothing.



Doctrine_Query_Builder

Handles each notification from each Doctrine_Query_Production_* class and decides what to do with it. Basically, it feeds itself with _sqlParts.
The first draft defines one mandatory method:

  • getParserResult(): Merges the _sqlParts into a final _sql, associates it with the Doctrine_Query_ParserResult and returns this instance.
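Since this class is not implemented yet, the following is only a rough sketch of the idea of collecting SQL parts and merging them on demand. BuilderSketch, ParserResultSketch, addSqlPart() and the set of part names are all hypothetical and chosen for illustration; none of them is a decided API.

```php
<?php
// Hypothetical sketch; part names, their order and the glue are assumptions.
class ParserResultSketch
{
    public $sql = '';
}

class BuilderSketch
{
    // part name => array(SQL keyword, glue between collected pieces)
    private static $layout = array(
        'select'  => array('SELECT', ', '),
        'from'    => array('FROM', ' '),
        'where'   => array('WHERE', ' '),
        'orderby' => array('ORDER BY', ', '),
    );
    private $sqlParts;
    private $parserResult;

    public function __construct()
    {
        $this->sqlParts = array_fill_keys(array_keys(self::$layout), array());
        $this->parserResult = new ParserResultSketch();
    }

    // Called by a production once it has recognized a piece of the query.
    public function addSqlPart($name, $sql)
    {
        if (!isset($this->sqlParts[$name])) {
            throw new InvalidArgumentException("Unknown SQL part '$name'");
        }
        $this->sqlParts[$name][] = $sql;
    }

    // Merges the collected parts, in clause order, into the final SQL.
    public function getParserResult()
    {
        $clauses = array();
        foreach ($this->sqlParts as $name => $parts) {
            if ($parts) {
                list($keyword, $glue) = self::$layout[$name];
                $clauses[] = $keyword . ' ' . implode($glue, $parts);
            }
        }
        $this->parserResult->sql = implode(' ', $clauses);
        return $this->parserResult;
    }
}

$builder = new BuilderSketch();
$builder->addSqlPart('from', 'user u');
$builder->addSqlPart('select', 'u.id');
$builder->addSqlPart('select', 'u.name');
$builder->addSqlPart('from', 'LEFT JOIN phonenumber p ON u.id = p.user_id');
$builder->addSqlPart('where', 'u.id > 5');
$sql = $builder->getParserResult()->sql;
```

The order in which productions notify the builder does not matter here, because the final clause order is fixed by the layout table.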

Possible pitfalls:

  • None.

Important Design decisions:

  • Revisit the entire organization of _sqlParts to check if it can be optimized
  • Revisit the entire limit-subquery algorithm (optimizations welcome, quoteIdentifier definite solution, etc)
  • Doctrine_Query_ParserResult will live for the entire existence of Doctrine_Query_Builder
  • Complex algorithms are placed inside Doctrine_Query_Builder subclasses (will they exist? I think they should!). So, when processing LIMIT, for example, fire the limit-subquery algorithm (it will then be placed inside the Doctrine_Query_Builder_Limit class).
    By creating subclasses, we need to define how the notifications will be sent to these classes... how will they receive the processed parameters? No answer for that yet. =(
    One thing is certain: we cannot mix Doctrine_Query_Production_* code with Doctrine_Query_SqlBuilder_* code. In the given example, we cannot make a single decision for Doctrine_Query_SqlBuilder_Limit inside Doctrine_Query_Production_Limit. Maybe the AST idea is valid here? How will performance be? SHARE YOUR IDEAS PLEASE!!!

TODO:

  • Implement this class



Doctrine_Query_ParserResult

Extends Doctrine_Query_AbstractResult, but holds additional attributes that Doctrine_Query needs when executing.

  • isLimitSubqueryUsed(): Whether the limit-subquery algorithm was used. Doctrine_Query needs this in order to double the parameters.
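What doubling the parameters amounts to can be sketched as below. prepareParams() is a hypothetical helper; the rationale given in its comment (the conditions appearing both in the subquery and in the outer query) is an assumption about why the doubling is needed.

```php
<?php
/**
 * Hypothetical helper: if the limit-subquery algorithm was used, the bound
 * conditions presumably occur twice in the final SQL (once in the subquery,
 * once in the outer query), so the positional parameters are repeated.
 */
function prepareParams(array $params, $limitSubqueryUsed)
{
    return $limitSubqueryUsed ? array_merge($params, $params) : $params;
}

$plain   = prepareParams(array(5, 'active'), false);
$doubled = prepareParams(array(5, 'active'), true);
```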

Possible pitfalls:

  • None.

Important Design decisions:

  • None.

TODO:

  • Nothing.



[DONE] Doctrine_Query_Scanner

Processes the DQL tokens through successive calls to Doctrine_Query_Production_* classes.

Possible pitfalls:

  • None

Important Design decisions:

  • None yet

TODO:

  • Figure out the implementation status of the BNF and implement any classes that are still missing.