ColumnSelect API + Persistent objects

Nikita Timofeev
Hi all,

I'm currently working on some missing parts of the ColumnSelect API
introduced in M5.

The idea is simple: select full entities along with arbitrary columns.
It was discussed previously on the @dev list, though without any details.

These entities can be anything that can be resolved against root (root
itself, toOne or toMany relationships)

* The suggested API changes are as follows *

1) Add a new Expression, ASTFullObject (and the corresponding method in
ExpressionFactory), that will be just a marker for the desired logic.
This expression can later (in post-4.0 versions) be used in where()
and orderBy() methods, where it can act as an ObjectId
and thus fill another gap where hacks with paths like "db:OBJECT_ID"
are used now.

2) Add new factory methods in Property class:

    <T extends Persistent> Property<T> createSelf(Class<? super T> type);
    <T extends Persistent> Property<T> createForRelationship(
                   Property<?> property, Class<? super T> type);

3) Prohibit direct use of properties mapped on toMany
relationships, so that the following code will throw a
CayenneRuntimeException:

List<Object[]> result = ObjectSelect.query(Artist.class)
        .columns(Artist.ARTIST_NAME, Artist.PAINTING_ARRAY)
        .select(context);


* Usage examples *

1) Selecting root object plus some related fields:

Property<Artist> artistSelf = Property.createSelf(Artist.class);

List<Object[]> result = ObjectSelect.query(Artist.class)
        .columns(artistSelf, Artist.ARTIST_NAME, Artist.PAINTING_ARRAY.count())
        .select(context);

2) Selecting toOne relationship:

// Here Object[1] will be an Artist, Object[2] will be a Gallery
List<Object[]> result = ObjectSelect.query(Painting.class)
        .columns(Painting.PAINTING_TITLE, Painting.TO_ARTIST, Painting.TO_GALLERY)
        .select(context);

3) Selecting a toMany relationship (this is a questionable feature).
The result will be the same as it would be for a raw SQL query (i.e. a flat
matrix of Artists and Paintings):

Property<Artist> artist = Property.createSelf(Artist.class);
Property<Painting> artistPainting =
        Property.createForRelationship(Artist.PAINTING_ARRAY, Painting.class);
Property<Gallery> artistPaintingGallery =
        Artist.PAINTING_ARRAY.dot(Painting.TO_GALLERY);

List<Object[]> result = ObjectSelect.query(Artist.class)
        .columns(artist, artistPainting, artistPaintingGallery)
        .select(context);

Any comments, usage scenarios, naming suggestions, etc. will be really
appreciated!

Here is a link to the corresponding JIRA issue:
https://issues.apache.org/jira/browse/CAY-2255

--
Best regards,
Nikita Timofeev

Re: ColumnSelect API + Persistent objects

Aristedes Maniatis-2
On 7/3/17 11:12pm, Nikita Timofeev wrote:
> 2) Add new factory methods in Property class:
>
>     <T extends Persistent> Property<T> createSelf(Class<? super T> type);

Why wouldn't we just use normal constructors?

a = new Property(Artist.class);


>     <T extends Persistent> Property<T> createForRelationship(
>                    Property<?> property, Class<? super T> type)

> 3) Prohibit direct usages of properties mapped on toMany
> relationships, so that the following code will throw a
> CayenneRuntimeException
>
> List<Object[]> result = ObjectSelect.query(Artist.class)
>         .columns(Artist.ARTIST_NAME, Artist.PAINTING_ARRAY)
>         .select(context);

I'm confused about why we need a new type of property for this rather than just using Artist.PAINTING_ARRAY


Ari


--
-------------------------->
Aristedes Maniatis
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A

Re: ColumnSelect API + Persistent objects

Nikita Timofeev
On Wed, Mar 8, 2017 at 3:48 AM, Aristedes Maniatis <[hidden email]> wrote:
> On 7/3/17 11:12pm, Nikita Timofeev wrote:
>> 2) Add new factory methods in Property class:
>>
>>     <T extends Persistent> Property<T> createSelf(Class<? super T> type);
>
> Why wouldn't we just use normal constructors?
>
> a = new Property(Artist.class);

That's because the createSelf() method creates the expression needed for
this to work.
And as mentioned in the first mail, this new expression can later be
used in other ways.

>
>>     <T extends Persistent> Property<T> createForRelationship(
>>                    Property<?> property, Class<? super T> type)
>
>> 3) Prohibit direct usages of properties mapped on toMany
>> relationships, so that the following code will throw a
>> CayenneRuntimeException
>>
>> List<Object[]> result = ObjectSelect.query(Artist.class)
>>         .columns(Artist.ARTIST_NAME, Artist.PAINTING_ARRAY)
>>         .select(context);
>
> I'm confused about why we need a new type of property for this rather than just using Artist.PAINTING_ARRAY
>

This is because PAINTING_ARRAY has the type List<Painting>, and we
currently can provide only a Painting.
So to have the proper type, we need to create another Property.
Though it is indeed a question whether we need this at all:
a ColumnSelect result is shaped like the raw SQL result, i.e. in this
example the Paintings are not folded into a List.

For me it's not clear when we would need a direct List result for a toMany
relationship instead of using a prefetch (or even selecting related
entities explicitly).

--
Best regards,
Nikita Timofeev

Re: ColumnSelect API + Persistent objects

Andrus Adamchik

> On Mar 9, 2017, at 10:17 AM, Nikita Timofeev <[hidden email]> wrote:
>
> On Wed, Mar 8, 2017 at 3:48 AM, Aristedes Maniatis <[hidden email]> wrote:
>> On 7/3/17 11:12pm, Nikita Timofeev wrote:
>>> 2) Add new factory methods in Property class:
>>>
>>>    <T extends Persistent> Property<T> createSelf(Class<? super T> type);
>>
>> Why wouldn't we just use normal constructors?
>>
>> a = new Property(Artist.class);
>
> That's because createSelf() method will create expression needed for
> this to work.
> And as mentioned in the first mail this new expression can later be
> used in other ways.

So if we call it just "create(Class)" instead of "createSelf", will it cause any ambiguity?


>>>    <T extends Persistent> Property<T> createForRelationship(
>>>                   Property<?> property, Class<? super T> type)
>>
>>> 3) Prohibit direct usages of properties mapped on toMany
>>> relationships, so that the following code will throw a
>>> CayenneRuntimeException
>>>
>>> List<Object[]> result = ObjectSelect.query(Artist.class)
>>>        .columns(Artist.ARTIST_NAME, Artist.PAINTING_ARRAY)
>>>        .select(context);
>>
>> I'm confused about why we need a new type of property for this rather than just using Artist.PAINTING_ARRAY
>>
>
> This is because PAINTING_ARRAY have List<Painting> type and we
> currently can provide only Painting.
> So to have proper type we need another Property to be created.
> Thou it is indeed a question whether we need this at all:
> a ColumnSelect result as it will be is raw SQL, i.e. in this example
> Paintings not folded into a List.
>
> For me it's not clear when do we need direct List result for toMany
> relationship instead of using Prefetch (or even selecting related
> entities explicitly).


So while I understand what we are trying to do with "createForRelationship" (this is a "flatMap"-like transformation), I am wondering how we can make the semantics more transparent. Perhaps we could add a "flat()" method to the Property itself:

Property<List<Painting>> listProperty = Artist.PAINTING_ARRAY;
Property<Painting> objectProperty = listProperty.flat(); // or 'flatMap'?

Andrus




Re: ColumnSelect API + Persistent objects

Andrus Adamchik

>>>
>>> List<Object[]> result = ObjectSelect.query(Artist.class)
>>>       .columns(Artist.ARTIST_NAME, Artist.PAINTING_ARRAY)
>>>       .select(context);


>> For me it's not clear when do we need direct List result for toMany
>> relationship instead of using Prefetch (or even selecting related
>> entities explicitly).

Let's ponder on this some more. In the example above there are two possible result mappings of to-many - each cell at index 1 can store either (1) a List<Painting> or (2) a Painting. Of course in the second case Artist name at index 0 will be repeated for each painting of a given Artist (so the result is flattened).

In the proposed solution we throw in the first case and support the second. I'd like to hear opinions on the usefulness of either of the cases, and maybe some real life examples.
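
To make the two mappings concrete, here is a minimal plain-Java sketch. It does not use Cayenne at all: the Artist and Painting classes and the nested(), flattened() and sample() helpers are hypothetical stand-ins invented for illustration, and only show the shape of the rows in each case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ToManyMappingSketch {

    // Hypothetical stand-ins for persistent entities; these are not Cayenne classes.
    public static final class Painting {
        final String title;
        Painting(String title) { this.title = title; }
    }

    public static final class Artist {
        final String name;
        final List<Painting> paintings;
        Artist(String name, Painting... paintings) {
            this.name = name;
            this.paintings = Arrays.asList(paintings);
        }
    }

    // Mapping (1): one row per artist; the cell at index 1 holds the whole List<Painting>.
    public static List<Object[]> nested(List<Artist> artists) {
        List<Object[]> rows = new ArrayList<>();
        for (Artist a : artists) {
            rows.add(new Object[] { a.name, a.paintings });
        }
        return rows;
    }

    // Mapping (2): one row per (artist, painting) pair; the artist name at index 0
    // repeats, and in this sketch an artist without paintings yields no rows.
    public static List<Object[]> flattened(List<Artist> artists) {
        List<Object[]> rows = new ArrayList<>();
        for (Artist a : artists) {
            for (Painting p : a.paintings) {
                rows.add(new Object[] { a.name, p });
            }
        }
        return rows;
    }

    // Made-up sample data: two artists with two paintings each, one with none.
    public static List<Artist> sample() {
        return Arrays.asList(
                new Artist("Munch", new Painting("The Scream"), new Painting("Madonna")),
                new Artist("Vermeer", new Painting("The Milkmaid"),
                        new Painting("Girl with a Pearl Earring")),
                new Artist("Unknown"));
    }

    public static void main(String[] args) {
        List<Artist> artists = sample();
        System.out.println("nested rows: " + nested(artists).size());
        System.out.println("flattened rows: " + flattened(artists).size());
    }
}
```

With this sample data the nested mapping has one row per artist, while the flattened mapping has one row per painting.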

Andrus


Re: ColumnSelect API + Persistent objects

Aristedes Maniatis-2
On 13/3/17 5:57pm, Andrus Adamchik wrote:

>
>>>>
>>>> List<Object[]> result = ObjectSelect.query(Artist.class)
>>>>       .columns(Artist.ARTIST_NAME, Artist.PAINTING_ARRAY)
>>>>       .select(context);
>
>
>>> For me it's not clear when do we need direct List result for toMany
>>> relationship instead of using Prefetch (or even selecting related
>>> entities explicitly).
>
> Let's ponder on this some more. In the example above there are two possible result mappings of to-many - each cell at index 1 can store either (1) a List<Painting> or (2) a Painting. Of course in the second case Artist name at index 0 will be repeated for each painting of a given Artist (so the result is flattened).
>
> In the proposed solution we throw in the first case and support the second. I'd like to hear opinions on the usefulness of either of the cases, and maybe some real life examples.


Oh, I get it now. You are suggesting in option (2) that Cayenne creates a CROSS JOIN in SQL.

Surely most of the time we are steering users to creating ObjEntities where possible; then they have the full access to follow relations, prefetching, pagination, etc, etc.

Sometimes you want to fetch specific columns as read-only for performance reasons. I get that. You save on fetching data and you save time on constructing full ObjEntities. But then why would you want to embed full ObjEntities in the result? Any performance benefits are surely out the window. So what's the point and the use-case?



Are users going to be confused that the ObjEntities are bound to a context and editable/committable, but the raw columns are not?

result.get(0)[3][1].setName("The scream")
result.get(0)[2] = "Munch"
context.commit()



What is the output of this:

result = ObjectSelect.query(Artist.class)
       .columns(createSelf(Artist.class))
       .select(context);

Is that the same as

result = ObjectSelect.query(Artist.class)
       .select(context);

Would the generated SQL be the same?

Ari




--
-------------------------->
Aristedes Maniatis
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A

Re: ColumnSelect API + Persistent objects

Andrus Adamchik
> Oh, I get it now. You are suggesting in option (2) that Cayenne creates a CROSS JOIN in sql.

Actually not a CROSS join. It will be a regular INNER JOIN based on the relationship from the query root. The JPA spec defines CROSS join operations. We don't (yet?).
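
As a rough illustration of the difference, here is a plain-Java sketch of the two join semantics over in-memory rows. Nothing here is Cayenne code or generated SQL: ArtistRow, PaintingRow, crossJoin() and innerJoin() are hypothetical names, and the data is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JoinSketch {

    // Hypothetical flat rows standing in for ARTIST and PAINTING table records.
    public static final class ArtistRow {
        final int id;
        final String name;
        ArtistRow(int id, String name) { this.id = id; this.name = name; }
    }

    public static final class PaintingRow {
        final int artistId; // foreign key to ArtistRow.id
        final String title;
        PaintingRow(int artistId, String title) { this.artistId = artistId; this.title = title; }
    }

    // CROSS JOIN: every artist paired with every painting, related or not.
    public static List<String[]> crossJoin(List<ArtistRow> artists, List<PaintingRow> paintings) {
        List<String[]> rows = new ArrayList<>();
        for (ArtistRow a : artists) {
            for (PaintingRow p : paintings) {
                rows.add(new String[] { a.name, p.title });
            }
        }
        return rows;
    }

    // INNER JOIN on the relationship: only pairs where the foreign key matches.
    public static List<String[]> innerJoin(List<ArtistRow> artists, List<PaintingRow> paintings) {
        List<String[]> rows = new ArrayList<>();
        for (ArtistRow a : artists) {
            for (PaintingRow p : paintings) {
                if (p.artistId == a.id) {
                    rows.add(new String[] { a.name, p.title });
                }
            }
        }
        return rows;
    }

    public static List<ArtistRow> sampleArtists() {
        return Arrays.asList(new ArtistRow(1, "Munch"), new ArtistRow(2, "Vermeer"));
    }

    public static List<PaintingRow> samplePaintings() {
        return Arrays.asList(
                new PaintingRow(1, "The Scream"),
                new PaintingRow(1, "Madonna"),
                new PaintingRow(2, "The Milkmaid"));
    }

    public static void main(String[] args) {
        System.out.println("cross join rows: " + crossJoin(sampleArtists(), samplePaintings()).size());
        System.out.println("inner join rows: " + innerJoin(sampleArtists(), samplePaintings()).size());
    }
}
```

With two artists and three paintings, the cross join yields every combination, while the inner join yields only one row per painting paired with its own artist.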

> Sometimes you want to fetch specific columns as read-only for performance reasons. I get that. You save on fetching data and you save time on constructing full ObjEntities.


Yep. A typical use case for intermixing DataObjects and scalars in the result is this: [<artist>, <count_of_paintings>] that gives us each object's to-many counts without resolving to-many relationships. What I am unsure though is whether [<artist_name>, <painting>] result is of any use to anyone.

> Are users going to be confused that the ObjEntities are bound to a context and editable/committable, but the raw columns are not?

I am not concerned about this. I think everyone should understand that only DataObjects can be committed. Any other representation is non-updateable data.

Andrus




Re: ColumnSelect API + Persistent objects

Nikita Timofeev
In reply to this post by Aristedes Maniatis-2
>
> What is the output of this:
>
> result = ObjectSelect.query(Artist.class)
>        .columns(createSelf(Artist.class))
>        .select(context);
>
> Is that the same as
>
> result = ObjectSelect.query(Artist.class)
>        .select(context);
>
> would the generated SQL be the same?
>

Small comment on this.

With the columns() method, the result will be List<Object[]> (where
Object[0] is an Artist object).
With the column() method, both results will be identical (List<Artist>).

The generated SQL should be the same too, though the direct query is
apparently more efficient,
as it uses some shortcuts based on the assumption that it will get
only one type of object.
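
The relationship between the two result shapes can be sketched in plain Java. This is only an illustration of how a single-element List<Object[]> corresponds to a typed List<T>; unwrapFirstColumn() is a hypothetical helper, not a Cayenne method, and it says nothing about how Cayenne builds either result internally.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleColumnSketch {

    // Converts rows shaped like a columns(...) result with exactly one column
    // into the typed list shape that column(...) would give.
    // Hypothetical helper for illustration only.
    public static <T> List<T> unwrapFirstColumn(List<Object[]> rows, Class<T> type) {
        List<T> result = new ArrayList<>(rows.size());
        for (Object[] row : rows) {
            // Class.cast throws ClassCastException if the cell is not of the expected type.
            result.add(type.cast(row[0]));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] { "Munch" });
        rows.add(new Object[] { "Vermeer" });

        List<String> names = unwrapFirstColumn(rows, String.class);
        System.out.println(names);
    }
}
```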

--
Best regards,
Nikita Timofeev

Re: ColumnSelect API + Persistent objects

Aristedes Maniatis-2
In reply to this post by Andrus Adamchik
On 13/3/17 7:26pm, Andrus Adamchik wrote:
>
> Yep. A typical use case for intermixing DataObjects and scalars in the result is this: [<artist>, <count_of_paintings>] that gives us each object's to-many counts without resolving to-many relationships. What I am unsure though is whether [<artist_name>, <painting>] result is of any use to anyone.


Just a wild thought, but would this syntax be helpful...

List<Object[]> result = ObjectSelect.query(Artist.class)
     .addColumns(Artist.PAINTING_COUNT)
     .select(context);

So then we are adding more columns to the existing DataObject query rather than having to define the DataObjects as properties in themselves. The syntax above might be simpler to understand and write.


>> Are users going to be confused that the ObjEntities are bound to a context and editable/committable, but the raw columns are not?
> I am not concerned about this. I think everyone should understand that only DataObjects can be committed. Any other representations is non-updateable data.

Perhaps, but this is the first time these two things are inter-mingled in one list. And I've always thought of contexts as 'owning' the items within the result list. But as long as users are clear that contexts control the elements in the result and not the list itself, then I'm probably overthinking it.


Ari



--
-------------------------->
Aristedes Maniatis
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A

Re: ColumnSelect API + Persistent objects

Andrus Adamchik

> On Mar 13, 2017, at 1:46 PM, Aristedes Maniatis <[hidden email]> wrote:
>
>
> Just a wild thought, but would this syntax be helpful...
>
> List<Object[]> result = ObjectSelect.query(Artist.class)
>     .addColumns(Artist.PAINTING_COUNT)
>     .select(context);
>
> So then we are adding more columns to the existing DataObject query rather than having to define the DataObjects as properties in themselves. The syntax above might be simpler to understand and write.

So "addColumns" vs "columns"? IIRC we tried something similar with orderings (override all orderings vs add to the existing orderings), and that confused everybody (including me as the author), so that was undone between the milestones.

Andrus


Re: ColumnSelect API + Persistent objects

Aristedes Maniatis-2
On 13/3/17 9:53pm, Andrus Adamchik wrote:

>
>> On Mar 13, 2017, at 1:46 PM, Aristedes Maniatis <[hidden email]> wrote:
>>
>>
>> Just a wild thought, but would this syntax be helpful...
>>
>> List<Object[]> result = ObjectSelect.query(Artist.class)
>>     .addColumns(Artist.PAINTING_COUNT)
>>     .select(context);
>>
>> So then we are adding more columns to the existing DataObject query rather than having to define the DataObjects as properties in themselves. The syntax above might be simpler to understand and write.
>
> So "addColumns" vs "columns"? IIRC we tried something similar with orderings (override all orderings vs add to the existing orderings), and that confused everybody (including me as the author), so that was undone between the milestones.


Except in this case it is a very different result, since with this syntax you get the mixed columns/DataObject results and avoid needing to create properties for 'self' or some other confusing construct.


--
-------------------------->
Aristedes Maniatis
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A

Re: ColumnSelect API + Persistent objects

Nikita Timofeev
On Mon, Mar 13, 2017 at 3:05 PM, Aristedes Maniatis <[hidden email]> wrote:

> On 13/3/17 9:53pm, Andrus Adamchik wrote:
>>
>>> On Mar 13, 2017, at 1:46 PM, Aristedes Maniatis <[hidden email]> wrote:
>>>
>>>
>>> Just a wild thought, but would this syntax be helpful...
>>>
>>> List<Object[]> result = ObjectSelect.query(Artist.class)
>>>     .addColumns(Artist.PAINTING_COUNT)
>>>     .select(context);
>>>
>>> So then we are adding more columns to the existing DataObject query rather than having to define the DataObjects as properties in themselves. The syntax above might be simpler to understand and write.
>>
>> So "addColumns" vs "columns"? IIRC we tried something similar with orderings (override all orderings vs add to the existing orderings), and that confused everybody (including me as the author), so that was undone between the milestones.
>
>
> Except in this case it is very different result since with this syntax you get the mixed columns/DataObject results and avoid needing to create properties for 'self' or some other confusing construct.
>

I think we can add a Property<EntityType> SELF to the default cgen
templates; then everything will be pretty simple:

    ObjectSelect
        .columnQuery(Artist.class, Artist.SELF, Artist.PAINTING_ARRAY.count())
        .select(context);


--
Best regards,
Nikita Timofeev

Re: ColumnSelect API + Persistent objects

Andrus Adamchik


> On Mar 16, 2017, at 11:14 AM, Nikita Timofeev <[hidden email]> wrote:
>
> On Mon, Mar 13, 2017 at 3:05 PM, Aristedes Maniatis <[hidden email]> wrote:
>> On 13/3/17 9:53pm, Andrus Adamchik wrote:
>>>
>>>> On Mar 13, 2017, at 1:46 PM, Aristedes Maniatis <[hidden email]> wrote:
>>>>
>>>>
>>>> Just a wild thought, but would this syntax be helpful...
>>>>
>>>> List<Object[]> result = ObjectSelect.query(Artist.class)
>>>>    .addColumns(Artist.PAINTING_COUNT)
>>>>    .select(context);
>>>>
>>>> So then we are adding more columns to the existing DataObject query rather than having to define the DataObjects as properties in themselves. The syntax above might be simpler to understand and write.
>>>
>>> So "addColumns" vs "columns"? IIRC we tried something similar with orderings (override all orderings vs add to the existing orderings), and that confused everybody (including me as the author), so that was undone between the milestones.
>>
>>
>> Except in this case it is very different result since with this syntax you get the mixed columns/DataObject results and avoid needing to create properties for 'self' or some other confusing construct.
>>
>
> I think we can add Property<EntityType> SELF to default cgen
> templates, then everything will be pretty simple:
>
>    ObjectSelect
>        .columnQuery(Artist.class, Artist.SELF, Artist.PAINTING_ARRAY.count())
>        .select(context);
>

A good idea except for a possible naming conflict if a Java object has a persistent property called "self".

Andrus

Re: ColumnSelect API + Persistent objects

Nikita Timofeev
Here is the latest version, which is already committed and can be tested:

    ObjectSelect
        .columnQuery(Artist.class,
            Property.createSelf(Artist.class),            // query root
            Artist.PAINTING_ARRAY.dot(Painting.GALLERY),  // toOne relationship
            Artist.PAINTING_ARRAY.flat(Painting.class))   // flattened toMany relationship
        .select(context);




--
Best regards,
Nikita Timofeev