Discussion:
[Gramps-devel] DB API addition; has_handle method.
Paul Culley
2017-06-04 22:44:30 UTC
Permalink
Gentlemen;

Per https://gramps-project.org/bugs/view.php?id=9541
which discusses various API methods which were added to DBAPI but never
officially made part of the Gramps API, I'm asking for a review of one
particular method which I would like to use to patch a bug.

The has_handle(obj_key, handle) method takes an obj_key (PERSON_KEY, etc.)
and a handle and determines if the db contains the handle, without raising
a HandleError.

This was added to dbapi by Doug Blank a while ago, presumably because he
expected it to be useful.

I've found a spot where I can use it as well; it turns out that when we
decided to raise HandleErrors for invalid handles, we started getting a lot
of these errors from our Gramplets. The Gramplets were originally designed
to deal with 'None' returns from the various db operations, but don't deal
with HandleErrors. Someone put in what I would consider to be a bandaid (a
'try' 'except' clause around our signal 'emit' code, which catches most of
these, but not all.

I'd like to fix up the base Gramplet class which has a get_active method
that returns the active handle, to check if the handle is valid before
passing to the Gramplets, thus not having to fix all of the Gramplets
individually. But to do this, I have to check the handle for the various
object types used by the Gramplets. This can be done pretty easily via the
has_handle method.

I've submitted a PR https://github.com/gramps-project/gramps/pull/412 that
adds the has_handle method to bsddb and patches the bug if anyone cares to
see what this looks like.

Paul Culley
Doug Blank
2017-06-05 00:52:03 UTC
Permalink
Post by Paul Culley
Gentlemen;
Per https://gramps-project.org/bugs/view.php?id=9541
which discusses various API methods which were added to DBAPI but never
officially made part of the Gramps API, I'm asking for a review of one
particular method which I would like to use to patch a bug.
The has_handle(obj_key, handle) method takes an obj_key (PERSON_KEY, etc.)
and a handle and determines if the db contains the handle, without raising
a HandleError.
This was added to dbapi by Doug Blank a while ago, presumably because he
expected it to be useful.
It is useful logically (as your code shows). But I have a different feeling
now: if you think about database accesses, then you could be accessing the
database twice for no particular gain. Gramps code has a lot of code like
that (because the database access has been fairly cheap with local,
single-user databases where you can even cache data). But as you head
towards more efficient database access (because it is remote or shared), I
think you'll probably want to undo your beautiful logic:

if db.has_media_handle(handle):
obj = db.get_media_from_handle(handle)

into something that if successfully found (which is going to be a majority
of the time, presumably) then you'll want it to be efficient too. But if
you treat handles that don't return objects as an error, then you have your
current dilemma.

My goal is to have referential integrity in gPrime databases so that they
can be efficient without errors. But that will take additional processing
in Gramps after imports that allow referentially invalid data.

-Doug
Post by Paul Culley
I've found a spot where I can use it as well; it turns out that when we
decided to raise HandleErrors for invalid handles, we started getting a lot
of these errors from our Gramplets. The Gramplets were originally designed
to deal with 'None' returns from the various db operations, but don't deal
with HandleErrors. Someone put in what I would consider to be a bandaid (a
'try' 'except' clause around our signal 'emit' code, which catches most of
these, but not all.
I'd like to fix up the base Gramplet class which has a get_active method
that returns the active handle, to check if the handle is valid before
passing to the Gramplets, thus not having to fix all of the Gramplets
individually. But to do this, I have to check the handle for the various
object types used by the Gramplets. This can be done pretty easily via the
has_handle method.
I've submitted a PR https://github.com/gramps-project/gramps/pull/412
that adds the has_handle method to bsddb and patches the bug if anyone
cares to see what this looks like.
Paul Culley
------------------------------------------------------------
------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Paul Culley
2017-06-05 13:25:45 UTC
Permalink
Doug;
I'd love to make this more efficient, even if it is only operating at human
rates in the GUI.
But Gramps problem is not only one of 'referential integrity' but of
signaling. It should be a rare event that the db is indeed inconsistent.
But due to sloppiness or other design decisions, it is a common event that
our db and/or signals are temporarily inconsistent during multi-commit
operations. The best that can be said is that the db is consistent before
and after the overall transaction.

Our undo/redo data is stored in a dict, that doesn't always return the data
in the order it was put in. Our signals are delayed until after the
complete transactions, guaranteeing that the signaled objects see
inconsistencies at times. For example, on the issue that got me started on
the current set of fixes, during a delete of a person with some references
from other places, we
1) locate and find references to the person
2) remove the references, and commit each object effected
3) delete the person
4) use the undo/redo data to 'play back' the signals (out of a dict in some
order).
5) use very generic signal handlers in a lot of cases that update
everything (Gramplets)

Even if the signals came out in the same order as the commits occurred, on
getting a signal for one of the reference commits, the Gramplet will often
try to re-display some portion of the (now deleted) persons data, creating
a HandleError.

I expect that there are other cases where the commits themselves don't
occur in an order that leaves the db consistent at all times.

None of this used to matter when db accesses with bad references (handles)
just returned 'None'. (Most) of our code tested for 'None' and dealt with
it correctly. In fact our filters used this feature and deliberately
created proxy dbs that are quite inconsistent. So now some of our code in
the GUI has to deal with BOTH the 'None' and HandleError issues.

As a relatively new developer, I certainly don't have a full understanding
of the history and underlying concepts and plan for how I am seeing things
work. So I'm may be missing a lot of things. So as I patch bugs I tend to
patch symptoms (which are usually small fixes and easier for the other
developers to accept).

Perhaps if we had lots of experienced developers we could take a step back
and rethink the whole problem. But I fear that the best we can hope for is
to keep patching things well enough to keep our user base happy.

Paul C.
Post by Doug Blank
Post by Paul Culley
Gentlemen;
Per https://gramps-project.org/bugs/view.php?id=9541
which discusses various API methods which were added to DBAPI but never
officially made part of the Gramps API, I'm asking for a review of one
particular method which I would like to use to patch a bug.
The has_handle(obj_key, handle) method takes an obj_key (PERSON_KEY,
etc.) and a handle and determines if the db contains the handle, without
raising a HandleError.
This was added to dbapi by Doug Blank a while ago, presumably because he
expected it to be useful.
It is useful logically (as your code shows). But I have a different
feeling now: if you think about database accesses, then you could be
accessing the database twice for no particular gain. Gramps code has a lot
of code like that (because the database access has been fairly cheap with
local, single-user databases where you can even cache data). But as you
head towards more efficient database access (because it is remote or
obj = db.get_media_from_handle(handle)
into something that if successfully found (which is going to be a majority
of the time, presumably) then you'll want it to be efficient too. But if
you treat handles that don't return objects as an error, then you have your
current dilemma.
My goal is to have referential integrity in gPrime databases so that they
can be efficient without errors. But that will take additional processing
in Gramps after imports that allow referentially invalid data.
-Doug
Post by Paul Culley
I've found a spot where I can use it as well; it turns out that when we
decided to raise HandleErrors for invalid handles, we started getting a lot
of these errors from our Gramplets. The Gramplets were originally designed
to deal with 'None' returns from the various db operations, but don't deal
with HandleErrors. Someone put in what I would consider to be a bandaid (a
'try' 'except' clause around our signal 'emit' code, which catches most of
these, but not all.
I'd like to fix up the base Gramplet class which has a get_active method
that returns the active handle, to check if the handle is valid before
passing to the Gramplets, thus not having to fix all of the Gramplets
individually. But to do this, I have to check the handle for the various
object types used by the Gramplets. This can be done pretty easily via the
has_handle method.
I've submitted a PR https://github.com/gramps-project/gramps/pull/412
that adds the has_handle method to bsddb and patches the bug if anyone
cares to see what this looks like.
Paul Culley
------------------------------------------------------------
------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Nick Hall
2017-06-05 15:43:00 UTC
Permalink
Post by Paul Culley
As a relatively new developer, I certainly don't have a full
understanding of the history and underlying concepts and plan for how
I am seeing things work. So I'm may be missing a lot of things. So
as I patch bugs I tend to patch symptoms (which are usually small
fixes and easier for the other developers to accept).
To understand this you need to go back a few years when we were looking
into NoneType errors. Some of these resulted from broken links in the
database. Unfortunately, because we don't enforce referential integrity
at database level, and were also not raising errors when we tried to
retrieve an object from a broken link, these were almost impossible to
track down. Patching symptoms actually made it harder to find the real
problem.

A None return from a "get" method could either mean a broken link or a
link hidden by a proxy. The solution was to raise a HandleError
exception when a broken link was encountered. This has already resulted
in some program errors being fixed and one major source of broken links
found. Hopefully, we can eliminate all broken links in the future and
HandleError exceptions should never be raised.

In the bug report you are looking at, the history list is not consistent
with the database. It is being updated to reflect the database after
the gramplet is updated. The database itself is consistent. The
solution would seem to be to make sure that the history objects are
updated first.

I can still see some valid reasons to want a has_handle method, for
example, when importing a Gramps XML file into an existing database.
Perhaps instead of an object key we could use the class name or table name?

Nick.
Paul Culley
2017-06-05 16:09:21 UTC
Permalink
I'd be quite happy with Class name instead of object key for has_handle
(would avoid a conversion in _gramplet). Only reason for object key is
that dbapi had it that way, and I try to do things in a minimal way.

Regarding the db consistency, you are right, in this case it was consistent
throughout. However, since the signals went out after the fact (in the
wrong order), the GUI elements 'saw' a db that was inconsistent with their
signals. And during undo/redo we do in fact end up with temporary db
inconsistencies since it uses the history. Lots of issues here that are
somewhat interrelated...

Paul C.
Post by Nick Hall
Post by Paul Culley
As a relatively new developer, I certainly don't have a full
understanding of the history and underlying concepts and plan for how I am
seeing things work. So I'm may be missing a lot of things. So as I patch
bugs I tend to patch symptoms (which are usually small fixes and easier for
the other developers to accept).
To understand this you need to go back a few years when we were looking
into NoneType errors. Some of these resulted from broken links in the
database. Unfortunately, because we don't enforce referential integrity at
database level, and were also not raising errors when we tried to retrieve
an object from a broken link, these were almost impossible to track down.
Patching symptoms actually made it harder to find the real problem.
A None return from a "get" method could either mean a broken link or a
link hidden by a proxy. The solution was to raise a HandleError exception
when a broken link was encountered. This has already resulted in some
program errors being fixed and one major source of broken links found.
Hopefully, we can eliminate all broken links in the future and HandleError
exceptions should never be raised.
In the bug report you are looking at, the history list is not consistent
with the database. It is being updated to reflect the database after the
gramplet is updated. The database itself is consistent. The solution
would seem to be to make sure that the history objects are updated first.
I can still see some valid reasons to want a has_handle method, for
example, when importing a Gramps XML file into an existing database.
Perhaps instead of an object key we could use the class name or table name?
Nick.
------------------------------------------------------------
------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Gramps-devel mailing list
https://lists.sourceforge.net/lists/listinfo/gramps-devel
Tim Lyons
2017-06-06 11:10:55 UTC
Permalink
Post by Paul Culley
I'd be quite happy with Class name instead of object key for has_handle
(would avoid a conversion in _gramplet). Only reason for object key is
that dbapi had it that way, and I try to do things in a minimal way.
Regarding the db consistency, you are right, in this case it was consistent
throughout. However, since the signals went out after the fact (in the
wrong order), the GUI elements 'saw' a db that was inconsistent with their
signals. And during undo/redo we do in fact end up with temporary db
inconsistencies since it uses the history. Lots of issues here that are
somewhat interrelated...
Paul C.
I agree with Nick that we need HandleError as well as None, to help us
diagnose and avoid bugs. (I do understand the difficulty with temporary db
inconsistencies and sorry I don't have a suggestion as to how to deal with
it).

In regard to the detail of the details of the has_handle interfaces, I
would prefer to avoid the signatures without the 'class name' in the
signature (and class name or object key as a parameter), and instead use the
signatures that are specific to the object type. (i.e. use the
has_*_gramps_id methods).

I said in:
https://gramps-project.org/bugs/view.php?id=9541

As I understand it, the has_*_gramps_id methods for dbapi now call
has_gramps_id (which has a parameter for the object of interest), which is
good, because there is only one underlying implementation.

However, there are still two different styles of signature, (one with the
table name in the function name, and one with the table as a parameter).
This is very confusing for developers, because they have an unnecessary
choice to make.

As it happens, has_gramps_id is used only in importxml.py so now is an ideal
time to remove it and just have one signature style with the object name in
the method name, which is the general case in other database methods.

In the case of bsddb, the two interfaces, has_*_gramps_id and has_gramps_id
use different implementations! (One using maps and one doing a table get).
This is a really TERRIBLE idea, because it is implicitly exposing the
implementation of the method to the caller, violating all principles of
information hiding.

This would be avoided by only providing one set of signatures.

Regards,
Tim.



--
View this message in context: http://gramps.1791082.n4.nabble.com/DB-API-addition-has-handle-method-tp4680119p4680141.html
Sent from the GRAMPS - Dev mailing list archive at Nabble.com.
Nick Hall
2017-06-05 17:31:58 UTC
Permalink
Post by Paul Culley
Regarding the db consistency, you are right, in this case it was
consistent throughout. However, since the signals went out after the
fact (in the wrong order), the GUI elements 'saw' a db that was
inconsistent with their signals. And during undo/redo we do in fact
end up with temporary db inconsistencies since it uses the history.
Lots of issues here that are somewhat interrelated...
The database signals will always be emitted after the transaction if it
is committed, or not if it fails and is rolled back.

Don't confuse the undo/redo history with the history of active objects.

Is the signal order important? Could we emit the delete signals first,
which would update the history objects?

Nick.
Paul Culley
2017-06-05 19:20:19 UTC
Permalink
Assuming we always delay emits until after the transaction completes:
At first thinking, it seems that putting deletes (and adds) first might do
it. Again assuming that ALL the get_active source data (active object
history?) were updated *before* the various other GUI elements.
I think that whatever signals updates the GUI active object history would
have to be emitted first (lest a delete signal git a GUI element, and it
got the old, not yet removed, history object as active).
I've considered add, delete, and simple merges, but there may be other more
complex transactions I'm not aware of.
I know we can do multiple deletes at once, but I'm not clear if this is a
single transaction (it looks like it isn't based on the Undo History
Gramplet).

Stream of consciousness here, not carefully examined, debugged etc.

Paul C.
Post by Paul Culley
Regarding the db consistency, you are right, in this case it was
consistent throughout. However, since the signals went out after the fact
(in the wrong order), the GUI elements 'saw' a db that was inconsistent
with their signals. And during undo/redo we do in fact end up with
temporary db inconsistencies since it uses the history. Lots of issues
here that are somewhat interrelated...
The database signals will always be emitted after the transaction if it is
committed, or not if it fails and is rolled back.
Don't confuse the undo/redo history with the history of active objects.
Is the signal order important? Could we emit the delete signals first,
which would update the history objects?
Nick.
Nick Hall
2017-06-05 17:27:22 UTC
Permalink
Post by Paul Culley
I'd be quite happy with Class name instead of object key for
has_handle (would avoid a conversion in _gramplet). Only reason for
object key is that dbapi had it that way, and I try to do things in a
minimal way.
The only reason has_handle uses the object key is that has_gramps_id
already does. The object keys should really be internal to the database
layer.

Nick.
Loading...