Mindoo Blog - Cutting edge technologies - About Java, Lotus Notes and iPhone

  • The pain of reading data as a Domino developer - and solutions

    Karsten Lehmann  19 June 2024 11:02:08
    On my endless path of reinventing the wheel for Domino APIs, my latest adventure has once again led me to look for efficient and powerful ways to query Domino data.

    For many years, this topic has been a pain point for me. Back in the IBM Design Partner program, I wrote many entries in the discussions database, asking IBM core development for better and faster ways to read Domino data.

    View traversal speed

    In the old days, according to Domino core developers, there was no priority for LotusScript or Java API methods to read data as fast as possible, "since most of the code was running at night in agents anyway" (that is what they told me).

    Things changed a bit when XPages were introduced in 2009. Core development began to tune the codebase for performance, for example by no longer reading just one view row at a time in the ViewNavigator API; the underlying C API call NIFReadEntries has never had this restriction. It has others: the return data buffer has a maximum size of 64K, but depending on the type of view data, this buffer can hold a lot of rows. For example, if you just read note IDs (4 bytes each), one NIFReadEntries call returns about 16,000 rows at once.
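
    As a back-of-the-envelope check of these numbers, the following sketch divides the 64K buffer by a fixed per-row size. ReadBufferMath and maxRowsPerCall are made-up names for illustration, not part of any Domino API, and real rows with column values vary in size, so the result is only an upper bound.

```java
// Hypothetical helper, not part of any Domino API: estimates how many
// fixed-size rows fit into one NIFReadEntries return buffer.
class ReadBufferMath {
    // Maximum size of the return buffer mentioned above (64K).
    static final int MAX_BUFFER_BYTES = 64 * 1024;

    // Upper bound for the number of rows per call, assuming every row
    // occupies exactly bytesPerRow bytes (an idealization).
    static int maxRowsPerCall(int bytesPerRow) {
        if (bytesPerRow <= 0) {
            throw new IllegalArgumentException("bytesPerRow must be positive");
        }
        return MAX_BUFFER_BYTES / bytesPerRow;
    }

    public static void main(String[] args) {
        // Reading note IDs only: 4 bytes per row.
        System.out.println(maxRowsPerCall(4)); // prints 16384
    }
}
```

    The wider the rows get, the faster this number drops, which is why reading column values needs many more round trips than reading note IDs alone.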

    New API methods like ViewNavigator.setBufferMaxEntries(...) and ViewNavigator.setCacheGuidance(...) were the result. They sped up view traversal A LOT.

    Unfortunately, Domino core development thought there was no need to ever write documentation for them. Check for yourself: go into your Domino Designer help and try to find information about these methods, or try a Google search. At least you will find a few community blogs (like ours) that wrote about this topic and how to use these methods.

    This video on YouTube might also help.


    Data consistency

    In addition to faster view traversal, Domino core development also started working on ways to read view data more consistently and better handle concurrent data updates.

    Compared to SQL databases or more modern NoSQL databases like CouchDB, Domino has always had poor data consistency. Although transaction support has existed in the Domino API since Domino 12, these transactions can only be used to modify multiple database documents atomically. They do not cover reading data from views.

    For example, if you want to read all view rows matching a lookup key using the publicly available C API methods, you would open the view (NIFOpenCollection), find the position of the first row matching the lookup key and determine the number of matches (NIFFindByKey, which returns a position like "1.2.3"), and start reading row data at that position (using one or more NIFReadEntries calls, depending on how much row data fits into the 64K buffer).

    These are three separate C API calls. The view index that you are working on is NOT your private copy. So, if other users change data at the same time and the view indexer kicks in, things become challenging.

    To be fair, you receive signals from NIFReadEntries (e.g., SIGNAL_ANY_CONFLICT) indicating that the view index or view design has changed in the meantime. This means that your start position "1.2.3" might be completely wrong.

    If you know the note ID that your code read last in addition to its position, NIFLocateNote might be handy. It tries to find the new position of this row "nearby." However, it's possible that this exact document no longer matches the lookup key and has been moved somewhere else in the view.

    Instead, you could re-run your lookup repeatedly until none of the NIFReadEntries calls signal a view index change.
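
    The re-run approach can be sketched as a simple retry loop. This is a minimal sketch, not real C API code: LookupAttempt is a hypothetical stand-in for "find the key, then read all matching rows", and returning null is just this sketch's way of modelling a change signal from one of the reads. The retry limit is also an assumption, added so the loop cannot spin forever on a busy view.

```java
// Hypothetical stand-in for one complete lookup (find key + read rows).
// A real implementation would wrap the NIF calls described above.
interface LookupAttempt {
    // Returns the note IDs read, or null if a read signaled that the
    // view index changed while the lookup was in progress.
    int[] run();
}

class RetryingLookup {
    // Re-runs the lookup until one attempt completes without a change
    // signal, giving up after maxAttempts tries.
    static int[] readConsistently(LookupAttempt attempt, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            int[] noteIds = attempt.run();
            if (noteIds != null) {
                return noteIds;
            }
        }
        throw new IllegalStateException(
                "View index kept changing after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Simulated view: the first two attempts are disturbed by an index update.
        int[] calls = {0};
        LookupAttempt simulated = () -> ++calls[0] < 3 ? null : new int[] {0x8FA, 0x8FE};
        int[] result = readConsistently(simulated, 10);
        System.out.println(result.length + " rows after " + calls[0] + " attempts");
        // prints "2 rows after 3 attempts"
    }
}
```

    On a view that is updated constantly, such a loop may never see a quiet moment, so it is a workaround rather than a guarantee.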

    If only there were a way to prevent view index updates while lookups were running!


    NIFFindByKeyExtended2 to the rescue

    With Domino 12, the C API exposed NIFFindByKeyExtended2, a method that had been added to Domino years earlier. NIFFindByKeyExtended2 combines NIFFindByKey and the first NIFReadEntries call in an atomic manner. This means that the view index is locked while both steps run, which ensures data consistency, at least as long as all rows for the lookup key fit into one buffer: up to about 16,000 rows if you only read note IDs, or significantly fewer if you are also reading the column values (e.g., in one of our applications, only 70 rows filled the entire 64K buffer).


    I am not a C developer! Do I have to care?

    Life as a LotusScript or Java developer on Domino is easier than for C developers.

    You just call View.getAllDocumentsByKey(...) or View.getDocumentByKey(...), and the black magic happens under the covers. These methods have been using NIFFindByKeyExtended2 (or the undocumented NIFFindByKeyExtended3, which returns the result in a callback) internally for many years, I think since R9.

    There is a small chance that a document in the returned DocumentCollection has changed in the meantime and no longer matches the lookup key that produced the DocumentCollection. As a paranoid person, you could check this. But as soon as the document is loaded and concurrent updates to it happen, you get a save error and can decide in your code if you want to reload the document or create a save conflict.
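
    Such a paranoid check could look roughly like this. It is a sketch under assumptions: KeyReader and LookupRecheck are hypothetical names, not Domino classes, KeyReader stands in for "load the document and read the item that feeds the sorted view column", and the exact string comparison is a simplification of how a view matches keys.

```java
// Hypothetical abstraction: returns the current key value of a document.
interface KeyReader {
    String currentKey(int noteId);
}

class LookupRecheck {
    // Keeps only the note IDs whose current key still equals lookupKey.
    static int[] stillMatching(int[] noteIds, String lookupKey, KeyReader reader) {
        int[] kept = new int[noteIds.length];
        int count = 0;
        for (int noteId : noteIds) {
            if (lookupKey.equals(reader.currentKey(noteId))) {
                kept[count++] = noteId;
            }
        }
        // Trim the result to the number of matches actually found.
        return java.util.Arrays.copyOf(kept, count);
    }

    public static void main(String[] args) {
        // Simulated data: document 0x9A6 was changed after the lookup ran.
        KeyReader reader = id -> id == 0x9A2 ? "Smith" : "Miller";
        int[] kept = stillMatching(new int[] {0x9A2, 0x9A6}, "Smith", reader);
        System.out.println(kept.length); // prints 1
    }
}
```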


    I use View.setAutoUpdate(false). Does this help to get consistent view data?

    In short: no.
    There is an old article in the Domino development wiki (https://ds-infolib.hcltechsw.com/ldd/ddwiki.nsf/dx/View.AutoUpdate_?OpenDocument&sa=true) which explains the AutoUpdate flag.

    It just changes how the API recovers from view index changes:

    AutoUpdate=true means "use NIFLocateNote and go on reading at the new position".
    AutoUpdate=false means "don't care, just go on reading at the current position".

    The same applies for LotusScript/Java as for C - you don't get a private copy of the view index.
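
    To illustrate the difference between the two strategies, here is a toy model. It is not Domino code: the "view index" is a plain array of note IDs in sort order, CursorRecovery and its methods are invented for this example, and -1 is this sketch's marker for "no next row". The real API has to deal with categories, deletions and design changes on top of this.

```java
// Toy model: a reader has just consumed one row and wants the next one
// after the shared index was modified by somebody else.
class CursorRecovery {
    // AutoUpdate=false in this model: keep the numeric position and
    // read the row after it, whatever is stored there now.
    static int nextByPosition(int[] index, int lastPosition) {
        int next = lastPosition + 1;
        return next < index.length ? index[next] : -1;
    }

    // AutoUpdate=true in this model: find where the last note ID sits now
    // (the role NIFLocateNote plays) and continue from there.
    static int nextByRelocating(int[] index, int lastNoteId) {
        for (int i = 0; i < index.length; i++) {
            if (index[i] == lastNoteId) {
                return i + 1 < index.length ? index[i + 1] : -1;
            }
        }
        return -1; // the row vanished from the index, nothing to recover from
    }

    public static void main(String[] args) {
        // The reader consumed position 1 (note ID 20) of {10, 20, 30, 40};
        // then another user's document with note ID 15 was indexed.
        int[] updated = {10, 15, 20, 30, 40};
        System.out.println(nextByPosition(updated, 1));    // prints 20
        System.out.println(nextByRelocating(updated, 20)); // prints 30
    }
}
```

    In this model, the position-based reader returns note ID 20 a second time, while the relocating reader continues correctly but has nothing to hold on to once its last row disappears from the index.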

    So things can change at any time, and the changes sometimes surface when using the LotusScript/Java API:

    NotesException: Notes error: Entry not found in index (Articles\Latest Issue)

            at lotus.domino.local.ViewNavigator.NgotoEntry(Native Method)
            at lotus.domino.local.ViewNavigator.gotoX(Unknown Source)
            at lotus.domino.local.ViewNavigator.gotoChild(Unknown Source)
            at lotus.domino.local.ViewNavigator.getChild(Unknown Source)

    NotesException: Notes error: Entry not found in index (Articles\Quota Computation)
            at lotus.domino.local.ViewNavigator.NgotoEntry(Native Method)
            at lotus.domino.local.ViewNavigator.gotoX(Unknown Source)
            at lotus.domino.local.ViewNavigator.gotoNextSibling(Unknown Source)
            at lotus.domino.local.ViewNavigator.getNextSibling(Unknown Source)

    Here, the ViewNavigator is trying to change the cursor position and notices that the view index has changed in a way that it cannot recover from.

    These are actual exceptions from production apps.

    By using setBufferMaxEntries() / setCacheGuidance(), these errors might happen less often, because the next child or sibling entries are already precached in the buffer.

    But the ViewNavigator precaching only works with AutoUpdate=false, and the cache is invalidated when you change the navigation direction. When the ViewNavigator then runs out of cache entries, it does not care about view index changes and consistency.

    Hooray.


    What about DQL? Does that help?

    In short: yes.

    When using DQL, it is HCL's responsibility to ensure data consistency during view lookups, FT searches, or NSF scans with a search formula. The DQL engine returns matches as an IDTable for C developers (compressed storage of note IDs), which the LotusScript/Java API translates into a DocumentCollection. This translation is slower because it unnecessarily loads each document even if you only need the note IDs.

    If you believe you have found an example of data inconsistency, create a reproducible test case and submit a support ticket.

    Cliffhanger :-)

    In the next article, I will discuss the new QueryResultsProcessor in Domino 12, covering its capabilities and limitations. I will also explain how I rebuilt cross-database Domino view indexing in Domino JNA from scratch, avoiding these limitations.

    So stay tuned.