Provided by: libmarpa-r2-perl_12.000000-1_amd64 bug

NAME

       Marpa::R2::Semantics::Order - How the SLIF ranks ambiguous parses

Description

       Marpa allows ambiguous parses.  While an unambiguous parse can produce at most one parse tree and one
       parse result, an ambiguous parse will produce a parse series.  A parse series is a sequence of parse
       trees, each of which will have its own parse result.

       This document describes ways of controlling the order in which the SLIF recognizer's value() method
       evaluates the parse trees of an ambiguous parse.  It also describes ways to exclude selected parse trees
       from the parse series.

   Default parse order
       By calling the recognizer's value() method repeatedly, Marpa can produce all the parse results in the
       current parse series.  The default is for the parse results to be returned in an arbitrary parse order.
       This corresponds to the ""none"" value of the recognizer's "ranking_method" named argument.

       Traversal of the parse trees in arbitrary parse order will be always be well-behaved in the sense that no
       two parse trees will be semantic duplicates, and no unique (semantic non-duplicate) parse tree will be
       omitted in it.  No other property of arbitrary parse order is guaranteed.  For example, the order may
       change each time the parse series is traversed.

   Choicepoints
       When ranking, the logic traverses each node of the parse bocage.  In this context, the nodes are also
       called "choicepoints".  From the point of view of the individual parse trees, the traversal will be top-
       down and left-to-right.

       Each choicepoint has one or more "choices".  Often a choicepoint has only a single choice, in which case
       the choicepoint is called "trivial".  For two rule instances to be choices of the same choicepoint, they
       must end at the same location, and their rules must have the same LHS.

       The terms "choicepoint" and "choice" are defined carefully in a separate document.  That document also
       gives several examples of ranking, which are explained in detail.

   Ranking methods
       SLIF recognizer objects have a "ranking_method" named argument, whose value can be the name of a ranking
       method, or ""none"", indicating that the default ranking method is to be used.

   The "rule" ranking method
       The rule method ranks alternative parses according to their rule alternatives.  Every rule alternative
       has a numeric rank.  A rule's rank can be specified using the the "rank" adverb argument for that RHS
       alternative.  Rule ranks must be integers.  They may be negative.  If no numeric rank is specified, the
       numeric rank is 0.

   The "high_rule_only" ranking method
       The "high_rule_only" ranking method is similar to the "rule" ranking method, except that, at every
       choicepoint, it discards all of the choices which have a rank lower than that of the highest ranked
       choice.

       The "high_rule_only" ranking method can reduce the ambiguity of a parse, but it does not necessarily do
       so.  This is because, at each choicepoint among the parse trees, it is possible that several of the
       choices, or all of them, will have the same rank as the highest ranked choice.

   Rule ranking
       At each choicepoint, the choices are ranked as follows:

       •   Different numeric ranks:

           If  the  two  parse  choices  have  different  numeric  ranks,  they  must  also  have different rule
           alternatives.  The parse choice whose rule alternative has the higher numeric rank will rank high.

       •   Same rule alternative:

           If the two parse choices have the same rule alternative, they rank as described under  "Null  variant
           ranking".

       •   Same numeric rank, different rule alternatives:

           Two  different  rule  alternatives  can have the same numeric rank.  If the two parse choices are for
           rule alternatives that are different, but that have the same numeric rank, the relative order of  the
           two parse choices is arbitrary.

       Rule  alternatives  may  be part of a single rule in the DSL -- for example, a prioritized rule.  Lexical
       order within a DSL rule makes no difference when ranking rule alternatives.  For  example,  it  makes  no
       difference  if  two  rule  alternatives  come  from  the  same  prioritized  rule;  or from two different
       prioritized rules.

   Null variant ranking
       Some rules have a RHS which contains proper nullables: symbols which may be nulled,  but  which  are  not
       nulling symbols.  (Nulling symbols are symbols which are always nulled.)

       When  a rule alternative contains proper nullables, each instance of that rule creates a nulling variant.
       A nulling variant is a specific pattern of null and non-null symbols in a rule instance's RHS.   In  many
       cases, this creates an ambiguity -- different nulling variants can match the same substring in the input.
       In  ambiguous  parsings of this kind, some applications may want to rank nulling variants that start with
       non-null symbols higher.  Other applications may want to do the opposite -- to rank nulling variants that
       start with null symbols higher.

       The "null-ranking" adverb for RHS alternatives specifies which nulling variants are ranked high  or  low.
       If the "null-ranking" is ""low"", then the closer a nulling variant places its visible (non-null) symbols
       to  the start of the rule instance, the higher it ranks.  A null ranking of "low" is the default.  If the
       "null-ranking" is ""high"", then the closer a nulling variant places its null symbols to the start of the
       rule instance, the higher it ranks.  In ranking nulling variants with  more  than  one  proper  nullable,
       major-to-minor is left-to-right.

   A general approach to sorting parses
       The  most  general  way to sort Marpa parses is for the application to take control.  The application can
       set up the Marpa semantic actions so that the parse result of every parse tree is a "<rank,  true_value>"
       duple.   The duples can then be sorted by "rank".  Once the results are sorted, the "rank" element of the
       duple can be discarded.  (Those familiar with the Schwartzian transform may note a resemblance.  In Perl,
       duples can be implemented as references to arrays of 2 elements.)

       The user needs to be careful.  In theory, ambiguity can cause an exponential explosion in the  number  of
       results.   In  practice,  ambiguity  tends to get out of hand very easily.  Producing and sorting all the
       parses can take a very long time.

Details

       This section contains additional explanations, not essential to understanding the rest of this  document.
       Often  they  are  formal  or  mathematical.   While  some  people  find  these  helpful, others find them
       distracting, which is why they are segregated here.

   Duplicate parses
       When evaluating the parse trees in a parse series, Marpa never evaluates the same parse tree twice.  What
       this means probably matches the programmer's intuition of what it should mean.  Marpa considers two parse
       trees to be the same if they are semantic equivalents.

       Two parse trees are semantic equivalents if and only if a recursive, top-down evaluation of each  applies
       the  same  rules  in the same order at the same G1 locations.  If the semantics are deterministic, and if
       two parse trees are semantic equivalents according to this definition, the two parse  trees  will  always
       produce the same parse result.

       The  two  parse  trees are called semantic equivalents, because from the point of view of a deterministic
       semantics they are indistinguishable.  When the Marpa documentation refers to  duplicate  parses,  unless
       otherwise stated, it means that the two are semantic equivalents.

       Formally,  semantic  equivalence  is  defined  as  follows:  Call  the set of parse trees, "T".  Semantic
       equivalence is an equivalence relation on "T".  Call this relation "~".  Call "E", the  quotient  set  of
       "T"  by  "~".  In this document, the term arbitrary parse order is used to mean an arbitrary choice among
       the relations which are strict total orders of "E".

Copyright and License

         Copyright 2022 Jeffrey Kegler
         This file is part of Marpa::R2.  Marpa::R2 is free software: you can
         redistribute it and/or modify it under the terms of the GNU Lesser
         General Public License as published by the Free Software Foundation,
         either version 3 of the License, or (at your option) any later version.

         Marpa::R2 is distributed in the hope that it will be useful,
         but WITHOUT ANY WARRANTY; without even the implied warranty of
         MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
         Lesser General Public License for more details.

         You should have received a copy of the GNU Lesser
         General Public License along with Marpa::R2.  If not, see
         http://www.gnu.org/licenses/.

perl v5.40.1                                       2025-06-25                   Marpa::R2::Semantics::Order(3pm)