Provided by: libmarpa-r2-perl_12.000000-1_amd64 bug

Name

       Marpa::R2::Glade - Low-level interface to Marpa's Abstract Syntax Forests (ASF's)

Synopsis

         my $grammar = Marpa::R2::Scanless::G->new(
             {   source => \(<<'END_OF_SOURCE'),
         :start ::= pair
         pair ::= duple | item item
         duple ::= item item
         item ::= Hesperus | Phosphorus
         Hesperus ::= 'a'
         Phosphorus ::= 'a'
         END_OF_SOURCE
             }
         );

         my $slr = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
         $slr->read( \'aa' );
         my $asf = Marpa::R2::ASF->new( { slr => $slr } );
         die 'No ASF' if not defined $asf;
         my $output_as_array = asf_to_basic_tree($asf);
         my $actual_output   = array_display($output_as_array);

       The code for asf_to_basic_tree() represents a user-supplied call using the interface described below.  An
       full example of asf_to_basic_tree(), which constructs a Perl array "tree", is given below.
       array_display() displays the tree in a compact form.  The code for it is also given below.  The return
       value of array_display() is as follows:

           Glade 2 has 2 symches
             Glade 2, Symch 0, pair ::= duple
                 Glade 6, duple ::= item item
                     Glade 8 has 2 symches
                       Glade 8, Symch 0, item ::= Hesperus
                           Glade 13, Hesperus ::= 'a'
                               Glade 15, Symbol 'a': "a"
                       Glade 8, Symch 1, item ::= Phosphorus
                           Glade 1, Phosphorus ::= 'a'
                               Glade 17, Symbol 'a': "a"
                     Glade 7 has 2 symches
                       Glade 7, Symch 0, item ::= Hesperus
                           Glade 22, Hesperus ::= 'a'
                               Glade 24, Symbol 'a': "a"
                       Glade 7, Symch 1, item ::= Phosphorus
                           Glade 9, Phosphorus ::= 'a'
                               Glade 26, Symbol 'a': "a"
             Glade 2, Symch 1, pair ::= item item
                 Glade 8 revisited
                 Glade 7 revisited

About this document

       This document describes the low-level interface to Marpa's abstract syntax forests (ASF's).  It assumes
       that you are already familiar with the high-level interface.  This low-level interface allows the maximum
       flexiblity in building the forest, but requires the application to do much of the work.

Ambiguity: factoring versus symches

       An abstract syntax forest (ASF) is similar to an abstract syntax tree (AST), but it has an additional
       ability -- it can represent an ambiguous parse.  Ambiguity in a parse can come in two forms, and Marpa's
       ASF's treat the distinction as important.  An ambiguity can be a symbolic choice (a symch), or a
       factoring.  Symbolic choices are the kind of ambiguity that springs first to mind -- a choice between
       rules, or a choice between a rule and token.  Factorings involve only one rule, but the RHS symbols of
       that rule divide the input up ("factor it") in different ways.  I'll give examples below.

       Symches and factorings are treated separately, because they behave very differently:

       •   Symches are less common than factorings.

       •   Factorings are frequently not of interest; symches are almost always of major interest.

       •   Symches usually have just a few alternatives; the possible number of factorings easily grows into the
           thousands.

       •   In  the  worst case, the number of symches is a constant that depends on size of the grammar.  In the
           worst case, the number of factorings  grows  exponentially  with  the  length  of  the  string  being
           factored.

       •   The  constant limiting the number of symches will almost always be of manageable size.  The number of
           factorings can grow without limit.

   An example of a symch
       Hesperus is Venus's traditional name as an evening star, and Phosphorus (aka Lucifer) is its  traditional
       name as a morning star.  For the grammar,

           :start ::= planet
           planet ::= hesperus
           planet ::= phosphorus
           hesperus ::= venus
           phosphorus ::= venus
           venus ~ 'venus'

       and the input string '"venus"', the forest would look like

           Symbol #0 planet has 2 symches
             Symch #0.0
             GL2 Rule 0: planet ::= hesperus
               GL3 Rule 2: hesperus ::= venus
                 GL4 Symbol venus: "venus"
             Symch #0.1
             GL2 Rule 1: planet ::= phosphorus
               GL5 Rule 3: phosphorus ::= venus
                 GL6 Symbol venus: "venus"

       Notice  the  tags  of the form ""GLn"", where n is an integer.  These identify the glade.  Glades will be
       described in detail below.

       The rules allow the string '"venus"' to  be  parsed  as  either  one  of  two  planets:  '"hesperus"'  or
       '"phosphorus"',  depending  on whether rule 0 or rule 1 is used.  The choice, at glade 2, between rules 0
       and 1, is a symch.

   An example of a factoring
       For the grammar,

           :start ::= top
           top ::= b b
           b ::= a a
           b ::= a
           a ~ 'a'

       and the input '"aaa"', a successful parse will always have two "b"'s.  Of these two "b"'s one will always
       be short, deriving a string of length 1: '"a"'.  The other will always be  long,  deriving  a  string  of
       length  2:  '"aa"'.   But  they  can be in either order, which means that the two "b"'s can divide up the
       input stream in two different ways: long string first; or short string first.

       These two different ways of dividing the input stream using the rule

           top ::= b b

       are called a factoring.  Here's Marpa's dump of the forest:

           GL2 Rule 0: top ::= b b
             Factoring #0
               GL3 Rule 2: b ::= a
                 GL4 Symbol a: "a"
               GL5 Rule 1: b ::= a a
                 GL6 Symbol a: "a"
                 GL7 Symbol a: "a"
             Factoring #1
               GL8 Rule 1: b ::= a a
                 GL9 Symbol a: "a"
                 GL10 Symbol a: "a"
               GL11 Rule 2: b ::= a
                 GL12 Symbol a: "a"

The structure of a forest

       The terminology I use for ASF's pictures an ASF as a forest on a  mountain.   This  mountain  forest  has
       glades,  and  there  are  paths between the glades.  The term "glade" comes from the idea of a glade as a
       distinct place in a forest that is open to light.

       The paths between glades have a direction -- they are always thought of as running one-way: downhill.  If
       a path connects two glades, the one uphill is called  an  upglade  and  the  one  downhill  is  called  a
       downglade.

       There is a glade at the top of mountain called the "peak".  The peak has no upglades.

The glade hierarchy

       The  glade  hierarchy  is directed graph whose nodes are glades.  Internally, glade nodes have an 3-level
       internal structure made up of

       •   one or more symches, which contain

       •   zero or more factorings, which contain

       •   one or more downglades.

       When discussing symches and factorings inside the internal structure of a glade G, G will be  called  the
       containing glade.

       If  G  is  a glade, the downglades in the internal structure will be the child nodes of G in the directed
       graph of glades.  There may not be any downglades, in which case G will be a  terminal  in  the  directed
       graph of glades.

   Glades
       Each  glade  node represents an instance of a symbol in one of the possible parse trees.  This means that
       each glade has a symbol (called the "glade symbol"), and an "input span".  An  input  span  is  an  input
       start  location,  and  a length in characters.  Because it has a start location and a length, a span also
       specifies an end location in the input.

   Symches
       Every glade contains one or more symches.  If a glade has only one  symch,  that  symch  is  said  to  be
       unique.   A  symch  is  either a token symch or a rule symch.  For a token symch, the glade symbol is the
       token symbol.  For a rule symch, the glade symbol is the LHS of the rule.

   Token symches
       At most one of the symches in a glade can be a token symch.  Token symches  have  no  children.   In  the
       internal  structure  of  a glade, they are terminals.  If a glade has a unique symch, and that symch is a
       token symch, that glade will be a terminal in the glade hierarchy.

   Rule symches
       There can be many rule symches in a glade -- one for every rule with the glade symbol on its LHS.  As the
       name suggests, every rule symch has a rule associated with it.  In every rule symch, the LHS of its  rule
       will always be the same as the glade symbol of the glade that contains the rule symch.

   Factorings
       Every  rule  symch  has one or more factorings, or factors, as children.  A rule symch is said to contain
       its child factorings.  Each factoring is the RHS of the parent symch's rule, mapped to a unique set of G1
       locations.  The term "factoring" is traditional in  the  literature  and  is  meant  to  suggest  that  a
       factoring is a way of "dividing up" the input span of the containing glade among its RHS symbols.

       If  a  rule  symch  has  only  one factoring, that factoring is said to be unique.  Because the number of
       factorings can get out of hand, factorings may be omitted.  A symch which omits factorings is said to  be
       truncated.  By default, every symch is truncated down to its first 42 factorings.

   Downglades
       Every  factoring  has one or more downglades as children.  The number of downglades will be L, where L is
       the length of the RHS of the parent symch's rule.  (The parent symch will always be a rule symch.)   Each
       of  L  RHS  symbols will be located at a specific G1 span, and will therefore be a symbol instance.  This
       symbol instance will be the symbol instance of a glade, and that glade will be the downglade.

       Within a glade, symches, factorings and downglades are  tracked  using  zero-based  indexes.   Since  the
       internal  structure  of a glade has a 3-level structure, within a given containing glade G, any downglade
       can be uniquely identified as "($symch_ix, $factor_ix, $downglade_ix)", where $symch_ix is  a  zero-based
       symch  index,  $factor_ix  is  a  zero-based factoring index, and $downglade_ix is a zero-based downglade
       index.

The glade ID

       Each glade has a glade ID.  This can be relied on to be a non-negative integer.  A glade ID may be  zero.
       Glade ID's are obtained from the "peak()" and "factoring_downglades()" methods.

Techniques for traversing ASF's

   Memoization
       When  traversing  a  forest, you should take steps to avoid traversing the same glades twice.  You can do
       this by memoizing the result of each glade, perhaps using its glade ID to index an array.  When  a  glade
       is  visited, the array can be checked to see if its result has been memoized.  If so, the memoized result
       should be used.

       This memoization eliminates the need to revisit the downglades of an already visited glade.  It does  not
       eliminate  multiple  visits to a glade, but it does eliminate retraversal of the glades downhill from it.
       In practice, the improvement in speed can be stunning.  It  will  often  be  the  difference  between  an
       program  which  is  unuseably  slow  even for very small inputs, and one which is extremely fast even for
       large inputs.

       Repeated subtraversals happen when two glades share the same downglades, something that occurs frequently
       in ASF's.  Additionally, some day the SLIF may allow cycles.   Memoization  will  prevent  a  cycle  from
       causing an infinite loop.

       The  example  in  this  POD  includes  a  memoization  scheme which is very simple, but adequate for most
       purposes.  The main logic of its memoization is shown here.

               my ( $asf, $glade, $seen ) = @_;
               return bless ["Glade $glade revisited"], 'My_Revisit'
                   if $seen->[$glade];
               $seen->[$glade] = 1;

       Putting memoization in one of the very first drafts of your code will save you time and trouble.

Forest method

   peak()
           my $peak = $asf->peak();

       Returns the glade ID of the peak.  This may be zero.  All failures are thrown as exceptions.

Glade methods

   glade_literal()
               my $literal = $asf->glade_literal($glade);

       Returns the literal substring of the input associated with the glade.  Every glade is associated  with  a
       span -- a start location in the input, and a length.  On failure, throws an exception.

       The  literal  is  determined  by  the  range.  This works as expected if your application reads the input
       characters one-by-one, in order and without gaps except for characters that are normally discarded,  such
       as whitespace.  (We will call applications which read in this fashion, monotonic.)  Most applications are
       monotonic,  and yours is, unless you've taken special pains to make it otherwise.  Computation of literal
       substrings  for   non-monotonic   applications   is   addressed   in   "Literals   and   G1   spans"   in
       Marpa::R2::Scanless::R.

   glade_span()
           my ( $glade_start, $glade_length ) = $asf->glade_span($glade_id);

       Returns  the  span  of  the  input associated with the glade.  Every glade is associated with a span -- a
       start location in the input, and a length.  On failure, throws an exception.

       The span will be as expected if your application reads the input characters  one-by-one  in  order.   (We
       will  call  applications  which  read  in this fashion, monotonic.)  Most applications are monotonic, and
       yours is, unless you've taken special pains to make it otherwise.  Computation of literal substrings  for
       non-monotonic applications is addressed in "Literals and G1 spans" in Marpa::R2::Scanless::R.

   glade_symch_count()
           my $symch_count = $asf->glade_symch_count($glade);

       Requires a glade ID as its only argument.  Returns the number of symches contained in the glade specified
       by the argument.  On failure, throws an exception.

   glade_symbol_id()
           my $symbol_id    = $asf->glade_symbol_id($glade);
           my $display_form = $grammar->symbol_display_form($symbol_id);

       Requires  a  glade  ID  as  its only argument.  Returns the symbol ID of the "glade symbol" for the glade
       specified by the argument.  On failure, throws an exception.

Symch methods

   symch_rule_id()
           my $rule_id = $asf->symch_rule_id( $glade, $symch_ix );

       Requires two arguments: a glade ID and a zero-based symch index.  These specify a symch.   If  the  symch
       specified is a rule symch, returns the rule ID.  If it is a token symch, returns -1.

       Returns  a Perl undef, if the glade exists, but the symch index is too high.  On other failure, throws an
       exception.

   symch_is_truncated()
       [ To be written. ]

   symch_factoring_count()
           my $factoring_count =
               $asf->symch_factoring_count( $glade, $symch_ix );

       Requires two arguments: a glade ID and a zero-based symch index.  These specify  a  symch.   Returns  the
       count  of  factorings  if the specified symch is a rule symch.  This count will always be one or greater.
       Returns zero if the specified symch is a token symch.

       Returns a Perl undef, if the glade exists, but the symch index is too high.  On other failure, throws  an
       exception.

Factoring methods

   factoring_downglades()
           my $downglades =
               $asf->factoring_downglades( $glade, $symch_ix,
               $factor_ix );

       Requires  three  arguments:  a  glade  ID,  the zero-based index of a symch and the zero-based index of a
       factoring.  These specify a factoring.  On success, returns a reference to an array.  The array  contains
       the glade IDs of the downglades in the factoring specified.

       Returns  a  Perl  undef,  if  the  glade  and symch exist, but the factoring index is too high.  On other
       failure, throws an exception.  In particular, exceptions are thrown if the symch is for a token;  and  if
       the glade exists, but the symch index is too high.

Methods for reporting ambiguity

           if ( $recce->ambiguity_metric() > 1 ) {
               my $asf = Marpa::R2::ASF->new( { slr => $recce } );
               die 'No ASF' if not defined $asf;
               my $ambiguities = Marpa::R2::Internal::ASF::ambiguities($asf);

               # Only report the first two
               my @ambiguities = grep {defined} @{$ambiguities}[ 0 .. 1 ];

               $actual_value = 'Application grammar is ambiguous';
               $actual_result =
                   Marpa::R2::Internal::ASF::ambiguities_show( $asf, \@ambiguities );
               last PROCESSING;
           } ## end if ( $recce->ambiguity_metric() > 1 )

   ambiguities()
           my $ambiguities = Marpa::R2::Internal::ASF::ambiguities($asf);

       Returns  a reference to an array of ambiguity reports in the ASF.  The first and only argument must be an
       ASF object.  The array returned will be be zero length if the parse was not ambiguous.  Ambiguity reports
       are as described below.

       While the ambiguities() method can be called to determine whether or not ambiguities  exist,  it  is  the
       more  expensive  way to do it.  The $slr->ambiguity_metric() method tests an already-existing boolean and
       is therefore extremely fast.  If you are simply testing for ambiguity, use the ambiguity_metric()  method
       instead.   If  you  can  save  time  when  you know that a parse is unambiguous, you may want to test for
       ambiguity with the ambiguity_metric() method before calling the ambiguities() method.

   ambiguities_show()
         $actual_result =
           Marpa::R2::Internal::ASF::ambiguities_show( $asf, \@ambiguities );

       Returns a string which contains a description of the ambiguities in its arguments.  Takes two  arguments,
       both  required.   The  first  is an ASF, and the second is a reference to an array of ambiguities, in the
       format returned by the ambiguities() method.

       Major applications will often have their own customized  ambiguity  formatting  routine,  one  which  can
       formulate  error  messages based, not just on the names of the rules and symbols, but on knowledge of the
       role that the rules and symbols play in the application.  This method is intended for applications  which
       do  not  have  their own customized ambiguity handling.  For those which do, it can be used as a fallback
       for handling those reports that the customized method does not recognize or  that  do  not  need  special
       handling.   The  format  of  the returned string and is subject to change, and is intended for reading by
       humans only.

Ambiguity reports

       The ambiguity reports returned by the ambiguities() method are of two kinds: symch reports and  factoring
       reports.   Only  ambiguities  uppermost  on  a  path  are reported -- in other words, an ambiguity is not
       reported if it is downhill and does not have a higher altitude.  (Because of cycles, it is possible for a
       downhill glade to be at a higher altitude.)  Within glade, all of the factorings are considered  downhill
       from,  and  of  equal altitude to the symches, so that if a glade has a symch ambiguity, there will be no
       factoring ambiguity reports for that glade.

       Typically, when there is more than one kind of ambiguity in an input span, only one is of real  interest,
       and  symch ambiguities are usually of more interest than factorings.  And if one ambiguity is uphill from
       another, the downhill ambiguity is usually a side effect of the uphill one and of little interest.

   Symch reports
       A symch report is issued whenever, in a top-down traversal of the ASF, a glade is encountered  which  has
       more than one symch.  A symch report takes the form

          [ 'symch', $glade ]

       where $glade is the ID of the glade with the symch ambiguity.  With this and the accessor methods in this
       document, an application can report full details of the symch ambiguity.

   Factoring reports
       A  factoring  report  identifies  a sequence of RHS symbols which has more than one factoring.  Factoring
       reports identify not just a rule, but specific subsequences within the RHS which are multiply factored --
       multifactored stretches.  Marpa goes down to the symbol level, because a Marpa  RHS  can  be  very  long.
       Marpa's  sequence  rules,  especially,  can  have  long stretches where the symbols are in sync with each
       other, broken by other stretches where they are out of sync.  (A  detailed  definition  of  multifactored
       stretches is below.)

       A factoring report takes the form

           [ 'factoring', $glade, $symch_ix, $rhs_ix1, $factor_ix2, $rhs_ix2 ];

       where $glade is the ID of the glade with the factoring ambiguity, and $symch_ix is the index of the symch
       involved.   The  multifactored stretch is described by two "identifying downglades".  Both downglades are
       at the beginning of the stretch, and both will have the  same  input  start  location.   The  identifying
       downglades will differ in length.

       The first of the two identifying factors has a factoring index of 0, and its downglade index is $rhs_ix1.
       The second identifying factor has a factoring index of $factor_ix2, and its downglade index is $rhs_ix2.

       The  identifying downglades will usually be enough for error reporting, which is the usual application of
       these reports.  A multifactored stretch can be extremely large.  Full details  of  it  can  be  found  by
       following  up  on  the  information  in the factoring report using the accessor methods described in this
       document.

The code for the synopsis

   The asf_to_basic_tree() code
         sub asf_to_basic_tree {
             my ( $asf, $glade ) = @_;
             my $peak = $asf->peak();
             return glade_to_basic_tree( $asf, $peak, [] );
         } ## end sub asf_to_basic_tree

         sub glade_to_basic_tree {
             my ( $asf, $glade, $seen ) = @_;
             return bless ["Glade $glade revisited"], 'My_Revisit'
                 if $seen->[$glade];
             $seen->[$glade] = 1;
             my $grammar     = $asf->grammar();
             my @symches     = ();
             my $symch_count = $asf->glade_symch_count($glade);
             SYMCH: for ( my $symch_ix = 0; $symch_ix < $symch_count; $symch_ix++ ) {
                 my $rule_id = $asf->symch_rule_id( $glade, $symch_ix );
                 if ( $rule_id < 0 ) {
                     my $literal      = $asf->glade_literal($glade);
                     my $symbol_id    = $asf->glade_symbol_id($glade);
                     my $display_form = $grammar->symbol_display_form($symbol_id);
                     push @symches,
                         bless [qq{Glade $glade, Symbol $display_form: "$literal"}],
                         'My_Token';
                     next SYMCH;
                 } ## end if ( $rule_id < 0 )

                 # ignore any truncation of the factorings
                 my $factoring_count =
                     $asf->symch_factoring_count( $glade, $symch_ix );
                 my @symch_description = ("Glade $glade");
                 push @symch_description, "Symch $symch_ix" if $symch_count > 1;
                 push @symch_description, $grammar->rule_show($rule_id);
                 my $symch_description = join q{, }, @symch_description;

                 my @factorings = ($symch_description);
                 for (
                     my $factor_ix = 0;
                     $factor_ix < $factoring_count;
                     $factor_ix++
                     )
                 {
                     my $downglades =
                         $asf->factoring_downglades( $glade, $symch_ix,
                         $factor_ix );
                     push @factorings,
                         bless [ map { glade_to_basic_tree( $asf, $_, $seen ) }
                             @{$downglades} ], 'My_Rule';
                 } ## end for ( my $factor_ix = 0; $factor_ix < $factoring_count...)
                 if ( $factoring_count > 1 ) {
                     push @symches,
                         bless [
                         "Glade $glade, symch $symch_ix has $factoring_count factorings",
                         @factorings
                         ],
                         'My_Factorings';
                     next SYMCH;
                 } ## end if ( $factoring_count > 1 )
                 push @symches, bless [ @factorings[ 0, 1 ] ], 'My_Factorings';
             } ## end SYMCH: for ( my $symch_ix = 0; $symch_ix < $symch_count; ...)
             return bless [ "Glade $glade has $symch_count symches", @symches ],
                 'My_Symches'
                 if $symch_count > 1;
             return $symches[0];
         } ## end sub glade_to_basic_tree

   The array_display() code
       Because of the blessings in this example, a standard dump of  the  output  array  is  too  cluttered  for
       comfortable  reading.   The following code displays the output from asf_to_basic_tree() in a more compact
       form.  Note that this code makes no use of Marpa, and works for all Perl  arrays.   It  is  included  for
       completeness, and as a simple example of array traversal.

           sub array_display {
               my ($array) = @_;
               my ( undef, @lines ) = @{ array_lines_display($array) };
               my $text = q{};
               for my $line (@lines) {
                   my ( $indent, $body ) = @{$line};
                   $indent -= 6;
                   $text .= ( q{ } x $indent ) . $body . "\n";
               }
               return $text;
           } ## end sub array_display

           sub array_lines_display {
               my ($array) = @_;
               my $reftype = Scalar::Util::reftype($array) // '!undef!';
               return [ [ 0, $array ] ] if $reftype ne 'ARRAY';
               my @lines = ();
               ELEMENT: for my $element ( @{$array} ) {
                   for my $line ( @{ array_lines_display($element) } ) {
                       my ( $indent, $body ) = @{$line};
                       push @lines, [ $indent + 2, $body ];
                   }
               } ## end ELEMENT: for my $element ( @{$array} )
               return \@lines;
           } ## end sub array_lines_display

Details

       This  section  contains  details  not  essential to understanding the main topic of this document.  These
       details include restatements of what is said above  in  more  formal  terms,  and  formal  statements  of
       concepts  that  have  been  left to the intuition.  Some readers find these details helpful, while others
       find them distracting.  Segregating these details here serves the needs of both.

   An alternative way of defining glade terminology
       Here's a way of defining some of the above terms which is  less  intuitive,  but  more  precise.   First,
       define  the  glade length from glades A to glade B in an ASF as the number of glades on the shortest path
       from A to B, not including glade A.  (Recall that paths are directional.)  If there is  no  path  between
       glades  A  and  B,  the  glade length is undefined.  Glade B is a downglade of glade A, and glade A is an
       upglade of glade B, if and only if the glade length from A to B is 1.

       A glade A is uphill with respect to glade B, and a glade B is downhill with respect to glade  A,  if  and
       only if the glade length from A to B is defined.

       A  peak  of  an ASF is a node without upglades.  By construction of the ASF, there is only one peak.  The
       distance-to-peak of a glade "A" is the glade length from the peak to glade "A".  Glade  "A"  is  said  to
       have  a  higher  altitude  than glade "B" if the distance-to-peak of glade "A" is less than that of glade
       "B".  Glade "A" has a lower altitude than glade "B" if the distance-to-peak of glade "A" is greater  than
       that  of glade "B".  Glade "A" has the same altitude as glade "B" if the distance-to-peak of glade "A" is
       equal to that of glade "B".

   Cycles
       In the current SLIF implementation, a forest is a directed acyclic graph  (DAG).   (In  the  mathematical
       literature  a  DAG  is  also  called  a  "tree",  but that use is confusing in the present context.)  The
       underlying Marpa algorithm allows parse trees with cycles, and someday the SLIF probably  will  as  well.
       When that happens, ASF's will no longer be "acyclic" and therefore will be directed graphs (DG's) but not
       DAG's.   This  document  talks  about  ASF's as if that day had already come -- it assumes that the ASF's
       might contain cycles.

       In an ASF that contains one or more cycles, the concepts of uphill and downhill become much  less  useful
       for  describing  the relative positions of glades.  For example, if glade A cycles back to itself through
       glade B, then

       •   Glade A will be uphill from glade B, and

       •   Glade B will be uphill from glade A; so that

       •   Glade B will be downhill from glade A, and

       •   Glade A will be downhill from glade B; and

       •   Glade A will be both downhill and uphill from itself; and

       •   Glade B will be both downhill and uphill from itself.

       ASF's will always be constructed so that the peak has no upglades.  Because of this, the peak  can  never
       be  part of a cycle.  This means that altitude will always be well defined in the sense that, for any two
       glades "A" and "B", one and only one of the following statements will be true:

       •   Glade "A" is lower in altitude than glade "B".

       •   Glade "A" is higher in altitude than glade "B".

       •   Glade "A" is equal in altitude to glade "b".

   Token symches
       In the current SLIF implementation, a symbol is always either a token or the LHS of a rule.   This  means
       that any glade that contains a token symch cannot contain any rule symches.  It also means that any glade
       that contains a rule symch will not contain a token symch.

       However, the underlying Marpa algorithm allows LHS terminals, and someday the SLIF probably will as well.
       This  document  is written as if that day has already come, and describes glades as if they could contain
       both rule symches and a token symch.

   Maximum symches per glade
       Above, the point is made that the number of symches in a glade,  even  in  the  worst  case,  is  a  very
       manageable  number.   For  a particular case, it is not hard to work out the exact maximum.  Here are the
       details.

       There can be at most one token symch.  There can be only rule symch for every  rule.   In  addition,  all
       rules  in a glade must have the glade symbol as their LHS.  Let the number of rules with the glade symbol
       on their LHS be "r".  The maximum number of symches in a glade is "r+1".

   Multifactored stretches
       Marpa locates factoring ambiguities, not just by rule, but by subsequences of symbols within a  RHS.   It
       finds  multifactored  stretches,  input spans where the sequence of symbols can have multiple factorings.
       Sequence rules can have a very long RHS.  If ambiguity reporting is to be precise,  it  is  necessary  to
       narrow down factoring ambiguities to the specific input spans where they occur.

       Marpa  tries  to break up each ambiguously factored RHS into as many multifactored stretches as possible.
       Nonetheless, there will be cases in which a multifactored stretch encompasses the entire RHS of a rule.

       The main body of this document worked with an intuitive "know one when I see one" idea  of  multifactored
       stretches.   In  this  section  we  will  give an exact definition.  First, we will need some preliminary
       definitions.

       Consider the case of a arbitrary, ambiguous rule symch.  The ambiguous rule symch can be seen as a set of
       factorings.  We need a way to identify individual downglades within each factoring of the set.  Within  a
       factoring,  there is a one-to-one correspondence between downglade and locations in the string of symbols
       that is the RHS.  Therefore downglades can be identified  uniquely  as  "glade_id,  symch_ix,  factor_ix,
       rhs_ix",  where  "glade_id"  is a unique identifier of the glade, "symch_ix" is a zero-based symch index,
       "factor_ix" is a zero-based factoring index, and "rhs_ix" is a  zero-based  index  identifying  a  symbol
       location in the RHS string.

       In  much  of  what follows, we will assume we are talking about a specific glade and rule symch, which is
       understood in context, so that we can take "factor_ix, rhs_ix" as uniquely identifying a  downglade.   We
       will call "factor_ix, rhs_ix" a downglade duple.

       Let  N  be  the number of factorings in the rule symch of interest.  Since we are dealing with a specific
       rule symch, we also have a specific rule that is of interest.  Call the length of the RHS of  that  rule,
       "rhs_length".  For every "factor_ix", the values of "rhs_ix" will be in the range "0 ... rhs_length-1".

       Each downglade "factor_ix, rhs_ix" corresponds one-to-one to a G1 span.  This means that there is a total
       function "G1_start(factor_ix, rhs_ix)" from a downglade to a G1 start location; and that there is a total
       function "G1_length(factor_ix, rhs_ix)" from a downglade to a G1 length.

       A symch location set is a set of N downglade duples such there is exactly one element "factor_ix, rhs_ix"
       in  the set for each "0 <= factor_ix < N".  In other words, a symch location set is a total function from
       the factor indexes to the RHS indexes.

       A symch alignment is a symch location set whose elements shares a common  G1  start  location.   A  symch
       alignment  is  synced if its elements share a common G1 length as well.  A symch alignment is unsynced if
       it is not synced.

       More formally, if we represent the symch location set function as "SLS", "SLS" is a  symch  alignment  if
       and  only  if  there  is  some  constant  G1  location pos such that, for all "factor_ix" such that "0 <=
       factor_ix < N", "G1_start(factor_ix, SLS(factor_ix)) = pos".  We say the "pos" is  the  location  of  the
       symch  alignment.   "SLS"  is a synced symch alignment if and only if it is a symch aligment and there is
       some constant number L such "G1_length(factor_ix, SLS(factor_ix)) = L".

       Of special interest is the initial symch location set.  The initial  symch  location  set  is  the  symch
       position  all  of  whose  RHS  indexes  are  0.   Equivalently,  it is the constant function "ISLS" where
       "ISLS(factor_ix)=0" for all "factor_ix" such that "0 <= factor_ix < N".  The initial symch  location  set
       is always a symch alignment, and is called the initial symch alignment.

       We are now in a position to define a multifactored stretch for a specific rule symch.  Call a G1 location
       an  anchor if it is the location of a synced symch alignment or, as a special case, if it is one past the
       last location of the G1 stream.  We say that a G1 span is anchored if its start and end G1 locations  are
       anchors.   A  multifactored  stretch is an anchored G1 span which contains only one anchored G1 location.
       (Recall that, by the definition of a span, the end location of a span is not contained in the span.)

       As a consequence of its definition, the start location of a  multifactored  stretch  will  always  be  an
       anchor,  and the only anchored location in the multifactored stretch will be its start location.  (Again,
       by definition, the end location of the stretch is not contained in the stretch.)

       Multifactored stretches are aligned and anchored in terms of G1 locations, and  not  in  terms  of  input
       stream  locations or RHS indexes.  It may be useful, however, to look at syncing and alignment from other
       points of view.  This is done in the next two sections.

       Alignment and input stream locations

       Because G1 spans map to input stream spans, symch location sets that are aligned or synced  in  G1  terms
       will  be  aligned  or  synced  from  the  input  stream  point of view as well.  But, if the input is not
       monotonic, the opposite is not necessarily the case.  Because several G1 locations  can  share  an  input
       stream  location, symch location sets that seem aligned or synced from the input stream point of view may
       not be considered aligned or synced from the G1 location point.

       Alignment and RHS indexes

       Symch location sets that are aligned from the G1 location point will typically not be  aligned  from  the
       point of the RHS indexes.  Non-aligned RHS indexes are particularly likely for long sequence rules, where
       the RHS is a long string containing many repetitions of a single symbol.

Copyright and License

         Copyright 2022 Jeffrey Kegler
         This file is part of Marpa::R2.  Marpa::R2 is free software: you can
         redistribute it and/or modify it under the terms of the GNU Lesser
         General Public License as published by the Free Software Foundation,
         either version 3 of the License, or (at your option) any later version.

         Marpa::R2 is distributed in the hope that it will be useful,
         but WITHOUT ANY WARRANTY; without even the implied warranty of
         MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
         Lesser General Public License for more details.

         You should have received a copy of the GNU Lesser
         General Public License along with Marpa::R2.  If not, see
         http://www.gnu.org/licenses/.

perl v5.40.1                                       2025-06-25                              Marpa::R2::Glade(3pm)