Cuneiform Signs

Analysis and reports to support an international standard for computer encoding of the Cuneiform writing system

Research on the development of Cuneiform signs

 

Productivity of Container-Infix Signs
The proportion of signs having the structure Container-with-Infixed-Sign is very high, and it grows as our attempts to achieve wide coverage lead us beyond the basic sign lists. We may expect more signs to be discovered, especially of this type. This argues that we should either encode these signs, reflecting their productivity, as the sequence of three components CONTAINER <infx> INFIXED, or else specify a default sort-order table which sorts the signs as if they were encoded this way (so as to accommodate newly added signs with minimal adjustment).

Counting all signs in the draft concordance on October 20th, including the Fara and ZATU lists, there were:

  530  Atoms
 1056  Container-Infixed signs and some behaving like them, overlaid parts etc.
   50  Sign-Over-Sign or Sign-Crossing-Sign structures
   11  Squared arrangement of four rotated copies of base sign
   73  Number signs (including punctuation signs)
   62  Ligatures (living ones like "fi", not fossil ones like "&";
               including number+base sign ligatures)
  120  Sequences of signs from one or another sign list
               (not warranting separate encoding)

 

One stage earlier, including signs from the PSL merged CDLI and Miguel Civil lists, the numbers were these:

  386  Atoms
  513  Container-Infixed signs and some behaving like them, overlaid parts etc.
   36  Sign-Over-Sign or Sign-Crossing-Sign structures
   11  Squared arrangement of four rotated copies of base sign
   65  Number signs (including punctuation signs)
   14  Ligatures (living ones like "fi", not fossil ones like "&";
               including number+base sign ligatures)
   93  Sequences of signs from one or another sign list
               (not warranting separate encoding)

A first count, including signs from Borger ABZ, von Soden, Labat, and Ellermeier:

  373  Atoms
  371  Container-Infixed signs and some behaving like them, overlaid parts etc.
   36  Sign-Over-Sign or Sign-Crossing-Sign structures
   11  Squared arrangement of four rotated copies of base sign

*
I do not have separate statistics for the results of adding Fara without ZATU; the order in which I created the concordance made those difficult to produce.

The classification was done using these criteria (among others which implement the instruction to "code signs, not readings"):

(a) that a sign is not decomposed into parts which do not have an independent existence (as urged explicitly or implicitly by several of us; this also serves to reinforce traditional names widely).

(b) that forms of the same sign across historical periods should be codable the same way, unless the sign is absent for a given period.

(c) that mere "text elements" like English "th" in "brother" are not given separate status.

Where the balance lies among these and other criteria will become more evident from the full set of data.

The increase in the (+) category as more older texts are covered is not significant; I simply had not marked them well when tallying the second time. The significant pattern is the very large increase in the sign types which are Container-contained, sign-overlapping-sign, etc., to a total of 1056. This category is highly productive, especially among earlier signs, and the largest number of as-yet-uncatalogued signs probably belongs to it. It was a productive process for users of the script.
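
The trend can be checked with a little arithmetic. The following is only a sketch in Python, not the tooling used to build the concordance. It assumes that the fairest comparison restricts each tally to the four structural categories reported in all three counts (atoms, Container-Infixed, Sign-Over-Sign or Sign-Crossing-Sign, squared arrangements), since the first count gives no figures for numbers, ligatures, or sequences. The dictionary keys are labels invented here for illustration.

```python
# Tallies copied from the three counts above, earliest first.
# Only the four categories present in every count are included
# (an assumption made so that the stages are comparable).
COUNTS = [
    ("first count", {"atoms": 373, "container_infixed": 371,
                     "over_or_crossing": 36, "squared": 11}),
    ("one stage earlier", {"atoms": 386, "container_infixed": 513,
                           "over_or_crossing": 36, "squared": 11}),
    ("October 20th", {"atoms": 530, "container_infixed": 1056,
                      "over_or_crossing": 50, "squared": 11}),
]


def container_share(tally):
    """Fraction of the tallied signs that are Container-Infixed."""
    return tally["container_infixed"] / sum(tally.values())


shares = {label: container_share(tally) for label, tally in COUNTS}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

On these figures the Container-Infixed share rises from roughly 47% to 54% to 64% as coverage widens, while the atoms grow from 373 to 530 and the Container-Infixed signs from 371 to 1056.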

Some method is needed to ensure that our encoding proposal covers the Container-Infixed type and its close relatives: either (1) encode each of these signs as a sequence of three code points using a special "infix" (etc.) code, here called INFX, or else (2) have a default sorting table decompose each of these signs into a sequence of three codes, so that signs added later as sequences sort interleaved where they belong, even if the default sorting table is not modified.

I think alternative (1) is the more stable and transparent of the two. In either case, whether via original encoding or via second-level default sorting table, an encoding which does this for the (x) signs is of the form LAGAB INFX HA.
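
The effect of either alternative on sorting can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the atom order in ATOM_ORDER, the precomposed labels such as "LAGABxHA", and the choice to rank INFX below every atom so that a bare container sorts before its infixed forms. It is not a proposed collation table. Under alternative (2) a precomposed sign is looked up in a decomposition table; under alternative (1) the sign already arrives as a sequence, which is how a later addition would be represented.

```python
INFX = "INFX"  # the special infix code named in the text

# Hypothetical default decomposition table (alternative 2):
# precomposed sign label -> sequence of three codes.
DECOMPOSITION = {
    "LAGABxHA": ("LAGAB", INFX, "HA"),
    "LAGABxA": ("LAGAB", INFX, "A"),
}

# Hypothetical order of atomic signs, for illustration only.
ATOM_ORDER = ["A", "HA", "LAGAB", "LU"]
RANK = {name: i + 1 for i, name in enumerate(ATOM_ORDER)}
RANK[INFX] = 0  # assumed: INFX ranks below every atom


def decompose(sign):
    """Return the component sequence for a sign.

    A tuple is taken to be a sign already coded as a sequence
    (alternative 1); a string is an atom or a precomposed label.
    """
    if isinstance(sign, tuple):
        return sign
    return DECOMPOSITION.get(sign, (sign,))


def sort_key(sign):
    """Sort by component ranks, so a container precedes its infixed forms."""
    return tuple(RANK[part] for part in decompose(sign))


# ("LAGAB", INFX, "LU") stands for a sign added later, coded only as a
# sequence and absent from DECOMPOSITION.
signs = ["LU", "LAGABxHA", "A", ("LAGAB", INFX, "LU"), "LAGAB", "LAGABxA"]
ordered = sorted(signs, key=sort_key)
print(ordered)
```

In this sketch the later-added sequence sorts after LAGABxHA and before LU without any change to the table, which is the interleaving behavior described for alternative (2).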

Since we must specify sorting order in any case for implementation, we should do it now for the hard examples, to see whether any surprises contradict our expectations.