Cuneiform Signs

Analysis and reports to support an international standard for computer encoding of the Cuneiform writing system

Research on the development of Cuneiform signs

 

Two approaches to encoding Cuneiform signs are here contrasted.

One (A), shown on the left, is based on Ken Whistler's approach. It treats as "characters" sometimes signs and sometimes parts of recognized signs that do not divide into parts in actual usage. The other (B), shown on the right, respects the 150-year tradition of assyriological scholarship. The two differ greatly in simplicity of implementation, in how dependent cuneiform users will be on others, and in how direct and simple usage will be.

 
The comparisons below are a sincere attempt to be fair, but my choice is clearly in favor of the approach on the right, for the reasons of greater simplicity given there. Anyone who thinks the comparison can be made more fair, please email Lloyd Anderson with suggestions on what to add or change. The comparison in the third row, concerning avoidance of identical surface forms, is the chief advantage that I think advocates of approach A claim, but I am not sure how to state it in a way that satisfies everyone's preferences. Searching cuneiform requires complex alternative spellings in any case, so I am not sure the two approaches differ much here, even if they did differ significantly in how often similar surface forms arise from distinct character sequences.
 

The approach on the right relies prominently on the following evidence. The evidence is not used slavishly but with discretion, and any knowledge from experts explaining particular cases is taken into consideration. The two sorts of evidence noted here are highly correlated (see chart), presumably because the assyriologists who determined the distinctive "signs" of the script took into account the very behavior that is most normally diagnostic of a single sign as opposed to a sequence of signs.

1. The distinctions among signs, compounds (sign sequences), and components of single signs are made by the assyriological tradition. This accumulated knowledge is respected unless there is a clear indication otherwise. This was the original basis for encoding adopted by the small ICE group.

2. Where there is enough room to see the spacing between signs, as in the Gudea statues (that is, in registers where the signs are not crushed together), the components of a single traditional sign are normally kept together: they at least almost touch, or they touch or overlap. By contrast, compound signs ("diri" compounds and other lexical items) are not kept so close together, so the two contrast within those same texts. Single signs are normally not split into parts across indents (what the rest of the world means by line breaks; I am not talking here about register breaks).

 
A | B
Encoding as characters sometimes the traditional signs, sometimes parts of the traditional signs. | Encoding as characters the traditional "signs" of cuneiform, as distinct from compounds or sign components.
Neither approach encodes what are traditionally regarded as sign sequences as single characters. | (same)
Belief that this approach avoids having two identical surface forms resulting from distinct sequences of coded characters. | Belief that there will be few or no such cases in normal use and normal spacing, because surface forms differ more often than thought by proponents of approach A.
It is expected that inputting is done by knowledgeable assyriologists. | (same)

Keeping what are traditionally single signs together requires extensive use of "combining grapheme joiner". | Characters (traditional single signs) are kept together by default, as one would expect, keeping their form, but allowing justification space between characters. No special devices needed.
The special joiner is needed in the *normal* cases even to keep traditional signs together, not just in the exceptional cases. | A special joiner would be needed only in exceptional cases, where a document editor might want to control flow to be other than its default.
A word-joiner is needed when one wishes to keep words together contrary to normal flow of text, precisely as in the other approach. | (same)

Fonts are more complex to create, since they require extensive kerned forms, ligatured forms, fused forms, and single glyphs substituting for sequences of "characters", even when kerning and ligaturing are not the appropriate analyses. | Fonts are far simpler to create. Normally one character corresponds to one font element (glyph).
True ligatures like AN+EN or AN+AG or EN+ME or SHU+LAGAB, which vary between sign sequence and ligatured substitute for that sequence carrying the same function, must be built into fonts in both approaches. | (same)

Input methods are more complex, requiring large numbers of extra elements like "combining grapheme joiner" to be generated. | Input methods are simpler, so more users and semi-programmers can design their own for special time periods.
Sorting tables by various preferences are more complex, since they will more often have to take account of sequences of characters (often components of traditional signs rather than single traditional signs). | Sorting tables by various preferences are simpler, since sort position can be specified for each traditional sign.

Code table order (and binary sort order) is by alphabetical order of sign names. Keeps together signs with the same first named components. | Good idea, with the difference noted immediately below.
Sign names are decomposition descriptions of glyphic forms, so names are based on components of glyphs to the maximum degree possible. | Keep traditional sign names for familiarity as the official names of the code standard, or use the most structurally revealing sign names. But order signs in code table and binary sort order by the same method as in the other proposal, so those with the same components are kept together.

Searching in assyriology must consider multiple possible spellings as soon as one goes beyond the simplest default. | (same)

Makes relation of readings with characters more complex. | The "readings" of cuneiform writing are correlated with the traditional signs and sequences of signs. They are not *as generally and universally* correlated with components which are fragments of signs. Although some signs are *named* by their components, that is a separate question.
A belief that the "characters" of a script are whatever the encoders decide they are. Relatively less interest in empirical evidence about distinctive units of a script using standard linguistic criteria for what is distinctive. | A belief that the distinctive units of a communication system like a language or a script are normally and effectively determined by using empirical evidence and standard reasoning, so that the resulting understanding is most structurally appropriate and simplest. (There can be borderline cases allowing two radically different analyses, but the majority of communication systems present no major problems of analysis of that type.)
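
The rows above on joiners and sorting tables can be made concrete with a small sketch. Everything in it is a placeholder invented for illustration: the sign names (SIGN1 to SIGN3), the component names (X, Y, Z), the decomposition of signs into components, and the Private Use Area code points are all hypothetical and are not proposed encodings. Only U+034F COMBINING GRAPHEME JOINER is a real Unicode character. The sketch also assumes, as the left column describes, that approach A links the components of one traditional sign with that joiner. It encodes the same three-sign text both ways and builds a user-defined sort key for each.

```python
CGJ = "\u034F"  # COMBINING GRAPHEME JOINER (a real Unicode character)

# Hypothetical inventory: traditional sign -> its components (placeholders).
SIGN_COMPONENTS = {
    "SIGN1": ["X"],
    "SIGN2": ["X", "Y"],
    "SIGN3": ["Y", "Z", "X"],
}

# Hypothetical Private Use Area code points, for illustration only.
COMPONENT_CODE = {"X": "\uE000", "Y": "\uE001", "Z": "\uE002"}
SIGN_CODE = {"SIGN1": "\uE100", "SIGN2": "\uE101", "SIGN3": "\uE102"}

# A user's preferred sort order of traditional signs (arbitrary example).
PREFERRED_ORDER = ["SIGN3", "SIGN1", "SIGN2"]


def encode_a(signs):
    """Approach A (assumed): components are characters, joined by CGJ within a sign."""
    return "".join(
        CGJ.join(COMPONENT_CODE[c] for c in SIGN_COMPONENTS[s]) for s in signs
    )


def encode_b(signs):
    """Approach B: one character per traditional sign, no joiner."""
    return "".join(SIGN_CODE[s] for s in signs)


def split_signs_a(text):
    """Regroup approach-A text into sign units: a CGJ glues the next component on."""
    units, current, glued = [], "", False
    for ch in text:
        if ch == CGJ:
            current += ch
            glued = True
        elif current and not glued:
            units.append(current)  # previous sign is complete
            current = ch
        else:
            current += ch
            glued = False
    if current:
        units.append(current)
    return units


# Sort tables: approach A must be keyed by whole component sequences,
# approach B by single characters.
RANK_A = {encode_a([s]): i for i, s in enumerate(PREFERRED_ORDER)}
RANK_B = {SIGN_CODE[s]: i for i, s in enumerate(PREFERRED_ORDER)}


def sort_key_a(text):
    return [RANK_A[unit] for unit in split_signs_a(text)]


def sort_key_b(text):
    return [RANK_B[ch] for ch in text]


sample = ["SIGN2", "SIGN1", "SIGN3"]
print(len(encode_a(sample)), "code points under A,", len(encode_b(sample)), "under B")
print(sort_key_a(encode_a(sample)) == sort_key_b(encode_b(sample)))
```

Both encodings yield the same sort key here, so the sketch does not show that approach A cannot sort. It shows only where the extra work falls: approach A needs a regrouping step and a table keyed by sequences, while approach B looks up each character directly.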