Bi-Di languages support - Bi-Di API proposal

By Franck Portaneri <franck@langbox.com>
Last Update: July 3rd, 1998

Note to readers:
This document is a draft proposal defining an API for BiDi support in Mozilla. Any comments or feedback that help improve it are welcome.

1- Why do we need a specific Bi-Di API?

Bi-Di languages are native languages that are read from right to left (mainly Arabic, Farsi, Hebrew, Urdu...), in which English sub-strings (read from left to right) can be inserted within the national-language text (read from right to left). This mainly causes problems for text display and text selection.

When using a "BiDi" operating system (such as Arabic or Hebrew Windows, or ALM / XLANGBOX-ARA on Unix), the display may look correct at first glance, but the following problems appear as soon as testing goes further:
 

So the BiDi support must handle these problems in the Netscape source itself, before calling the low-level OS BiDi rendering routines.


2- Global Orientation Handling Consideration

Normally, Bi-Di language support needs to allow the main HTML page to be laid out from right to left (and top to bottom), and always to provide a way to return to the original left-to-right layout in order to support a bilingual environment. This is a de facto standard that was born with the first DOS solutions, which implemented dynamic orientation selection with <Alt><RightShift> and <Alt><LeftShift>. Following this market rule, most Bi-Di applications and systems allow this dynamic setting. This is the case, for example, with AraMosaic, which can force either a right-to-left or a left-to-right global page direction through a menu selection.

However, according to the HTML 4.0 specification, the global direction of the page or of a specific paragraph should be explicitly defined using the DIR attribute. So at this stage, unlike AraMosaic, we let the HTML document indicate the global direction explicitly. Of course, in the absence of a DIR attribute, we can assume that any HTML document using the charset ISO8859-8 or ISO8859-6 is in right-to-left mode by default.

The Netscape code already knows how to handle these HTML 4.0 directives for all HTML elements.
 
The Bi-Di support must have access to this global direction value in order to perform all text handling for HTML pages.
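The rule described in this section (an explicit direction given by the document wins; otherwise the charset decides the default) can be sketched as follows. This is only an illustration, not Netscape code: the names `global_direction`, `normalize_charset` and `RTL_DEFAULT_CHARSETS` are hypothetical, and the sketch assumes the explicit direction arrives as the string "rtl" or "ltr" (or None when the document does not give one).

```python
# Hypothetical sketch of the global-direction rule; not actual Netscape code.

# Charsets assumed to imply a right-to-left page when no direction is given.
RTL_DEFAULT_CHARSETS = {"iso-8859-6", "iso-8859-8"}


def normalize_charset(name):
    """Lower-case a charset label and accept the 'ISO8859-x' spelling."""
    key = name.strip().lower().replace("_", "-")
    if key.startswith("iso8859"):
        key = "iso-8859" + key[len("iso8859"):]
    return key


def global_direction(explicit_dir, charset):
    """Return 'rtl' or 'ltr' for a page.

    explicit_dir: direction stated by the document, or None if absent.
    charset: charset label of the document, or None if unknown.
    """
    # 1. An explicit, valid direction from the document always wins.
    if explicit_dir is not None:
        value = explicit_dir.strip().lower()
        if value in ("rtl", "ltr"):
            return value
    # 2. Otherwise fall back to the charset-based default.
    if charset is not None and normalize_charset(charset) in RTL_DEFAULT_CHARSETS:
        return "rtl"
    # 3. Everything else is treated as left to right.
    return "ltr"
```

For instance, `global_direction(None, "ISO8859-6")` falls through to the charset rule, while `global_direction("ltr", "ISO8859-6")` keeps the direction the document asked for.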
 
 


3- API scope

        - The internal ISO 8859-6 string is            :   abc123def

        - In LTR main mode, the visual string is       :   abc321def

          +------------------------------+
          |abc321def                     |
          |                              |
          +------------------------------+

        - In RTL main mode, the visual string is       :   def321abc

          +------------------------------+
          |                     def321abc|
          |                              |
          +------------------------------+
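The two visual strings above can be reproduced with a very small reordering sketch. It rests on an assumption that the example does not state: the letters stand for Latin (left-to-right) characters and the digits are placeholders for Arabic (right-to-left) characters stored in logical order. The function names are hypothetical, and this is a simplified illustration rather than a full bidirectional algorithm (no neutrals, no real digits, no nesting).

```python
# Simplified logical-to-visual reordering; illustration only.

def is_rtl(ch):
    # Assumption for this sketch: digits are placeholders for Arabic letters.
    return ch.isdigit()


def split_runs(logical):
    """Split a logical string into (is_rtl, text) runs of uniform direction."""
    runs = []
    for ch in logical:
        rtl = is_rtl(ch)
        if runs and runs[-1][0] == rtl:
            runs[-1][1].append(ch)      # same direction: extend current run
        else:
            runs.append((rtl, [ch]))    # direction change: start a new run
    return [(rtl, "".join(chars)) for rtl, chars in runs]


def logical_to_visual(logical, base_rtl):
    """Return the characters in left-to-right display order."""
    # Right-to-left runs are shown with their characters reversed.
    pieces = [text[::-1] if rtl else text for rtl, text in split_runs(logical)]
    # In RTL main mode the runs themselves are laid out from right to left.
    if base_rtl:
        pieces.reverse()
    return "".join(pieces)


print(logical_to_visual("abc123def", base_rtl=False))  # abc321def
print(logical_to_visual("abc123def", base_rtl=True))   # def321abc
```

Alignment (left edge in LTR main mode, right edge in RTL main mode) is a separate step that this sketch does not cover.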
   

 

4- Outstanding issues

4-1 How to deal with an existing BiDi OS:
 

4-2 How to deal with a Latin OS:

4-3 Existing Netscape Global Right to Left direction management:
 

4-4 Printing
 


Franck Portaneri - July 4th 1998