When using a "BiDi" operating system (such as Arabic or Hebrew Windows,
or ALM / XLANGBOX-ARA on Unix), the display might look correct at
first glance, but the following problems appear as soon as testing
goes further :
Moreover, another problem (due to the internal Netscape coding) is that there can be more segments to display after the BiDi processing, if we consider that one segment is composed only of characters having the same set of attributes (bold, font size, color, anchor...). As per the above example :
The logical line : ABC DEF 012 345 MNO PQR 67 89 YZ has 5 segments; the LTR (Left to Right) visual transformation gives : ABC DEF 543 210 MNO PQR 98 76 YZ , i.e. 7 segments.
The normal rendering (where b is the begin shape, m the middle shape, f the final shape and i the isolated shape) should be:
5f4m3m2m1b
The display using a Bi-Di OS without any other enhancement will be:
5i4i3i2i1i
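The shape selection shown above can be sketched as follows. This is only an illustration under a strong assumption: every letter of the word is taken to join on both sides, which is not true of all Arabic letters. The names shape_for and shape_word are hypothetical and are not part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: shape code for the letter at logical index i in a
 * word of n letters, ASSUMING every letter joins on both sides.
 * 'b' = begin, 'm' = middle, 'f' = final, 'i' = isolated. */
char shape_for(size_t i, size_t n)
{
    assert(i < n);
    if (n == 1)
        return 'i';
    if (i == 0)
        return 'b';
    if (i == n - 1)
        return 'f';
    return 'm';
}

/* Fill shapes[0..n-1] with the shape codes in logical order and
 * terminate the result; shapes must hold n + 1 bytes. */
void shape_word(size_t n, char *shapes)
{
    size_t i;

    for (i = 0; i < n; i++)
        shapes[i] = shape_for(i, n);
    shapes[n] = '\0';
}
```

For the five-letter word of the example, the logical order 1 2 3 4 5 receives the shapes b m m m f, which reads 5f4m3m2m1b once displayed right to left. A display without shaping corresponds to using 'i' for every letter.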
Netscape code already knows how to manage these HTML 4.0 directives
for all HTML elements.
The BiDi support needs to know this global direction value
to perform all text handling of the HTML pages.
If you have the string "abc123def" where numbers represent RTL directional letters, and others are just Latin letters, you can have the following situations :
- The internal ISO 8859-6 string is : abc123def
- In LTR main mode, the visual string is : abc321def
+------------------------------+
|abc321def                     |
|                              |
+------------------------------+
- In RTL main mode, the visual string is : def321abc
+------------------------------+
|                     def321abc|
|                              |
+------------------------------+
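The two situations above can be reproduced with a small sketch. It follows the convention of the example (digits stand for RTL letters, everything else is LTR) and ignores neutrals, numerals and shaping, so it is a toy model and not the real BiDi engine. The names reverse_range and toy_reorder are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Reverse buf[lo..hi-1] in place. */
void reverse_range(char *buf, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        char tmp;

        hi--;
        tmp = buf[lo];
        buf[lo] = buf[hi];
        buf[hi] = tmp;
        lo++;
    }
}

/* Toy reordering: digits are treated as RTL letters (example convention),
 * all other characters as LTR. rtl_main != 0 selects RTL main mode.
 * out must hold strlen(logical) + 1 bytes. */
void toy_reorder(const char *logical, int rtl_main, char *out)
{
    size_t n, start;

    assert(logical != NULL && out != NULL);
    n = strlen(logical);
    memcpy(out, logical, n + 1);

    /* Reverse every run whose direction differs from the main direction. */
    start = 0;
    while (start < n) {
        int is_rtl = isdigit((unsigned char)out[start]) != 0;
        size_t end = start + 1;

        while (end < n && (isdigit((unsigned char)out[end]) != 0) == is_rtl)
            end++;
        if (is_rtl != (rtl_main != 0))
            reverse_range(out, start, end);
        start = end;
    }

    /* In RTL main mode the whole line is then laid out right to left. */
    if (rtl_main)
        reverse_range(out, 0, n);
}
```

With "abc123def" as input, LTR main mode only reverses the RTL run, while RTL main mode reverses the Latin runs first and then the whole line, so the Latin words stay readable but swap sides.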
- BiDi re-ordering
- Contextual Glyph Shaping
However, this can be done automatically within the same function by checking some charset information.
The transformation process can use an Implicit algorithm (this is the case with 8-bit codesets, where Arabic characters are distinguished from Latin ones by the 8th bit) or an Explicit algorithm that uses control characters to determine language, direction, and so on. This is done inside MozBiDi_TransformLogicalToVisual and is opaque to the browser.
For non BI-DI languages, this entry point should simply be replaced by something like:
#include <string.h>

int
MozBiDi_TransformLogicalToVisual(logical_str, logical_len, visual_str,
                                 visual_len, positionLtoV, positionVtoL)
unsigned char *logical_str;   /* Logical string, input */
int            logical_len;   /* Logical string length, input */
unsigned char *visual_str;    /* Visual string, output */
int           *visual_len;    /* Visual string length, output */
int           *positionLtoV;  /* Position mapping from Logical to Visual, output */
int           *positionVtoL;  /* Position mapping from Visual to Logical, output */
{
    int i;

    /* No reordering: the visual string is a plain copy of the logical one. */
    memcpy(visual_str, logical_str, logical_len);
    *visual_len = logical_len;
    /* Both position mappings are the identity. */
    for (i = 0; i < logical_len; i++)
        positionLtoV[i] = positionVtoL[i] = i;
    return 0;
}
Along the same lines, we might need the reverse transformation function for HTML Visual Hebrew support on Hebrew Windows. So the following function might be defined:
MozBiDi_TransformVisualToLogical(visual_str, visual_len, logical_str,
                                 logical_len, positionVtoL, positionLtoV)
unsigned char *visual_str;    /* Visual string, input */
int            visual_len;    /* Visual string length, input */
unsigned char *logical_str;   /* Logical string, output */
int           *logical_len;   /* Logical string length, output */
int           *positionVtoL;  /* Position mapping from Visual to Logical, output */
int           *positionLtoV;  /* Position mapping from Logical to Visual, output */
3-2 Mouse pointing management
The mouse management function should take the above changes into account when calculating the pointed element from the mouse position on the screen. A specific mouse pointing position routine might be written, but this function will call the MozBiDi_TransformLogicalToVisual() function and use the positionLtoV and positionVtoL arrays to calculate lpos.
Also, since the visual string is different from the internal one, the mouse pointing function also needs to follow a specific algorithm when clicking on text. The solution is to call a specific external function that calculates the internal position for a visual position.
MozBiDi_CheckPositionVisualToLogical(logical_str, vpos, lpos)
unsigned char *logical_str;   /* Logical string, input */
int            vpos;          /* visual position, input */
int           *lpos;          /* logical position, output */
The visual position of the mouse (i.e. the index of the character located at the mouse position) is given in vpos, and the internal buffer is pointed to by logical_str. The function returns the corresponding logical position of the character in lpos.
So during a selection action, the mouse pointing routine (To_be_located_in_the_src())
returns the logical character position of the mouse, and the text display
function uses this value as selection area to calculate and draw the highlighted
areas.
However, an existing "optimization" of the Netscape code redraws only
the characters whose highlight attribute has changed since the last
redraw. Keeping this optimization could be very difficult for BiDi
support, at least initially. So at this stage, refreshing the full line
might be a little more time consuming, but is easier to implement.
For non BI-DI languages, this function should just assign *lpos = vpos
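As a sketch of the lookup described above: once a positionVtoL array is available (for instance from MozBiDi_TransformLogicalToVisual()), translating a visual position is a simple table access. The helper name lookup_logical_position and its extra parameters are hypothetical; the real function would obtain the array itself from logical_str.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: map a visual character index to its logical index
 * using a positionVtoL array of visual_len entries.
 * Returns 0 on success and -1 if vpos is out of range. */
int lookup_logical_position(const int *positionVtoL, int visual_len,
                            int vpos, int *lpos)
{
    assert(positionVtoL != NULL && lpos != NULL);
    if (vpos < 0 || vpos >= visual_len)
        return -1;
    *lpos = positionVtoL[vpos];
    return 0;
}
```

For a logical string "123abc456" (digits standing for RTL letters) displayed as "321abc654", the array would be {2, 1, 0, 3, 4, 5, 8, 7, 6}, so a click on the leftmost visual character gives logical position 2. With an identity array the result is simply *lpos = vpos.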
Remark: If the system displays a mouse insertion position (like the I beam in editing areas), we need a counterpart function that performs the reverse operation:
MozBiDi_CheckPositionLogicalToVisual(logical_str, lpos, vpos)
unsigned char *logical_str;   /* Logical string, input */
int            lpos;          /* logical position, input */
int           *vpos;          /* visual position, output */
But this is not the case with a Web browser, since all input areas are handled by OS libraries and not directly managed by the HTML main widget. This function will, however, be very useful for Page Composer support.
This is the case for example with mixed text selection: if you have the string "123abc456", where digits represent RTL directional letters and abc is just a Latin "abc" string, you can have the following situations (square brackets mark the highlighted areas) :
- The internal string is : 123abc456
- The visual string is : 321abc654
- You select from '2' to '3' : [32]1abc654 (1 highlight area)
- You select from '2' to 'b' : [32]1[ab]c654 (2 highlight areas)
- You select from '2' to '5' : [32]1[abc]6[54] (3 highlight areas)
This process should be done by the routine during the Text Refresh. Selection_start and Selection_end are stored in new HTML structure variables and should be kept up to date at all times. In all cases, the Text Refresh function should have a pointer to the whole text in order to make the correct context analysis. The general Latin text display optimization that redraws only the last modified (or selected) characters should be disabled.
This function can use MozBiDi_TransformLogicalToVisual()
to obtain the positionLtoV and positionVtoL arrays and then redefine
the new highlighted areas.
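A minimal sketch of how the highlighted areas could be derived from the positionVtoL array, assuming the selection is stored as an inclusive range of logical positions. The helper name count_highlight_areas is hypothetical; a real refresh routine would draw each area instead of just counting them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the contiguous highlighted areas on screen
 * for a logical selection [sel_start, sel_end] (inclusive), given the
 * positionVtoL array of a line of visual_len characters. */
int count_highlight_areas(const int *positionVtoL, int visual_len,
                          int sel_start, int sel_end)
{
    int v;
    int areas = 0;
    int in_area = 0;

    assert(positionVtoL != NULL && sel_start <= sel_end);
    for (v = 0; v < visual_len; v++) {
        int l = positionVtoL[v];
        int selected = (l >= sel_start && l <= sel_end);

        if (selected && !in_area)
            areas++;            /* a new highlighted area starts here */
        in_area = selected;
    }
    return areas;
}
```

For the internal string "123abc456" shown as "321abc654", the array is {2, 1, 0, 3, 4, 5, 8, 7, 6}. Selecting from '2' to '3' (logical 1 to 2) gives one area, from '2' to 'b' (logical 1 to 4) gives two, and from '2' to '5' (logical 1 to 7) gives three. Scanning in visual order is what makes a single logical range split into several areas.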
3-4 BiDi Object parameters setting
The main BiDi Object layout values and internal parameters must
be settable from Netscape (to define the charset, the fontset, the
global direction, the type of numerals, etc.). In order to set the current
BiDi engine values, we can use this function :
MozBiDi_SetValue(attribute, value)
BiDi_attr  attribute;   /* BiDi Engine attribute, input */
void      *value;       /* value to be set, input */
The allowed attribute list could be the following :
Attribute | Description | Allowed values |
CHARSET | Define the charset used (allow to dissociate the Hebrew and the Arabic processing) - (AH) | "iso8859-6", "cp1256", ... "iso8859-8", "iso8859-8-i", ... |
FONTSET | Define the system fontset to address - (AH) | "iso8859-6-8", "iso8859-6-16", "WinAra", ... |
ORIENTATION | Define the global page orientation - (AH) | ORIENTATION_R2L, ORIENTATION_LTR, ORIENTATION_CONTEXTUAL |
TYPEOFSOURCE | Define the input type of text to handle - (AH) | TEXT_VISUAL, TEXT_IMPLICIT, TEXT_EXPLICIT |
SWAPPING | Define if a symmetric shape swapping must be done (for characters such as (){}[]<>) - (AH) | SWAPPING_YES, SWAPPING_NO |
NEUTRAL | List of characters that take the global orientation writing direction (space for example) - (AH) | string |
NUMERALS | Define the type of numeric shaping - (A) | NUMERALS_NOMINAL, NUMERALS_NATIONAL, NUMERALS_CONTEXTUAL |
DIACRITICS | Define if the diacritics are displayed or not - (A) | DIACRITICS_YES, DIACRITICS_NO |
etc... | to be extended ... |
In the same way, we need to get the BiDi engine parameters value, using this function:
MozBiDi_GetValue(attribute, value)
BiDi_attr  attribute;   /* BiDi Engine attribute, input */
void      *value;       /* value returned, output */
All these API functions return -1 on error and 0 on success.
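To illustrate the calling convention, here is a hypothetical in-memory sketch of a set/get pair with the same parameter order and return codes. The names demo_set_value and demo_get_value, the enum values, and the choice to pass the output as a pointer to a void * slot are all assumptions made for the example; the text above does not define how the value is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute identifiers, named after the attribute list. */
typedef enum {
    CHARSET, FONTSET, ORIENTATION, TYPEOFSOURCE,
    SWAPPING, NEUTRAL, NUMERALS, DIACRITICS,
    BIDI_ATTR_COUNT     /* number of attributes, not an attribute itself */
} BiDi_attr;

/* Hypothetical storage: one opaque pointer per attribute. */
static void *bidi_values[BIDI_ATTR_COUNT];

/* Store value for attribute. Returns 0 on success, -1 on error. */
int demo_set_value(BiDi_attr attribute, void *value)
{
    if ((int)attribute < 0 || (int)attribute >= BIDI_ATTR_COUNT
        || value == NULL)
        return -1;
    bidi_values[attribute] = value;
    return 0;
}

/* ASSUMPTION: value points to a void * slot that receives the stored
 * pointer. Returns 0 on success, -1 if the attribute is unknown or was
 * never set. */
int demo_get_value(BiDi_attr attribute, void *value)
{
    assert(value != NULL);
    if ((int)attribute < 0 || (int)attribute >= BIDI_ATTR_COUNT
        || bidi_values[attribute] == NULL)
        return -1;
    *(void **)value = bidi_values[attribute];
    return 0;
}
```

A caller would set the charset once when the page encoding is known, then read it back wherever the transformation functions need it, checking the -1 / 0 return code each time.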
4-1 How to deal with existing BiDi OS :
4-4 Printing