The Bilingual English/Arabic Environment for UNIX
Technical Product Description
Arabic, as a calligraphic language, presents major processing problems.
An Arabic character may take one, two, or sometimes four different shapes, yet it is represented by a single code. The shape depends on the character's position in the word. This is the first problem, called "Character Shaping".
Another is the direction of writing. Arabic text is written from right to left, the opposite of English. When the two languages are mixed, characters in one language are added at the cursor while characters in the other are pushed along the line. This is called "Bi-Directionality".
Some users speak only Arabic. They will not accept a cursor positioned at the leftmost position of the screen. They want an option allowing them to start at the rightmost position of the line, in brief, a mirror image of the screen. The implication is that, in this mode, English characters are pushed from right to left. This is called "Screen Mirroring".
One more complication is vocalization (or diacritics). These characters, like their counterparts in English, the vowels, are a linguistic necessity, yet in Arabic they appear above or below their respective consonants.
These linguistic complications - and more - make Arabic a difficult language to handle.
Another aspect of the problem is that the standard UNIX operating system support for internationalization (locale) covers only European languages, generally based on the ISO 8859-1 character codeset. This National Language Support (NLS or XLOCALE) provides keyboard mapping, collating sequences, character types, and date and time formats. It could also include, in the near future, a specific extension for the Japanese language, but it cannot handle Arabic characters or, in general, languages written from right to left.
LangBox International specializes in the design and development of bilingual and multilingual operating systems. It has implemented bilingual capabilities on a large number of machines running UNIX, XENIX, AIX, SUN/OS, ISC, SINIX, DG/UX, EP/IX, CLIX, SOLARIS, IRIX, OSF/1, etc.
In response to a clear market demand in the Arab countries, LangBox International has developed a bilingual system supporting both the Arabic and English languages. LANGBOX for Arabic (named LANGBOX-ARA) has thus been conceived with the following principles in mind:
LANGBOX-ARA is a bilingual operating system and a bilingual development environment. LANGBOX-ARA is based on the ISO 8859-6 codeset.
The languages supported are:
The LANGBOX-ARA Working Environment
The LANGBOX-ARA system is built around the TTY driver of UNIX System V. When loaded onto the UNIX system, LANGBOX-ARA provides the user with a full bilingual environment for the operating system interface, systems development, and run-time operation of character-based applications.
The LANGBOX-ARA system supports any application running on a TTY terminal screen, from a dumb terminal (such as a VT100 connected over RS232) to any PTY connection through the network (Xterm emulation or a PC-based product running telnet sessions).
To conform to the UNIX operating system, LANGBOX-ARA is also designed to run in a multi-user and multitasking environment and to co-reside with the standard UNIX facilities. Adding LANGBOX-ARA to a UNIX system does not prevent its users from operating in a pure UNIX environment.
While operating under the LANGBOX-ARA bilingual environment, users can select and set their default language, English or Arabic. Users can log in to the system in the language of their choice and communicate with the host using the standard UNIX commands and utilities. Commands can be entered in either Latin or Arabic and are executed by the LANGBOX-ARA shell command interpreter. The system responses are displayed in the language chosen by the user.
Although a default language is set prior to login, a LANGBOX-ARA user can start multiple work sessions (shell child processes), each with a different base language. The user can also alternate languages within the same work session, at the command line, directory and file level, or when running a standard UNIX application (e.g. text processing, spreadsheets, database management, etc.).
In addition to the bilingual UNIX user interface, LANGBOX-ARA provides a comprehensive bilingual software development environment for programmers. With or without knowledge of the Arabic language, the programmer can easily develop bilingual applications or adapt, with minimal effort, current English software packages to run in bilingual mode under the LANGBOX-ARA environment.
Internationalization of Applications
LANGBOX-ARA provides an enhanced environment for the internationalization of applications as compared with classical techniques.
Usually, application programmers incorporate international character string manipulation, "context analysis" (automatic shape determination), and display processing within the application. Under LANGBOX-ARA, 8-bit clean character-based applications run with no modification.
The following table compares a full LANGBOX-ARA bilingual work environment with a standard software internationalization approach:
LANGBOX-ARA BILINGUAL ENVIRONMENT | SOFTWARE INTERNATIONALIZATION
Operating system user interface. | Not available.
Language & I/O processing handled by LANGBOX-ARA with no application overhead. | Language & I/O processing implemented in every application program, bigger overhead.
English only software developer able to provide bilingual products. | Sophisticated developers required with knowledge of the Arabic language particularities.
Bilingual software environment is uniform. | Potential inconsistency in software internationalization.
Bilingual UNIX Mail easily implemented under LANGBOX-ARA. | Major effort required.
Communication and networking easily implemented under LANGBOX-ARA. | Major effort required.
LANGBOX-ARA supports exclusively the TTY interface, as used in shell sessions on dumb terminals or over telnet or rlogin connections to a UNIX operating system. It is composed of two packages:
The Runtime System must be installed prior to the installation of the Development System package. These packages include the standard UNIX System V modules, to which a set of LANGBOX-ARA specific facilities is added.
a) The Shell
LANGBOX-ARA includes two national language shells. They are differentiated by the first characters prefixing their label. Thus
Each is a special shell version created to provide.
When the LANGBOX-ARA shell is invoked
Each shell can be called from another language shell and achieves the same effect. Exiting from a shell is done via the traditional "control-d". The national language variables are created with a new shell and restored when the spawned sub-shell is exited.
The main advantages of the LANGBOX-ARA shells are:
b) Commands and utilities
There are two groups of commands and utilities supplied with LANGBOX-ARA.
1) The UNIX-like Group:
This is a set of executable modules that have the same calling sequence as their UNIX counterparts, except that they work in 8 bits and are bilingual.
These commands are differentiated by their prefix "a". Example: aa2ps, avi, auemacs, alp...
They also share a set of characteristics:
2) The LANGBOX-ARA specific group:
This is a set of commands and utilities supplied with LANGBOX-ARA and aimed at servicing the bilingual community of users. They include:
Character management
acharset | display the character set on the terminal
aload | download Arabic character set on terminal
aloadp | download Arabic character set on printer
asmo449 | convert a file to ASMO 449 codeset
Display processing and character shape management
afps | Arabic floating point symbol definition
amask | video line language attribute
amode | manage the space character
arabic | select the Arabic screen display mode
asetup | general LANGBOX-ARA function setup
astatus | display the status of the Arabic environment
context | enable context analysis
dataproc | force numeral to 7-bits for computation
english | select the English screen display mode
hindi | control the display of Arabic numerals
months | select months names
neutral | set neutral characters in the display processing
nocontext | disable context analysis
nohindi | control the display of Arabic numerals
notashkil | disable the diacritics management
tm | display the LANGBOX-ARA driver level release
tashkil | enable the diacritics management
wordproc | record numerals in their language
Terminals management
atic | Arabic terminal information compiler
toggle | define the toggle key for the bilingual keyboard
aresize | adjust the Arabic screen after a resizing action
a) The LANGBOX-ARA commands:
In addition to the standard development power of UNIX, LANGBOX-ARA provides a set of tools to develop bilingual applications. They include string management functions reconfigured to service the 8-bit character sets, message extraction and handling tools, sort and conversion functions, etc.
The message extraction capabilities permit the separation of strings out of a "C" program into files, and include a formatter for these messages to simplify their subsequent translation.
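
The extraction step can be pictured with a short sketch. This is not the LANGBOX-ARA tool itself, whose name and options are not given here; the functions `extract_messages` and `format_catalog` and the numbered, tab-separated catalog layout are assumptions made for illustration. The regular expression is deliberately simple: it does not skip comments, `#include "..."` lines, or character literals.

```python
import re

# Matches a double-quoted C string literal, allowing backslash escapes.
STRING_LITERAL = re.compile(r'"((?:[^"\\\n]|\\.)*)"')


def extract_messages(c_source):
    """Return the distinct, non-empty string literals found in C source text."""
    messages = []
    for match in STRING_LITERAL.finditer(c_source):
        text = match.group(1)
        if text and text not in messages:
            messages.append(text)
    return messages


def format_catalog(messages):
    """Format messages as numbered lines, one per message, for a translator."""
    return "".join(f"{number}\t{message}\n"
                   for number, message in enumerate(messages, 1))


if __name__ == "__main__":
    source = 'printf("Hello %s\\n", name); puts("Bye"); puts("Bye");'
    print(format_catalog(extract_messages(source)), end="")
```

A translator would then work on the catalog file alone, leaving the program source untouched.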
These tools are extremely powerful when "internationalization" of applications is envisaged.
b) The LANGBOX-ARA function libraries
These libraries include:
Bilingual applications can benefit by recompiling under LANGBOX-ARA, as these libraries have the same calling sequence as their UNIX counterparts, thus generating object code capable of handling bilingual messages.
Sample screen output of a Sun cmdtool window running the ash shell
LANGBOX-ARA is adapted to the standards as set forth by AT&T's SVID and ISO.
The standards adopted under LANGBOX-ARA are related to the following:
The Arabic characters are 8 bits wide and conform to the following standard:
Characters are displayed according to their language-specific conventions. Latin characters always appear separately, while Arabic characters are contexted and displayed in their composite form.
The display technique adopted depends on the base and current languages chosen by the user.
When the environment is Latin, the initial cursor position is at the left of the line and characters are added to the right as they are entered. Arabic characters are inserted, and pushed to the right as they are entered.
When the environment is Arabic, the reverse occurs: the initial cursor position is at the rightmost position of the line, Arabic characters are added, and English characters are inserted.
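
The add-versus-insert behaviour described above can be sketched as follows. This is an illustrative model only, not the LANGBOX-ARA driver: the function names are hypothetical, Unicode code points stand in for the 8-bit ISO 8859-6 codes, and spaces, digits and punctuation are simply treated as belonging to the base language, which is a simplification.

```python
def is_arabic(ch):
    """True for characters in the basic Unicode Arabic block."""
    return "\u0600" <= ch <= "\u06FF"


def is_latin(ch):
    """True for ASCII letters."""
    return ch.isascii() and ch.isalpha()


def visual_line(typed, base="latin"):
    """Return the typed characters in left-to-right screen order."""
    cells = []        # line contents, stored in the base language's direction
    run_start = 0     # index where the current other-language run began
    in_run = False
    for ch in typed:
        other = is_arabic(ch) if base == "latin" else is_latin(ch)
        if other:
            if not in_run:
                run_start = len(cells)
            # Inserted at the start of its run: earlier characters of the
            # run are pushed along the line.
            cells.insert(run_start, ch)
        else:
            # Base-language characters are added at the end of the line.
            cells.append(ch)
        in_run = other
    if base == "arabic":
        cells.reverse()   # an Arabic line grows from the right edge
    return "".join(cells)


if __name__ == "__main__":
    beh, alef = "\u0628", "\u0627"
    line = visual_line("ab" + beh + alef + "cd", base="latin")
    print([f"U+{ord(c):04X}" for c in line])
```

In the Latin environment, typing "ab", then BEH and ALEF, then "cd" leaves ALEF to the left of BEH on screen, so the Arabic run reads correctly from right to left between the two Latin runs.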
Vocalization is fully supported on terminals capable of displaying 256 downloadable characters.
These characters are not context sensitive and therefore do not affect the shape of Arabic characters. Yet in certain instances they have the opposite meaning in Arabic, due to the direction in which the language is written. Typical examples are: ( ) { } < > [ ] etc.
A special command handles the meaning the user wants to assign to these special characters.
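
As a minimal sketch of what swapping the meaning of these paired characters can look like, the hypothetical function below exchanges each opening symbol with its closing partner. It illustrates the idea only and is not the LANGBOX-ARA command.

```python
# Each paired symbol is exchanged with its partner.
MIRROR = str.maketrans("(){}<>[]", ")(}{><][")


def mirror_symbols(text):
    """Swap opening and closing paired symbols in text."""
    return text.translate(MIRROR)


if __name__ == "__main__":
    print(mirror_symbols("(a[b]<c>)"))
```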
The user is provided with a command allowing the display of numerals in Arabic shape (Latin shape) or Hindi shape (used in the Middle East). All known complications associated with this subject have been solved and incorporated in the package.
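
The choice between the two numeral shapes can be sketched as a display-time substitution. The example below is an assumption-laden illustration: it uses the Unicode Arabic-Indic digits (U+0660 to U+0669) as a stand-in for the terminal's Hindi glyphs, and the function name and `hindi` flag are hypothetical rather than part of LANGBOX-ARA.

```python
# ASCII digits mapped to the Unicode Arabic-Indic digits U+0660..U+0669.
HINDI_DIGITS = str.maketrans(
    "0123456789", "".join(chr(0x0660 + d) for d in range(10)))


def display_numerals(text, hindi=False):
    """Return text with digits shown in Hindi shape when hindi is True."""
    return text.translate(HINDI_DIGITS) if hindi else text


if __name__ == "__main__":
    shown = display_numerals("2024", hindi=True)
    print([f"U+{ord(c):04X}" for c in shown])
```

Because only the displayed shape changes, the stored digits remain ordinary ASCII and stay usable for computation.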
Automatic Shape Determination (Shaping process)
The standard rule is to display characters the way calligraphy requires. It was found, however, that this feature should be optional, since system users are frequently in debugging sessions and prefer to have their characters displayed in their original, base form. A pair of commands has been included to inhibit or restore Automatic Shape Determination.
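
To make the shaping rule concrete, here is a minimal sketch of position-based shape selection. It is not the LANGBOX-ARA algorithm: it assumes Unicode code points instead of ISO 8859-6 bytes, takes the glyph table from Python's `unicodedata` (Arabic Presentation Forms-B), and ignores diacritics and the lam-alef ligatures. The `context` flag is a hypothetical stand-in for switching shaping on or off.

```python
import unicodedata

FORMS = ("isolated", "final", "initial", "medial")


def build_shape_table():
    """Map each base Arabic letter to its positional glyphs."""
    table = {}
    for code in range(0xFE70, 0xFEFD):
        parts = unicodedata.decomposition(chr(code)).split()
        # Keep entries such as "<initial> 0628": one tag plus one base letter.
        if len(parts) == 2 and parts[0].strip("<>") in FORMS:
            base = chr(int(parts[1], 16))
            table.setdefault(base, {})[parts[0].strip("<>")] = chr(code)
    return table


SHAPES = build_shape_table()


def shape(text, context=True):
    """Replace each Arabic letter by the glyph its position calls for."""
    if not context:
        return text                      # base forms, as in a debug session
    out = []
    for i, ch in enumerate(text):
        forms = SHAPES.get(ch)
        if forms is None:
            out.append(ch)               # not an Arabic letter: unchanged
            continue
        prev = SHAPES.get(text[i - 1]) if i > 0 else None
        nxt = SHAPES.get(text[i + 1]) if i + 1 < len(text) else None
        # A letter connects backwards only if the previous letter can connect
        # forwards (it has an initial form), and vice versa.
        joins_prev = prev is not None and "initial" in prev and "final" in forms
        joins_next = nxt is not None and "final" in nxt and "initial" in forms
        if joins_prev and joins_next:
            out.append(forms["medial"])
        elif joins_prev:
            out.append(forms["final"])
        elif joins_next:
            out.append(forms["initial"])
        else:
            out.append(forms["isolated"])
    return "".join(out)


if __name__ == "__main__":
    word = "\u0628\u0627\u0628"          # BEH, ALEF, BEH
    print([f"U+{ord(c):04X}" for c in shape(word)])
```

In this table HAMZA has one shape, ALEF two, and BEH four, which matches the "one, two, or sometimes four" shapes mentioned in the introduction.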
LANGBOX-ARA Level Support Standards
LANGBOX-ARA capabilities do not change when installed on different hardware system configurations. The following standards have been adopted, and will be supported even if the hardware does not have the required features:
It is further stressed here that LANGBOX-ARA generates and supports a bilingual environment. LANGBOX-ARA allows users to set a SESSION LANGUAGE and to intersperse English and Arabic at any time in the session, even within one command line.
Sample screen copy of a Curses(3X) based application running under LANGBOX-ARA
This diagram shows how the LANGBOX-ARA kernel driver takes control of all TTY line I/O within the UNIX kernel and is able to perform exception processing on the flow of characters from the application to the physical terminal screen.
In the same way, keyboard codes sent by the terminals are mapped through a dual logical keyboard management system and are then passed to the application.
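
The dual logical keyboard can be pictured with a small sketch. Everything specific in it is hypothetical: the toggle key (Ctrl-T here), the two-entry Arabic layout, and the class itself are illustrative assumptions and do not describe the actual driver tables, which the `toggle` command and terminal definitions would configure.

```python
TOGGLE_KEY = "\x14"                      # hypothetical toggle key: Ctrl-T

# Hypothetical, partial Arabic layout: physical key -> Arabic letter.
ARABIC_LAYOUT = {"f": "\u0628", "h": "\u0627"}


class DualKeyboard:
    """Route key codes through one of two logical keyboards."""

    def __init__(self, language="english"):
        self.language = language

    def feed(self, keys):
        """Translate key codes, switching language on the toggle key."""
        out = []
        for key in keys:
            if key == TOGGLE_KEY:
                self.language = ("arabic" if self.language == "english"
                                 else "english")
            elif self.language == "arabic":
                out.append(ARABIC_LAYOUT.get(key, key))
            else:
                out.append(key)
        return "".join(out)


if __name__ == "__main__":
    keyboard = DualKeyboard()
    result = keyboard.feed("ls " + TOGGLE_KEY + "fhf" + TOGGLE_KEY + " x")
    print([f"U+{ord(c):04X}" for c in result])
```

The application only ever sees the translated characters, so it needs no knowledge of which logical keyboard was active.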
LANGBOX-ARA 3.3 is available on the following target machines:
LANGBOX-ARA 3.3 provides support for the following terminals or compatibles:
Terminal emulation programs under the X Window System and other windowing environments, such as CDE, Open Windows, DECwindows, Environment V, etc., are fully supported under LANGBOX-ARA version 3.3 in VT220 emulation.
LANGBOX-ARA 3.3 provides support for this set of printers:
- IBM 4201us (9 pin)
- IBM 4201 (9 pin)
- Epson lq1000 (24 pin)
- Fujitsu dl3400 (24 pin)
- HP deskjet 500+
- HP laserjet (PCL)
- DEC LA75+
- Postscript