Title: Cultural Elements in Internet Software Localization
1 Cultural Elements in Internet Software Localization
- Valentina Dagiene, Tatjana Jevsikova
- dagiene_at_ktl.mii.lt, tatjanaj_at_ktl.mii.lt
- Institute of Mathematics and Informatics
- Lithuania
2 Internet Software
- The term "Internet software" is used here as a
general term covering:
- software used to access Internet resources
(usually on the client side),
- web-based applications (server side).
3 Culture
- Provides the context in which the world is
understood: rules for behavior, communication,
interaction and understanding.
- Multilevel onion-like models, e.g. basic
assumptions and values, with resultant behavioral
norms, attitudes and beliefs, which manifest
themselves in systems and institutions as well as
in behavioral patterns and non-behavioral items.
- There is a relation between a user's culture and
software usability. Software can influence
culture as well (this especially applies to
Internet software).
4 Software Localization
- Software localization is the adaptation of
software to a particular cultural environment
(locale).
- Unfortunately, it is still usually treated as
mere language translation.
- Localized software must look and feel as if it
had been made for the target language and
culture.
5 Solving Culture-sensitive Issues
- At software production time: making the software
language- and culture-neutral and suitable for
localization (internationalization).
- After software production time: modifying the
original code at localization time.
6 An Aim of the Presentation
- To look at software elements that are based on
culture and cultural conventions (a kind of
reflection of cultural dimensions).
- To classify and discuss the software elements
most important for successful cultural
portability, based on an analysis of related
normative documents and more than 10 years of
experience in software localization.
7 Classification
- A topic of studies by G. Hofstede, F. Trompenaars,
and E. Hall. The cultural dimensions identified by
Hofstede make it possible to structure culture
according to five concepts:
- Power Distance.
- Individualism vs. Collectivism.
- Masculinity vs. Femininity.
- Uncertainty Avoidance.
- Long-term vs. Short-term Orientation.
- These are categories that organize general
cultural data. For software, we can instead
look at the software elements that are based on
culture and cultural conventions.
8 Structure of Cultural Elements in Software
9 Possible Users of the Classification
- Researchers: to evaluate the level of
internationalization of the original software and
to check the user-friendliness of localized
software.
- Software developers: to develop
better-internationalized software.
- Localizers: to adapt more cultural elements to
the target culture and to detect
internationalization bugs.
10 Formal Definition of Cultural Elements
- The international standard on procedures for
registration of cultural elements (ISO/IEC 15897)
defines locale as:
- the definition of the subset of a user's
information technology environment that depends
on language, territory, or other cultural
customs.
- A locale is usually identified by the language,
using a two-letter language code (ISO 639-1), and
by the territory, using a two-letter territory code
(ISO 3166-1).
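A minimal sketch of how such a locale identifier might be split into its language and territory parts. The identifier syntax accepted here (`lt_LT` or `lt-LT`), the `LocaleId` type and the `parse_locale` function are illustrative assumptions, not part of the standards named above; the sketch checks only the shape of the codes, not whether they are registered.

```python
import re
from typing import NamedTuple, Optional


class LocaleId(NamedTuple):
    """Hypothetical container for the two parts of a locale identifier."""
    language: str              # two-letter language code, lowercased
    territory: Optional[str]   # two-letter territory code, uppercased, or None


# Assumed syntax: two letters, optionally followed by "_" or "-" and two letters.
_LOCALE_PATTERN = re.compile(r"([A-Za-z]{2})(?:[_-]([A-Za-z]{2}))?")


def parse_locale(identifier: str) -> LocaleId:
    """Split an identifier such as 'lt_LT' into language and territory."""
    match = _LOCALE_PATTERN.fullmatch(identifier)
    if match is None:
        raise ValueError(f"not a language[_TERRITORY] identifier: {identifier!r}")
    language, territory = match.groups()
    return LocaleId(language.lower(), territory.upper() if territory else None)


if __name__ == "__main__":
    print(parse_locale("lt_LT"))
    print(parse_locale("en-gb"))
    print(parse_locale("lt"))
```

Normalizing the case on input keeps later lookups (for example, of translated strings) independent of how the user or the operating system spelled the identifier.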
11 POSIX Locale Categories
12 Set of Formal Definitions of Cultural Conventions
(FDCC), ISO/IEC 14652
- format of postal addresses
- information on measurement system
- format of writing personal names
- format for telephone numbers and other telephone
information.
13 International standard on procedures for
registration of cultural elements (ISO/IEC 15897)
- Specifies the procedures to be followed in
preparing, publishing and maintaining a register
of cultural specifications for computer use.
- The first six clauses coincide with POSIX locale
categories.
- Additional information: national or cultural
Information Technology terminology; personal
naming rules; inflection; hyphenation; spelling;
numbering; coding of national entities;
identification of persons and organizations;
electronic mail addresses; keyboard layout;
man-machine dialogue, etc.
14 Unicode CLDR (more than 100 locales registered)
- Date and time formats
- Number and currency formats
- Measurement system
- Collation specification (sorting, searching,
matching)
- Translated names for languages, territories,
scripts, timezones, and currencies
- Script and characters used by a language.
15 Locale Implementation in Software
16 Locale Defined Elements (red rectangles)
17 Language-driven elements: Alphabets and Names
- Names (identifiers of various objects in Internet
software, e.g. files, logins, passwords,
domains...) are used not only by computers, but
also by humans.
- Names in a native language and script are easier
to:
- devise,
- memorize,
- guess,
- understand,
- manipulate,
- correct, etc.
18 Restricting names to English alphabet letters
only (in outdated software)
- Forces a user not to use some or all letters of
his/her native alphabet, but allows using foreign
letters.
- Most languages (even those using Latin script)
have some extra letters, e.g., å.
- Some English letters are not used in most
languages using Latin script: usually q, w, and x.
- Makes it impossible to use characters of
non-Latin scripts.
19 The main reasons why international characters
are not used in names today
- External: some restrictions on character use in
names still exist in today's software.
- Internal: previous experience with restrictions
on names leads users to avoid national
characters in names, even when such usage
is technically possible.
20 Login Name
- Used in many web-based applications (virtual
learning environments, e-mail clients, instant
messengers, etc.).
- Characters:
- Usually only underscores, numbers, and letters
from the basic Latin alphabet are accepted.
- Some systems use the login name not only for
internal identification but also for addressing
the user in the system.
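The restriction described above can be illustrated with two validators: one that accepts only basic Latin letters, digits and underscores, and one that also accepts letters of other alphabets. This is a sketch under stated assumptions: the function names are hypothetical, and the permissive variant relies on Python's `\w`, which for text patterns matches Unicode letters and digits plus the underscore. A real system would likely add normalization and length limits.

```python
import re

# Restrictive rule: basic Latin letters, digits and underscore only.
ASCII_LOGIN = re.compile(r"[A-Za-z0-9_]+")

# Permissive rule: any Unicode letter or digit, plus underscore.
UNICODE_LOGIN = re.compile(r"\w+")


def is_valid_ascii_login(name: str) -> bool:
    """Accept only names built from the basic Latin set."""
    return ASCII_LOGIN.fullmatch(name) is not None


def is_valid_unicode_login(name: str) -> bool:
    """Accept names that may contain letters of any script."""
    return UNICODE_LOGIN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("jonas_01", "\u017eygimantas", "jo nas"):
        print(candidate,
              is_valid_ascii_login(candidate),
              is_valid_unicode_login(candidate))
```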
21 Personal Name
- Today, practically all software allows using
all letters of the alphabet to write a person's
first and last name (surname); a user should not
have to change or misspell his/her real name to
register in the system.
- However, in telecommunications many users avoid
using their native alphabet and write their names
with spelling errors.
- For example, the share of incorrectly written
names of Skype users varies from 10% to 90%,
depending on the language.
- Such widespread misspelling may be caused by
previous experience with outdated software or
by the influence of present restrictions on login
names.
22 Passwords
- Used in software that performs user
authorization (virtual learning environments,
e-mail clients, instant messengers, etc.).
- Usually may be composed of letters and digits.
- Many programs still restrict the set of letters
to the ASCII alphabet.
- Restricting the character set available
for a password reduces its security.
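The effect of the character set on password strength can be estimated with a simple calculation: a password of length n drawn uniformly from an alphabet of k symbols has n * log2(k) bits of search space. The sketch below is an illustration, not a security analysis. The function name is hypothetical, and the alphabet sizes (26 basic Latin letters, 32 letters of the Lithuanian alphabet, 10 digits) are example inputs chosen for this sketch; real passwords are rarely uniformly random.

```python
import math


def password_space_bits(alphabet_size: int, length: int) -> float:
    """Bits of search space for a uniformly random password."""
    if alphabet_size < 1 or length < 0:
        raise ValueError("alphabet_size must be >= 1 and length >= 0")
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    digits = 10
    basic_latin = 26   # lowercase basic Latin letters
    lithuanian = 32    # lowercase letters of the Lithuanian alphabet
    for label, letters in (("basic Latin", basic_latin),
                           ("Lithuanian", lithuanian)):
        bits = password_space_bits(letters + digits, 8)
        print(f"8 characters, {label} letters + digits: {bits:.1f} bits")
```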
23 Passwords (an example)
- A user usually assumes that "letters" in
this context means the letters of his/her native
language, not only those of the English alphabet.
24 File/Folder Names for
- Storing documents on a local computer
- No technical problems in today's OS.
- Exchanging documents between computers by
removable storage devices
- Works well as long as the same 8-bit encoding is
used on both computers.
- Sending documents as parts of e-mail messages or
as their attachments, or directly by instant
messengers
- No technical problems. Before sending, names are
encoded in UTF-8 (%FF sequences) without
non-ASCII letters; after receiving, they are
decoded back.
- Storing web pages or other web content on a
server
- Theoretically solved by the same method as
sending by e-mail.
- Using inside applications
- It is the developer's duty to provide
user-friendly names for visible items.
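A minimal sketch of the encode-before-sending, decode-after-receiving round trip for a file name, assuming that the escape sequences mentioned above are percent-encoded UTF-8 bytes as used in URLs. E-mail software may use other header encodings; the function names and the sample file name are illustrative only.

```python
from urllib.parse import quote, unquote


def encode_name(name: str) -> str:
    """Encode a file name as ASCII using percent-encoded UTF-8 bytes."""
    return quote(name, safe="")


def decode_name(encoded: str) -> str:
    """Restore the original file name from its percent-encoded form."""
    return unquote(encoded)


if __name__ == "__main__":
    original = "\u0105\u017euolas.txt"   # a name with two non-ASCII letters
    encoded = encode_name(original)
    print(encoded)                       # contains ASCII characters only
    print(decode_name(encoded) == original)
```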
25 Domain names
- Until 2003: letters of the Basic Latin alphabet
(26 letters), digits, dash.
- 2003: documents on using international characters
in domain names were issued (RFC 3490, RFC 3491,
RFC 3492).
- International characters (represented in Unicode)
are converted to an ASCII string (Punycode), and
before the name is shown to the user, it is
converted back to Unicode characters again:
- räksmörgås.josefsson.org →
- xn--rksmrgs-5wao1o.josefsson.org
- Problem: usage of homographs.
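The conversion above can be reproduced with Python's built-in `idna` codec, which implements the 2003 scheme (RFC 3490). The sketch also shows why homographs are a problem: two visually similar letters can be different Unicode characters. The helper names are hypothetical, and `scripts_in_label` is a crude heuristic (it takes the first word of each character's Unicode name), not a real script-detection or spoofing check.

```python
import unicodedata


def to_ascii(domain: str) -> str:
    """Convert an internationalized domain name to its ASCII (xn--) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII (xn--) domain name back to Unicode for display."""
    return domain.encode("ascii").decode("idna")


def scripts_in_label(label: str) -> set:
    """Crude heuristic: first word of the Unicode name of each letter."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


if __name__ == "__main__":
    name = "r\u00e4ksm\u00f6rg\u00e5s.josefsson.org"
    ascii_form = to_ascii(name)
    print(ascii_form)
    print(to_unicode(ascii_form) == name)

    # Latin "a" (U+0061) and Cyrillic "a" (U+0430) look alike but differ.
    print(scripts_in_label("paypal"))
    print(scripts_in_label("p\u0430ypal"))
```

A label that mixes scripts is not necessarily malicious, so a check like this can only flag names for closer inspection.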
26 Domain names in browsers
27 Semantically-expressed elements: Matching of
plural and singular forms
- English: 2 forms
- 1 object, 2 objects, 10 objects
- Lithuanian, Polish, Russian ...: 3 forms
- Some European languages, e.g. Slovenian, Maltese:
4 forms
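A sketch of how software can choose the right form for a number: each language supplies a function that maps a count to the index of a form. The English rule is straightforward; the Lithuanian rule below follows the formula commonly found in gettext catalogs and should be treated as an assumption to verify against the translation tools actually used. The function names and the sample word forms are illustrative.

```python
def plural_index_en(n: int) -> int:
    """English: form 0 for exactly one, form 1 otherwise."""
    return 0 if n == 1 else 1


def plural_index_lt(n: int) -> int:
    """Lithuanian (assumed gettext-style rule): three forms."""
    if n % 10 == 1 and n % 100 != 11:
        return 0                      # 1, 21, 31, ... (but not 11)
    if n % 10 >= 2 and (n % 100 < 10 or n % 100 >= 20):
        return 1                      # 2-9, 22-29, ...
    return 2                          # 0, 10-20, 30, ...


FORMS = {
    "en": (plural_index_en, ("object", "objects")),
    "lt": (plural_index_lt, ("objektas", "objektai", "objekt\u0173")),
}


def format_count(language: str, n: int) -> str:
    """Return the count followed by the matching word form."""
    rule, forms = FORMS[language]
    return f"{n} {forms[rule(n)]}"


if __name__ == "__main__":
    for n in (1, 2, 10, 11, 21, 22):
        print(format_count("en", n), "|", format_count("lt", n))
```

Keeping the rule next to the word forms, rather than hard-coding "if n == 1" in the program, is what lets a localizer add a language with three or four forms without touching the code.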
28 Plural and Singular Forms in Other Languages
29 Grammatical Name Forms
- In inflective languages (Lithuanian, Finnish,
Polish, etc.) names in dialog windows may appear
in various cases.
- 'Hello, Jonas' (in English) will be
- 'Sveikas, Jonai' (in Lithuanian)
30 Gender
- "%S is logged in", where %S is a user name.
- English
- John is logged in.
- Mary is logged in.
- Lithuanian (and many other languages)
- John yra prisijungęs.
- Mary yra prisijungusi.
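One way to handle the gender example above is to store a separate message variant per gender for each language instead of a single parameterized string. The sketch below assumes the application knows the user's grammatical gender; the `Gender` type, the message table and the function name are hypothetical, and the strings are the ones from the example.

```python
from enum import Enum


class Gender(Enum):
    MALE = "male"
    FEMALE = "female"


# One message variant per gender; English happens to use the same text twice.
LOGGED_IN = {
    "en": {
        Gender.MALE: "{name} is logged in.",
        Gender.FEMALE: "{name} is logged in.",
    },
    "lt": {
        Gender.MALE: "{name} yra prisijung\u0119s.",
        Gender.FEMALE: "{name} yra prisijungusi.",
    },
}


def logged_in_message(language: str, name: str, gender: Gender) -> str:
    """Pick the message variant that agrees with the user's gender."""
    return LOGGED_IN[language][gender].format(name=name)


if __name__ == "__main__":
    print(logged_in_message("en", "Mary", Gender.FEMALE))
    print(logged_in_message("lt", "John", Gender.MALE))
    print(logged_in_message("lt", "Mary", Gender.FEMALE))
```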
31 Human-sensitive Elements
- Usually not defined by national or international
standards (normative documents).
- Depend on deep cultural habits and on the
cultural conventions of a country or its
historical regions.
- They can also depend on individual persons and
should be adaptable to a person's habits.
- They are difficult to express in a formal way
(e.g. to include in a formal locale definition).
32 Some Examples
- Icons/Metaphors.
- Images, photos.
- Colour meaning.
- Usage of sounds and videos.
- Examples.
- Jokes and analogies.
- Political statements.
- Navigation scheme.
- Page layout.
- ...
33 Colour-Culture Chart (Boor, Russo, 1993)
34 Icons Example: Home Function
[Figure: Home icons in MS Internet Explorer and Mozilla Firefox, and possible Chinese icons]
35 Problems
- The elements mentioned are more difficult to
implement in Internet software than in
autonomously running software:
- they are deeply embedded in the program,
- Internet software has many links with other
software.
- Requirements:
- flexibly adaptable to software and other cultural
components,
- flexibly fitting to each other,
- flexibly chosen by the user (multiple choices).
36 Existing Ways of Solution
- Cultural Web Spider, designed to extract
information on culture-specific webpage design
elements (cultural markers) from the HTML and CSS
code of websites for a particular country domain.
This could help to create a prototyping tool for
the "look and feel" of cultural interface design
(Kondratova I., Goldfarb I., Gervais R.,
Fournier L., 2005).
- Many researchers confirm the importance of the
cultural dimensions set out by Hofstede. They are
used to create recommendations for a website's
navigation scheme and content presentation
(Marcus A., Gould E.W., and others).
37 Existing Ways of Solution
- Recent research on incorporating cultural
dimensions into global software includes attempts
to create culturally adaptive software by applying
AI mechanisms.
- It is also proposed to incorporate culture into a
user model in order to implement adaptable
personalization mechanisms, assigning Hofstede's
value for each cultural dimension according to
the user's birthplace, country of current and
former residence, languages, sex, age, political
orientation and education level (Reinecke K. et
al., 2007).
38 Conclusions
- Existing shortcomings in software
internationalization can be explained by the lack
of categories included in formal locale
definitions and by the lack of compatibility
between different locale models.
- While the developed list of cultural elements is
limited, we hope that it can help to draw more
attention to the complex set of cultural elements
when designing, localizing and testing Internet
software that is localized or intended for
localization.
- During Internet software development, special
attention should be paid not only to the
generalized set of elements defined in existing
locale models, but also to: the ability to use
international characters in object names (names
of logins, files, domains, passwords); the ability
to include a component that generates a
language's grammatical forms; and reduced usage
of parameters in localizable strings, because the
rules for composing words and phrases differ
between cultures.
- Another direction for future work could be some
formalization of the human-sensitive elements
used in software.