PHP Tutorials - Tutorial Notes - Non ASCII Characters as String Literals
This chapter explains:
- Basic Rules
- French Characters in String Literals - UTF-8 Encoding
- French Characters in String Literals - ISO-8859-1 Encoding
- Chinese Characters in String Literals - UTF-8 Encoding
- Chinese Characters in String Literals - GB2312 Encoding
- Characters of Multiple Languages in String Literals
Basic Rules
As you saw in the previous chapters, if PHP scripts are involved in a Web based application,
they are always used behind a Web server. PHP scripts are expected to generate HTML documents and
pass them back to the Web server. There are about four ways non ASCII characters can get into the HTML document
through PHP scripts: a) Enter them as string literals; b) Receive them from the HTTP request; c) Retrieve them from files;
d) Retrieve them from a database.
In this chapter, we will concentrate on how to handle non ASCII characters in PHP scripts as string literals.
Here are the steps involved in this scenario:
A1. Key Sequences from keyboard
|
|- Text editor
v
A2. PHP File
|
|- PHP CGI engine
v
A3. HTML Document
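The steps above can be sketched with a very small PHP file. This is only an illustration, not a sample from these notes: the file name, the greeting text and the charset declaration in the HTML header are all assumptions. The non ASCII character is written with \x escapes so that the example does not depend on which editor or encoding you used to save it.

```php
<?php
// Hypothetical file: hello-french.php (the name is an assumption).
// Step A1 to A2: the string literal below ends up in the PHP file as UTF-8 bytes.
// "\xC3\xA8" is the 2-byte UTF-8 encoding of "e with grave".
$greeting = "Tr\xC3\xA8s bien";

// Step A2 to A3: the PHP engine copies those bytes into the generated HTML document.
// Declaring charset=utf-8 for the browser is an assumption of this sketch.
$html = "<html><head>"
      . "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"/>"
      . "</head><body><p>" . $greeting . "</p></body></html>\n";

print($html);
```

If the file were saved in ISO-8859-1 instead, the same character would be the single byte "\xE8", and the charset declared in the HTML document would have to change accordingly.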
Based on my experience, there are some basic rules related to those steps:
1. You must decide on the character encoding schema to be used in your PHP script file.
For most languages, you have two options, a: use an encoding schema specific to that language;
b: use a Unicode schema. For example, you can use either GB2312 (a simplified Chinese character schema)
or UTF-8 (a Unicode character schema) for Chinese characters. My suggestion used to be "a". But today,
I am suggesting "b", because a Unicode schema can support all characters of all languages.
2. From step "A1" to "A2", you need to select a good text editor that supports the encoding schema you have
decided on. The end goal of this step is simple - characters in string literals must be stored in the PHP
file using the decided encoding schema.
Don't underestimate the difficulty level of this step. It could be very frustrating,
because most computer keyboards support alphabetic letters only. You may have to use some language specific
input software to translate alphabetic letters into language specific characters. The editor sometimes may
also store characters in memory in one encoding schema, and offer you a different encoding schema when saving
files to the hard disk.
3. The string data type is defined as a sequence of bytes in PHP, like in the C language. This is different from
the Java language, where the string data type is defined as a sequence of Unicode characters. String literals in
PHP are also taken as sequences of bytes. This is a nice feature. It allows us to enter non ASCII characters
in string literals in any encoding schema.
4. All PHP built-in string functions assume that strings are sequences of bytes. For example, strlen()
returns the number of bytes of the given string, not the number of characters of a specific language.
To manage strings as sequences of characters, we need to use Multibyte String functions, mb_*().
5. From step "A2" to "A3", HTML documents are generated from PHP scripts mainly through the print() function.
The print() function will copy every byte from the specified string to the HTML document unchanged. This guarantees
that any non ASCII characters encoded in any encoding schema will be copied correctly to the HTML document.
Again, this is different from JSP pages, where strings will be converted into a byte stream based on a specified
encoding schema, if you are using character based output stream functions.
6. If you do want to convert from one encoding schema to another encoding schema during the print() function
call, you can use mb_output_handler as the callback function on the output buffer: ob_start("mb_output_handler").
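Rules 3, 4 and 6 can be illustrated with a short script. This is a minimal sketch, assuming the Multibyte String (mbstring) extension is loaded; the variable names and the choice of UTF-8 to ISO-8859-1 conversion are only for illustration. The outer output buffer is there just to capture what the handler produces so that it can be shown as hex.

```php
<?php
// Assumes the mbstring extension is available.
// "\xC3\xA8" is "e with grave" in UTF-8; "\xE8" is the same character in ISO-8859-1.
$utf8   = "\xC3\xA8";
$latin1 = "\xE8";

// Rules 3 and 4: strings are byte sequences, so strlen() counts bytes,
// while mb_strlen() counts characters in the encoding you name.
$byteCount   = strlen($utf8);             // 2 bytes
$charCount   = mb_strlen($utf8, "UTF-8"); // 1 character
$latin1Bytes = strlen($latin1);           // 1 byte

// Rule 6: ask the output buffer to convert from the internal encoding
// to the HTTP output encoding while print() is writing.
mb_internal_encoding("UTF-8");   // encoding of the strings in this script
mb_http_output("ISO-8859-1");    // encoding wanted in the generated document

ob_start();                      // outer buffer: only captures the result for inspection
ob_start("mb_output_handler");   // inner buffer: runs the conversion callback
print($utf8);
ob_end_flush();                  // passes the handler's output to the outer buffer
$converted = ob_get_clean();

printf("bytes=%d chars=%d latin1 bytes=%d\n", $byteCount, $charCount, $latin1Bytes);
printf("after output handler (hex): %s\n", bin2hex($converted));
```

Whether the handler actually converts can depend on the content type of the response and on your PHP configuration, so it is worth checking the hex output in your own environment rather than assuming it.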
(Continued on next part...)