xml_parser_create

(PHP 4, PHP 5)

xml_parser_create - Create an XML parser

Description

resource xml_parser_create ([ string $encoding ] )

xml_parser_create() creates a new XML parser and returns a resource handle referencing it to be used by the other XML functions.

Parameters

encoding

The optional encoding specifies the character encoding for the input/output in PHP 4. Starting from PHP 5, the input encoding is automatically detected, so the encoding parameter specifies only the output encoding. In PHP 4, the default output encoding is the same as the input charset. If an empty string is passed, the parser attempts to identify the document's encoding by looking at its first 3 or 4 bytes. In PHP 5.0.0 and 5.0.1, the default output charset is ISO-8859-1, while in PHP 5.0.2 and later it is UTF-8. The supported encodings are ISO-8859-1, UTF-8 and US-ASCII.

Return Values

Returns a resource handle for the new XML parser.
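
As a minimal sketch of how the returned handle is passed to the other XML functions, the following checks whether a string is well-formed XML. The helper name check_well_formed() and the sample input are made up for illustration; only the xml_* calls belong to the extension.

```php
<?php
// Hypothetical helper: returns null when $xml parses cleanly,
// otherwise a short error message with the line number.
function check_well_formed($xml)
{
    $parser = xml_parser_create('UTF-8');

    // The third argument (true) marks this chunk as the final piece of input.
    $ok = (bool) xml_parse($parser, $xml, true);

    $message = null;
    if (!$ok) {
        $message = xml_error_string(xml_get_error_code($parser))
            . ' at line ' . xml_get_current_line_number($parser);
    }

    // Where the parser is a resource, release it explicitly.
    if (is_resource($parser)) {
        xml_parser_free($parser);
    }

    return $message;
}

var_dump(check_well_formed('<a><b/></a>'));
var_dump(check_well_formed('<a><b></a>'));
```

The is_resource() guard is a defensive choice so that the same sketch can run whether the handle is a resource or not.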

See Also


User Contributed Notes 6 notes

marek995 at seznam dot cz
4 years ago
I created a function that combines xml_parser_create and all the functions around it.

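<?php
function html_parse($file)
{
    $array = str_split($file, 1); // walk the input one character at a time
    $count = false;               // true only directly after "<"
    $text = "";                   // characters collected since the last delimiter
    $end = false;                 // true when the current tag is a closing tag
    foreach ($array as $temp) {
        switch ($temp) {
            case "<":
                // Everything collected so far is text between two tags.
                between($text);
                $text = "";
                $count = true;
                $end = false;
                break;
            case ">":
                if ($end == true) {
                    end_tag($text);
                } else {
                    start_tag($text);
                }
                $text = "";
                break;
            case "/":
                // A slash right after "<" marks a closing tag; any other slash is kept.
                if ($count == true) {
                    $end = true;
                } else {
                    $text = $text . "/";
                }
                break;
            default:
                $count = false;
                $text = $text . $temp;
        }
    }
}
?>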
The input value is a string.
It calls the functions start_tag(), between() and end_tag(), just like the original XML parser.

But it has a few differences:
  - It does NOT validate the markup. It just passes the values on to those three functions, whether or not they are correct
  - It works with attributes. For example, from the tag <sth b="42"> it sends sth b="42"
  - It works with diacritics. The original parser sometimes broke the text off before the first diacritic appeared.
  - It works with any encoding. If the input is UTF-8, the output will be UTF-8 too
  - It works with strings, not with file pointers.
  - No "Reserved XML name" error
  - No doctype needed
  - It does not handle comments, processing instructions etc. Just the tags

The handler functions are defined like this:

<?php
function between($stuff) {}
?>

They take no other parameters.
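
For comparison, here is a rough sketch of the same three callbacks wired to the built-in parser. The class TagCollector and the function collect_tags() are hypothetical names; the method names only mirror the note's start_tag(), between() and end_tag(). The main difference it shows is that the built-in parser hands over the tag name and an attribute array separately, rather than the raw text sth b="42".

```php
<?php
// Hypothetical collector that records every callback in order.
class TagCollector
{
    public $log = array();

    public function start_tag($parser, $name, $attrs)
    {
        // Attributes arrive as a name => value array, not as raw text.
        $pairs = array();
        foreach ($attrs as $key => $value) {
            $pairs[] = $key . '=' . $value;
        }
        $suffix = $pairs ? ' [' . implode(' ', $pairs) . ']' : '';
        $this->log[] = 'start ' . $name . $suffix;
    }

    public function between($parser, $text)
    {
        $this->log[] = 'text ' . $text;
    }

    public function end_tag($parser, $name)
    {
        $this->log[] = 'end ' . $name;
    }
}

function collect_tags($xml)
{
    $collector = new TagCollector();
    $parser = xml_parser_create('UTF-8');

    // Keep tag and attribute names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        array($collector, 'start_tag'),
        array($collector, 'end_tag')
    );
    xml_set_character_data_handler($parser, array($collector, 'between'));

    xml_parse($parser, $xml, true);
    if (is_resource($parser)) {
        xml_parser_free($parser);
    }

    return $collector->log;
}

print_r(collect_tags('<sth b="42">hi</sth>'));
```

Note that the character data handler may be called more than once for a long run of text, so real code should append rather than overwrite.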
php at stock-consulting dot com
9 years ago
Even though I passed "UTF-8" as the encoding type, PHP (version 4.3.3) did *not* treat the input file as UTF-8. The input file was missing the BOM header bytes (which may indeed be omitted according to RFC 3629, but things are a bit unclear there; the RFC seems to make mere recommendations concerning the BOM header). If you want to be sure that PHP treats a UTF-8 encoded file correctly, make sure that it begins with the corresponding 3-byte BOM header (0xEF 0xBB 0xBF).
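
A small sketch of that check, assuming the file contents are already in a string. The helper names has_utf8_bom() and ensure_utf8_bom() are invented here, and whether the BOM is actually needed depends on the PHP version, as the note above only reports behaviour for 4.3.3.

```php
<?php
// Hypothetical helper: true when $data starts with the 3-byte UTF-8 BOM.
function has_utf8_bom($data)
{
    return strncmp($data, "\xEF\xBB\xBF", 3) === 0;
}

// Hypothetical helper: prepends the BOM only when it is missing.
function ensure_utf8_bom($data)
{
    return has_utf8_bom($data) ? $data : "\xEF\xBB\xBF" . $data;
}

$xml = ensure_utf8_bom('<doc>text</doc>');
var_dump(has_utf8_bom($xml));
```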
jcalvert at gmx dot net
10 years ago
To maintain compatibility between PHP 4 and PHP 5 you should always pass a string argument to this function. PHP 4 autodetects the format of the input if you leave it out, whereas PHP 5 will assume the format to be ISO-8859-1 (and choke on the byte order mark of UTF-8 files).

Calling the function as <?php $res = xml_parser_create('') ?> will cause both versions of PHP to autodetect the format.
Anonymous
8 years ago
I'd also recommend setting the option below:
xml_parser_set_option($parser,XML_OPTION_SKIP_WHITE,1);
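
A sketch of where that call fits, using xml_parse_into_struct() so the effect is visible in the resulting array. The function parse_values() and the sample document are assumptions for illustration; the exact entries produced for whitespace can vary between PHP versions, so no output is shown.

```php
<?php
// Hypothetical helper: parses $xml into the flat "values" array,
// optionally asking the parser to skip whitespace-only text.
function parse_values($xml, $skipWhite)
{
    $parser = xml_parser_create('UTF-8');
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, $skipWhite ? 1 : 0);

    $values = array();
    xml_parse_into_struct($parser, $xml, $values);

    if (is_resource($parser)) {
        xml_parser_free($parser);
    }

    return $values;
}

$xml = "<a>\n  <b>x</b>\n</a>";
print_r(parse_values($xml, true));
```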
Tobbe
9 years ago
The above "XML to array" code does not work properly if you have several tags on the same level and with the same name, example:

<currenterrors>
<error>
<description>This is a real error...</description>
</error>
<error>
<description>This is a second error...</description>
</error>
<error>
<description>Lots of errors today...</description>
</error>
<error>
<description>This is the last error...</description>
</error>
</currenterrors>

It will then only display the first <error> tag.
In this case you will need to number the tags automatically, or keep a separate array for each new element.
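
One possible way to do that numbering, sketched with xml_parse_into_struct(): every matching element is appended to a list, so repeated siblings get consecutive integer keys instead of overwriting each other. The function name collect_descriptions() is hypothetical and this is not the "XML to array" code the note refers to.

```php
<?php
// Hypothetical helper: returns the text of every <description> element,
// in document order, as a numerically indexed array.
function collect_descriptions($xml)
{
    $parser = xml_parser_create('UTF-8');
    // Keep tag names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    $values = array();
    xml_parse_into_struct($parser, $xml, $values);

    if (is_resource($parser)) {
        xml_parser_free($parser);
    }

    $descriptions = array();
    foreach ($values as $node) {
        // "complete" marks an element with no child elements.
        if ($node['tag'] === 'description' && $node['type'] === 'complete') {
            // Appending with [] numbers the entries automatically.
            $descriptions[] = isset($node['value']) ? $node['value'] : '';
        }
    }

    return $descriptions;
}

$xml = '<currenterrors>'
    . '<error><description>This is a real error...</description></error>'
    . '<error><description>This is a second error...</description></error>'
    . '</currenterrors>';
print_r(collect_descriptions($xml));
```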
juanhdv at NOSPAM dot divvol dot org
7 years ago
In PHP 5, when your XML file includes the declaration '<?xml version="1.0" encoding="ISO-8859-1" ?>', I'd also recommend setting the option below:

xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, "ISO-8859-1");

It works fine!

If your encoding is 'UTF-8', just replace 'ISO-8859-1' with it.
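
A sketch of the effect, assuming PHP 5.0.2 or later so that the input encoding is detected from the XML declaration. The helper parse_text_as() is a made-up name; it returns the text of the root element in whichever target encoding is requested.

```php
<?php
// Hypothetical helper: parses $xml and returns the root element's text
// encoded in $targetEncoding.
function parse_text_as($xml, $targetEncoding)
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $targetEncoding);

    $values = array();
    xml_parse_into_struct($parser, $xml, $values);

    if (is_resource($parser)) {
        xml_parser_free($parser);
    }

    return isset($values[0]['value']) ? $values[0]['value'] : null;
}

// "\xE9" is the letter e with an acute accent in ISO-8859-1.
$xml = '<?xml version="1.0" encoding="ISO-8859-1" ?><a>' . "\xE9" . '</a>';

// Compare the byte lengths of the same character in the two target encodings.
var_dump(strlen(parse_text_as($xml, 'ISO-8859-1')));
var_dump(strlen(parse_text_as($xml, 'UTF-8')));
```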