
preg_replace

(PHP 4, PHP 5)

preg_replace - Perform a regular expression search and replace

Description

mixed preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] )

Searches subject for matches to pattern and replaces them with replacement.

Parameters

pattern

The pattern to search for. It can be either a string or an array of strings.

Several PCRE modifiers are also available, including the deprecated 'e' (PREG_REPLACE_EVAL), which is specific to this function.

replacement

The string or an array of strings to replace with. If this parameter is a string and the pattern parameter is an array, all patterns will be replaced by that string. If both the pattern and replacement parameters are arrays, each pattern will be replaced by its replacement counterpart. If there are fewer elements in the replacement array than in the pattern array, any extra patterns will be replaced by an empty string.
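The pairing rules for array arguments can be sketched as follows. This is only an illustration; the patterns and the sample subject are made up for the example.

```php
<?php
// Three patterns but only two replacements: the third pattern has no
// counterpart, so its matches are replaced by an empty string.
$patterns     = array('/red/', '/green/', '/blue/');
$replacements = array('rojo', 'verde');

$result = preg_replace($patterns, $replacements, 'red green blue');
var_dump($result); // string(11) "rojo verde "

// A single replacement string is applied to every pattern.
$masked = preg_replace($patterns, '*', 'red green blue');
var_dump($masked); // string(5) "* * *"
?>
```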

replacement may contain references of the form \\n or (since PHP 4.0.4) $n, with the latter form being the preferred one. Every such reference will be replaced by the text captured by the n'th parenthesized pattern. n can be from 0 to 99, and \\0 or $0 refers to the text matched by the whole pattern. Opening parentheses are counted from left to right (starting from 1) to obtain the number of the capturing subpattern. To use a backslash in the replacement, it must be doubled ("\\\\" PHP string).

When working with a replacement pattern where a backreference is immediately followed by another number (i.e. placing a literal number immediately after a matched pattern), you cannot use the familiar \\1 notation for your backreference. \\11, for example, would confuse preg_replace() since it does not know whether you want the \\1 backreference followed by a literal 1, or the \\11 backreference followed by nothing. In this case the solution is to use \${1}1. This creates an isolated $1 backreference, leaving the 1 as a literal.

When using the deprecated e modifier, this function escapes some characters (namely ', ", \ and NULL) in the strings that replace the backreferences. This is done to ensure that no syntax errors arise from backreference usage with either single or double quotes (e.g. 'strlen(\'$1\')+strlen("$2")'). Make sure you are aware of PHP's string syntax to know exactly how the interpreted string will look.

subject

The string or an array of strings to search and replace.

If subject is an array, then the search and replace is performed on every entry of subject, and the return value is an array as well.

limit

The maximum possible replacements for each pattern in each subject string. Defaults to -1 (no limit).

count

If specified, this variable will be filled with the number of replacements done.

Return Values

preg_replace() returns an array if the subject parameter is an array, or a string otherwise.

If matches are found, the new subject will be returned; otherwise subject will be returned unchanged, or NULL if an error occurred.
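Telling the NULL error result apart from a plain "no match" could look like the sketch below. The invalid UTF-8 subject is just one convenient way to provoke an error for illustration; other failures are reported through preg_last_error() in the same way.

```php
<?php
// "\xff" is not valid UTF-8, so matching with the u modifier fails
// and preg_replace() returns NULL.
$result = preg_replace('/a/u', 'b', "abc\xff");

if ($result === null) {
    // preg_last_error() reports why the call failed.
    $isUtf8Error = (preg_last_error() === PREG_BAD_UTF8_ERROR);
    echo $isUtf8Error ? "invalid UTF-8 in subject\n" : "other PCRE error\n";
}

// No match is not an error: the subject comes back unchanged.
$unchanged = preg_replace('/x/', 'y', 'abc');
var_dump($unchanged); // string(3) "abc"
?>
```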

Errors/Exceptions

An E_DEPRECATED level error is emitted when passing in the "e" modifier.

Changelog

Version Description
5.5.0 The /e modifier is deprecated. Use preg_replace_callback() instead. See the PREG_REPLACE_EVAL documentation for additional information about security risks.
5.1.0 Added the count parameter
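As a rough sketch of the migration suggested for 5.5.0, a replacement that used to be evaluated through /e can be moved into a callback. The function name upper_first_letter and the sample text are hypothetical, chosen only for this illustration.

```php
<?php
// Old, deprecated form (the replacement string is evaluated as PHP code):
//   preg_replace('/\b[a-z]/e', 'strtoupper("$0")', $text);

// Callback form: $m[0] is the whole match, $m[1] the first group, and so on.
function upper_first_letter(array $m)
{
    return strtoupper($m[0]);
}

$text   = 'search and replace';
$result = preg_replace_callback('/\b[a-z]/', 'upper_first_letter', $text);
echo $result; // Search And Replace
?>
```

The callback receives the matches as an array, so no code is built from the subject text.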

Examples

Example #1 Using backreferences followed by numeric literals

<?php
$cadena = 'Abril 15, 2003';
$patrón = '/(\w+) (\d+), (\d+)/i';
$sustitución = '${1}1,$3';
echo preg_replace($patrón, $sustitución, $cadena);
?>

The above example will output:

Abril1,2003

Example #2 Using indexed arrays with preg_replace()

<?php
$cadena = 'El veloz murciélago hindú comía feliz cardillo y kiwi.';
$patrones = array();
$patrones[0] = '/veloz/';
$patrones[1] = '/hindú/';
$patrones[2] = '/murciélago/';
$sustituciones = array();
$sustituciones[2] = 'galápago';
$sustituciones[1] = 'africano';
$sustituciones[0] = 'lento';
echo preg_replace($patrones, $sustituciones, $cadena);
?>

The above example will output:

El galápago lento africano comía feliz cardillo y kiwi.

By ksorting patterns and replacements, we should get what we wanted.

<?php
ksort($patrones);
ksort($sustituciones);
echo preg_replace($patrones, $sustituciones, $cadena);
?>

The above example will output:

El lento galápago africano comía feliz cardillo y kiwi.

Example #3 Replacing several values

<?php
$patrones = array ('/(19|20)(\d{2})-(\d{1,2})-(\d{1,2})/',
                   '/^\s*{(\w+)}\s*=/');
$sustitución = array ('\4/\3/\1\2', '$\1 =');
echo preg_replace($patrones, $sustitución, '{fechaInicio} = 1999-5-27');
?>

The above example will output:

$fechaInicio = 27/5/1999

Example #4 Strip whitespace

This example strips excess whitespace from a string.

<?php
$cadena = 'foo   o';
$cadena = preg_replace('/\s\s+/', ' ', $cadena);
// This will be 'foo o' now
echo $cadena;
?>

Example #5 Using the count parameter

<?php
$cuenta = 0;

echo preg_replace(array('/\d/', '/\s/'), '*', 'xp 4 to', -1, $cuenta);
echo $cuenta; //3
?>

The above example will output:

xp***to
3

Notes

Note:

When using arrays with pattern and replacement, the keys are processed in the order they appear in the array. This is not necessarily the same as the numerical index order. If you use indexes to identify which pattern should be replaced by which replacement, you should perform a ksort() on each array prior to calling preg_replace().

See also


User Contributed Notes 63 notes

arkani at iol dot pt
5 years ago
Because I searched a lot for this:

The following should be escaped if you are trying to match that character

\ ^ . $ | ( ) [ ]
* + ? { } ,

Special Character Definitions
\ Quote the next metacharacter
^ Match the beginning of the line
. Match any character (except newline)
$ Match the end of the line (or before newline at the end)
| Alternation
() Grouping
[] Character class
* Match 0 or more times
+ Match 1 or more times
? Match 1 or 0 times
{n} Match exactly n times
{n,} Match at least n times
{n,m} Match at least n but not more than m times
More Special Character Stuff
\t tab (HT, TAB)
\n newline (LF, NL)
\r return (CR)
\f form feed (FF)
\a alarm (bell) (BEL)
\e escape (think troff) (ESC)
\033 octal char (think of a PDP-11)
\x1B hex char
\c[ control char
\l lowercase next char (think vi)
\u uppercase next char (think vi)
\L lowercase till \E (think vi)
\U uppercase till \E (think vi)
\E end case modification (think vi)
\Q quote (disable) pattern metacharacters till \E
Even More Special Characters
\w Match a "word" character (alphanumeric plus "_")
\W Match a non-word character
\s Match a whitespace character
\S Match a non-whitespace character
\d Match a digit character
\D Match a non-digit character
\b Match a word boundary
\B Match a non-(word boundary)
\A Match only at beginning of string
\Z Match only at end of string, or before newline at the end
\z Match only at end of string
\G Match only where previous m//g left off (works only with /g)
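Rather than escaping metacharacters by hand, a literal string can be passed through preg_quote() before it is placed in a pattern. A minimal sketch, with a made-up needle and subject:

```php
<?php
// Hypothetical input: a literal string we want to find, metacharacters and all.
$needle  = 'cost (USD) $5.00?';
$subject = 'Total cost (USD) $5.00? Yes.';

// The second argument tells preg_quote() to escape our delimiter too.
$pattern = '/' . preg_quote($needle, '/') . '/';

$result = preg_replace($pattern, '[price]', $subject);
echo $result; // Total [price] Yes.
?>
```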
hello at weblap dot ro
4 years ago
Post slug generator, for creating clean urls from titles.
It works with many languages.

<?php
function remove_accent($str)
{
 
$a = array('À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ', 'Ç', 'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î', 'Ï', 'Ð', 'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö', 'Ø', 'Ù', 'Ú', 'Û', 'Ü', 'Ý', 'ß', 'à', 'á', 'â', 'ã', 'ä', 'å', 'æ', 'ç', 'è', 'é', 'ê', 'ë', 'ì', 'í', 'î', 'ï', 'ñ', 'ò', 'ó', 'ô', 'õ', 'ö', 'ø', 'ù', 'ú', 'û', 'ü', 'ý', 'ÿ', 'Ā', 'ā', 'Ă', 'ă', 'Ą', 'ą', 'Ć', 'ć', 'Ĉ', 'ĉ', 'Ċ', 'ċ', 'Č', 'č', 'Ď', 'ď', 'Đ', 'đ', 'Ē', 'ē', 'Ĕ', 'ĕ', 'Ė', 'ė', 'Ę', 'ę', 'Ě', 'ě', 'Ĝ', 'ĝ', 'Ğ', 'ğ', 'Ġ', 'ġ', 'Ģ', 'ģ', 'Ĥ', 'ĥ', 'Ħ', 'ħ', 'Ĩ', 'ĩ', 'Ī', 'ī', 'Ĭ', 'ĭ', 'Į', 'į', 'İ', 'ı', 'IJ', 'ij', 'Ĵ', 'ĵ', 'Ķ', 'ķ', 'Ĺ', 'ĺ', 'Ļ', 'ļ', 'Ľ', 'ľ', 'Ŀ', 'ŀ', 'Ł', 'ł', 'Ń', 'ń', 'Ņ', 'ņ', 'Ň', 'ň', 'ʼn', 'Ō', 'ō', 'Ŏ', 'ŏ', 'Ő', 'ő', 'Œ', 'œ', 'Ŕ', 'ŕ', 'Ŗ', 'ŗ', 'Ř', 'ř', 'Ś', 'ś', 'Ŝ', 'ŝ', 'Ş', 'ş', 'Š', 'š', 'Ţ', 'ţ', 'Ť', 'ť', 'Ŧ', 'ŧ', 'Ũ', 'ũ', 'Ū', 'ū', 'Ŭ', 'ŭ', 'Ů', 'ů', 'Ű', 'ű', 'Ų', 'ų', 'Ŵ', 'ŵ', 'Ŷ', 'ŷ', 'Ÿ', 'Ź', 'ź', 'Ż', 'ż', 'Ž', 'ž', 'ſ', 'ƒ', 'Ơ', 'ơ', 'Ư', 'ư', 'Ǎ', 'ǎ', 'Ǐ', 'ǐ', 'Ǒ', 'ǒ', 'Ǔ', 'ǔ', 'Ǖ', 'ǖ', 'Ǘ', 'ǘ', 'Ǚ', 'ǚ', 'Ǜ', 'ǜ', 'Ǻ', 'ǻ', 'Ǽ', 'ǽ', 'Ǿ', 'ǿ');
 
$b = array('A', 'A', 'A', 'A', 'A', 'A', 'AE', 'C', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I', 'D', 'N', 'O', 'O', 'O', 'O', 'O', 'O', 'U', 'U', 'U', 'U', 'Y', 's', 'a', 'a', 'a', 'a', 'a', 'a', 'ae', 'c', 'e', 'e', 'e', 'e', 'i', 'i', 'i', 'i', 'n', 'o', 'o', 'o', 'o', 'o', 'o', 'u', 'u', 'u', 'u', 'y', 'y', 'A', 'a', 'A', 'a', 'A', 'a', 'C', 'c', 'C', 'c', 'C', 'c', 'C', 'c', 'D', 'd', 'D', 'd', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'G', 'g', 'G', 'g', 'G', 'g', 'G', 'g', 'H', 'h', 'H', 'h', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'IJ', 'ij', 'J', 'j', 'K', 'k', 'L', 'l', 'L', 'l', 'L', 'l', 'L', 'l', 'l', 'l', 'N', 'n', 'N', 'n', 'N', 'n', 'n', 'O', 'o', 'O', 'o', 'O', 'o', 'OE', 'oe', 'R', 'r', 'R', 'r', 'R', 'r', 'S', 's', 'S', 's', 'S', 's', 'S', 's', 'T', 't', 'T', 't', 'T', 't', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'W', 'w', 'Y', 'y', 'Y', 'Z', 'z', 'Z', 'z', 'Z', 'z', 's', 'f', 'O', 'o', 'U', 'u', 'A', 'a', 'I', 'i', 'O', 'o', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'A', 'a', 'AE', 'ae', 'O', 'o');
  return str_replace($a, $b, $str);
}

function post_slug($str)
{
  return strtolower(preg_replace(array('/[^a-zA-Z0-9 -]/', '/[ -]+/', '/^-|-$/'),
    array('', '-', ''), remove_accent($str)));
}
?>

Example: post_slug(' -Lo#&@rem  IPSUM //dolor-/sit - amet-/-consectetur! 12 -- ')
will output: lorem-ipsum-dolor-sit-amet-consectetur-12
denis_truffaut a t hotmail d o t com
2 years ago
If you want to match characters from any script (European, Russian, Chinese, Japanese, Korean or whatever), just:
- use mb_internal_encoding('UTF-8');
- use preg_replace('`...`u', '...', $string) with the u (Unicode) modifier

For further information, the complete list of preg_* modifiers can be found at:
http://php.net/manual/en/reference.pcre.pattern.modifiers.php
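A small sketch of the u modifier in use, assuming the script file itself is saved as UTF-8. The sample text is made up, and \p{L} (any letter, in any script) is one possible choice of character class.

```php
<?php
$text = 'Привет, мир! 123 日本語';

// With the u modifier the pattern and subject are treated as UTF-8,
// and \p{L} matches a letter from any script.
$lettersOnly = preg_replace('/[^\p{L}]+/u', ' ', $text);
echo $lettersOnly; // Привет мир 日本語
?>
```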
akniep at rayo dot info
6 years ago
preg_replace (and other preg functions) return NULL instead of a string when they encounter problems you probably did not think about!
-------------------------

It may not be obvious to everybody that the function returns NULL if an error of any kind occurs. An error I stumble upon quite often is the backtrack limit:
http://de.php.net/manual/de/pcre.configuration.php#ini.pcre.backtrack-limit

When parsing HTML documents, you may encounter documents longer than 100,000 characters, and certain regular expressions may then fail because of the backtrack limit above.

A regular expression that is ungreedy ("U", http://de.php.net/manual/de/reference.pcre.pattern.modifiers.php) often does the job, but still: sometimes you just need a greedy regular expression working on long strings ...

Since an unhandled return value of NULL usually causes a follow-on error in the application, with unwanted and unforeseen consequences, I found the following solution quite helpful; at the very least it saves the application from crashing:

<?php

$string_after = preg_replace( '/some_regexp/', "replacement", $string_before );

// if some error occurred we go on working with the unchanged original string
if (PREG_NO_ERROR !== preg_last_error())
{
    $string_after = $string_before;

    // put email-sending or a log-message here
} //if

// free memory
unset( $string_before );

?>

You may (or should) also log a message or send an email inside the if block, so that you are informed whenever one of your regular expressions does not have the effect you wanted.
spamthishard at wtriple dot com
1 year ago
If you want to replace only the n-th occurrence of $pattern, you can use this function:

<?php

function preg_replace_nth($pattern, $replacement, $subject, $nth=1) {
    return preg_replace_callback($pattern,
        function($found) use (&$pattern, &$replacement, &$nth) {
            $nth--;
            if ($nth==0) return preg_replace($pattern, $replacement, reset($found));
            return reset($found);
        },
        $subject, $nth);
}

echo preg_replace_nth("/(\w+)\|/", '${1} is the 4th|', "|aa|b|cc|dd|e|ff|gg|kkk|", 4);

?>

this outputs |aa|b|cc|dd is the 4th|e|ff|gg|kkk|
backreferences are accepted in $replacement
Terminux (dot) anonymous at gmail
3 years ago
This function will strip all the HTML-like content in a string.
I know you can find a lot of similar content on the web, but this one is simple, fast and robust. Don't simply rely on built-in functions like strip_tags(); they do not work very well.

Careful, however: this is not a proper validation of a string; you should use additional functions like mysql_real_escape_string and filter_var, as well as custom tests, before putting a submission into your database.

<?php

$html = <<<END
<div id="function.preg-split" class="refentry"> Bonjour1 \t
<div class="refnamediv"> Bonjour2 \t
<h1 class="refname">Bonjour3 \t</h1>
<h1 class=""">Bonjour4 \t</h1>
<h1 class="*%1">Bonjour5 \t</h1>
<body>Bonjour6 \t<//body>>
</ body>Bonjour7 \t<////        body>>
<
a href="image.php" alt="trans" /        >
some leftover text...
     < DIV class=noCompliant style = "text-align:left;" >
... and some other ...
< dIv > < empty>  </ empty>
  <p> This is yet another text <br  >
     that wasn't <b>compliant</b> too... <br   />
     </p>
<div class="noClass" > this one is better but we don't care anyway </div ><P>
    <input   type= "text"  name ='my "name' value  = "nothin really." readonly>
end of paragraph </p> </Div>   </div>   some trailing text
END;

// This echoes correctly all the text that is not inside HTML tags
$html_reg = '/<+\s*\/*\s*([A-Z][A-Z0-9]*)\b[^>]*\/*\s*>+/i';
echo htmlentities( preg_replace( $html_reg, '', $html ) );

// This extracts only a small portion of the text
echo htmlentities(strip_tags($html));

?>
gabe at mudbuginfo dot com
10 years ago
It is useful to note that the 'limit' parameter, when used with 'pattern' and 'replace' which are arrays, applies to each individual pattern in the patterns array, and not the entire array.
<?php

$pattern = array('/one/', '/two/');
$replace = array('uno', 'dos');
$subject = "test one, one two, one two three";

echo preg_replace($pattern, $replace, $subject, 1);
?>

If limit were applied to the whole array (which it isn't), it would return:
test uno, one two, one two three

However, in reality this will actually return:
test uno, one dos, one two three
Ray dot Paseur at SometimesUsesGmail dot com
2 years ago
Please see Example #4 Strip whitespace.  This works as designed, but if you are using Windows, it may not work as expected.  The potential "gotcha" is the CR/LF line endings.  On a Unix system, where there is only a single character line ending, that regex pattern will preserve line endings.  On Windows, it may strip line endings.
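The difference can be reproduced with a short sketch. The sample strings are made up, and the space-and-tab class is just one possible way to leave line endings untouched.

```php
<?php
$unix    = "one  two\nthree";
$windows = "one  two\r\nthree";

// Example #4's pattern: "\r\n" is two whitespace characters in a row,
// so on Windows-style input the line ending is collapsed as well.
$unixResult    = preg_replace('/\s\s+/', ' ', $unix);    // "one two\nthree"
$windowsResult = preg_replace('/\s\s+/', ' ', $windows); // "one two three"

// Restricting the class to spaces and tabs leaves any line ending alone.
$kept = preg_replace('/[ \t]{2,}/', ' ', $windows);      // "one two\r\nthree"

echo json_encode(array($unixResult, $windowsResult, $kept)), "\n";
?>
```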
dyer85 at gmail dot com
6 years ago
There seems to be some unexpected behavior when using the /m modifier when the line terminators are win32 or mac format.

If you have a string like below, and try to replace dots, the regex won't replace correctly:

<?php
$s = "Testing, testing.\r\n"
   . "Another testing line.\r\n"
   . "Testing almost done.";

echo preg_replace('/\.$/m', '.@', $s); // only last . replaced
?>

The /m modifier doesn't seem to work properly when CRLFs or CRs are used. Make sure to convert line endings to LFs (*nix format) in your input string.
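If converting the input is not an option, one possible workaround (a sketch, not the only approach) is to allow an optional CR before the end of the line with a lookahead:

```php
<?php
$s = "Testing, testing.\r\n"
   . "Another testing line.\r\n"
   . "Testing almost done.";

// (?=\r?$) accepts an optional CR between the dot and the end of the line,
// so the pattern works for both LF and CRLF line endings.
$fixed = preg_replace('/\.(?=\r?$)/m', '.@', $s);
echo json_encode($fixed), "\n";
?>
```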
hvishnu999 at gmail dot com
2 years ago
To convert a string to an SEO-friendly form, do this:

<?php
$realname = "This is the string to be made SEO friendly!";

$seoname = preg_replace('/\%/',' percentage',$realname);
$seoname = preg_replace('/\@/',' at ',$seoname);
$seoname = preg_replace('/\&/',' and ',$seoname);
$seoname = preg_replace('/\s[\s]+/','-',$seoname);    // Strip off multiple spaces
$seoname = preg_replace('/[\s\W]+/','-',$seoname);    // Strip off spaces and non-alpha-numeric
$seoname = preg_replace('/^[\-]+/','',$seoname); // Strip off the starting hyphens
$seoname = preg_replace('/[\-]+$/','',$seoname); // Strip off the ending hyphens
$seoname = strtolower($seoname);

echo $seoname;
?>

This will print: this-is-the-string-to-be-made-seo-friendly
timitheenchanter
3 years ago
If you have issues where preg_replace returns an empty result (NULL, which prints as an empty string), please take a look at these two ini parameters:

pcre.backtrack_limit
pcre.recursion_limit

The default is set to 100K.  If your buffer is larger than this, look to increase these two values.
anyvie at devlibre dot fr
3 years ago
A variable can handle a huge quantity of data but preg_replace can't.

Example :
<?php
$url = "ANY URL WITH LOTS OF DATA";

// We get all the data into $data
$data = file_get_contents($url);

// We just want to keep the content of <head>
$head = preg_replace("#(.*)<head>(.*?)</head>(.*)#is", '$2', $data);
?>

$head can have the desired content, or be empty, depending on the length of $data.

For this application, just add :
$data = substr($data, 0, 4096);
before using preg_replace, and it will work fine.
Anonymous
6 years ago
People using functions like scandir with user input, and protecting against "../" by using preg_replace: make sure you run it repeatedly until preg_match no longer finds it, because if you don't, the following can happen.

If a user gives the path:
"./....//....//....//....//....//....//....//"
then your script detects every "../" and removes them leaving:
"./../../../../../../../"
which is probably going back enough times to reach the root.

I just found this vulnerability in an old script of mine, which was written several years ago.

Always do:
<?php
while( preg_match( [expression], $input ) )
{
    $input = preg_replace( [expression], "", $input );
}
?>
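A concrete version of that loop might look like the sketch below. The helper name strip_parent_refs and the pattern '#\.\./#' are assumptions standing in for [expression]; the count parameter is used to detect when nothing more was removed.

```php
<?php
// Hypothetical helper: keep removing "../" until a pass removes nothing.
function strip_parent_refs($path)
{
    do {
        $path = preg_replace('#\.\./#', '', $path, -1, $count);
    } while ($count > 0);

    return $path;
}

// A single pass would leave "./../../" behind; the loop reduces it to "./".
echo strip_parent_refs('./....//....//'), "\n"; // ./
?>
```

Stripping sequences this way is only a sketch of the loop described above; resolving the path and checking it against an allowed base directory is a sturdier design.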
akarmenia at gmail dot com
3 years ago
[Editor's note: in this case it would be wise to rely on the preg_quote() function instead which was added for this specific purpose]

If your replacement string has a dollar sign or a backslash, it may turn into a backreference accidentally! This will fix it.

I want to replace 'text' with '$12345' but this becomes a backreference to $12 (which doesn't exist) and then it prints the remaining '34'. The function down below will return a string that escapes the backreferences.

OUTPUT:
string(8) "some 345"
string(11) "some \12345"
string(8) "some 345"
string(11) "some $12345"

<?php

$a = 'some text';

// Either of these will backreference and fail
$b1 = '\12345'; // Should be '\\12345' to avoid backreference
$b2 = '$12345'; // Should be '\$12345' to avoid backreference

$d = array($b1, $b2);

foreach ($d as $b) {
    $result1 = preg_replace('#(text)#', $b, $a); // Fails
    var_dump($result1);

    $result2 = preg_replace('#(text)#', preg_escape_back($b), $a); // Succeeds
    var_dump($result2);
}

// Escape backreferences from string for use with regex
function preg_escape_back($string) {
    // Replace $ with \$ and \ with \\
    $string = preg_replace('#(?<!\\\\)(\\$|\\\\)#', '\\\\$1', $string);
    return $string;
}

?>
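Another way to get the same effect, sketched here under the assumption that only backslashes and dollar signs need protecting in a replacement string, is addcslashes(). The helper name literal_replacement is hypothetical. Note that preg_quote() targets pattern syntax rather than replacement syntax.

```php
<?php
// Hypothetical helper: make a string safe to use as a literal replacement.
function literal_replacement($replacement)
{
    // Prefix every backslash and dollar sign with a backslash.
    return addcslashes($replacement, '\\$');
}

$a = preg_replace('#text#', literal_replacement('$12345'), 'some text');
$b = preg_replace('#text#', literal_replacement('\12345'), 'some text');
var_dump($a); // string(11) "some $12345"
var_dump($b); // string(11) "some \12345"
?>
```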
craiga at craiga dot id dot au
3 years ago
If there's a chance your replacement text contains any strings such as "$0.95", you'll need to escape those $n backreferences:

<?php
function escape_backreference($x)
{
    return preg_replace('/\$(\d)/', '\\\$$1', $x);
}
?>
nospam at probackup dot nl
3 years ago
Warning: a common mistake when trying to remove all characters except numbers and letters from a string is to use a regex like preg_replace('[^A-Za-z0-9_]', '', ...). The square brackets are then taken as the pattern delimiters rather than as a character class, so the output is not what you expect:

echo preg_replace('[^A-Za-z0-9_]', '', 'D"usseldorfer H"auptstrasse')

D"usseldorfer H"auptstrasse

It is important not to forget the leading and trailing forward slash in the regex:

echo preg_replace('/[^A-Za-z0-9_]/', '', 'D"usseldorfer H"auptstrasse')

DusseldorferHauptstrasse

PS An alternative is to use preg_replace('/\W/', '', $t) for keeping all alphanumeric characters including underscores.
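The delimiter effect can be checked with a short sketch. The variable names are made up for the illustration.

```php
<?php
$input = 'D"usseldorfer H"auptstrasse';

// '[' and ']' act as the delimiters here, so the pattern is the literal
// text "A-Za-z0-9_" anchored at the start of the subject. Nothing matches.
$unchanged = preg_replace('[^A-Za-z0-9_]', '', $input);

// With explicit delimiters the brackets form a negated character class,
// which also removes the space between the two words.
$cleaned = preg_replace('/[^A-Za-z0-9_]/', '', $input);

echo $unchanged, "\n"; // D"usseldorfer H"auptstrasse
echo $cleaned, "\n";   // DusseldorferHauptstrasse
?>
```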
php-comments-REMOVE dot ME at dotancohen dot com
6 years ago
Below is a function for converting Hebrew final characters to their
normal equivalents should they appear in the middle of a word.
The \b assertion does not treat Hebrew letters as part of a word,
so I had to work around that limitation.

<?php

$text="עברית מבולגנת";

function hebrewNotWordEndSwitch ($from, $to, $text) {
   $text=
    preg_replace('/'.$from.'([א-ת])/u','$2'.$to.'$1',$text);
   return $text;
}

do {
   $text_before=$text;
   $text=hebrewNotWordEndSwitch("ך","כ",$text);
   $text=hebrewNotWordEndSwitch("ם","מ",$text);
   $text=hebrewNotWordEndSwitch("ן","נ",$text);
   $text=hebrewNotWordEndSwitch("ף","פ",$text);
   $text=hebrewNotWordEndSwitch("ץ","צ",$text);
}   while ( $text_before!=$text );

print $text; // עברית מסודרת!

?>

The do-while is necessary for multiple instances of letters, such
as "אנני" which would start off as "אןןי". Note that there's still the
problem of acronyms with gershiim but that's not a difficult one
to solve. The code is in use at http://gibberish.co.il which you can
use to fix wrongly-encoded Hebrew, transliterate it, and perform some
other Hebrew-related functions.

To ensure that there will be no regular characters at the end of a
word, just convert all regular characters to their final forms, then
run this function. Enjoy!
cincodenada at gmail dot dot dot com
1 year ago
There seems to be some confusion over how greediness works.  For those familiar with Regular Expressions in other languages, particularly Perl: it works like you would expect, and as documented.  Greedy by default, un-greedy if you follow a quantifier with a question mark.

There is a PHP/PCRE-specific U pattern modifier that flips the greediness, so that quantifiers are by default un-greedy, and become greedy if you follow the quantifier with a question mark: http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

To make things clear, a series of examples:

<?php

$preview = "a bunch of stuff <code>this that</code> and more stuff <code>with a second code block</code> then extra at the end";

$preview_default = preg_replace('/<code>(.*)<\/code>/is', "<code class=\"prettyprint\">$1</code>", $preview);
$preview_manually_ungreedy = preg_replace('/<code>(.*?)<\/code>/is', "<code class=\"prettyprint\">$1</code>", $preview);

$preview_U_default = preg_replace('/<code>(.*)<\/code>/isU', "<code class=\"prettyprint\">$1</code>", $preview);
$preview_U_manually_greedy = preg_replace('/<code>(.*?)<\/code>/isU', "<code class=\"prettyprint\">$1</code>", $preview);

echo "Default, no ?: $preview_default\n";
echo "Default, with ?: $preview_manually_ungreedy\n";
echo "U flag, no ?: $preview_U_default\n";
echo "U flag, with ?: $preview_U_manually_greedy\n";

?>

Results in this:

Default, no ?: a bunch of stuff <code class="prettyprint">this that</code> and more stuff <code>with a second code block</code> then extra at the end
Default, with ?: a bunch of stuff <code class="prettyprint">this that</code> and more stuff <code class="prettyprint">with a second code block</code> then extra at the end
U flag, no ?: a bunch of stuff <code class="prettyprint">this that</code> and more stuff <code class="prettyprint">with a second code block</code> then extra at the end
U flag, with ?: a bunch of stuff <code class="prettyprint">this that</code> and more stuff <code>with a second code block</code> then extra at the end

As expected: greedy by default, ? inverts it to ungreedy.  With the U flag, un-greedy by default, ? makes it greedy.
me at perochak dot com
3 years ago
If you would like to remove a tag along with the text inside it then use the following code.

<?php
preg_replace('/(<tag>.+?)+(<\/tag>)/i', '', $string);
?>

Example:
<?php $string='<span class="normalprice">55 PKR</span>'; ?>

<?php
$string = preg_replace('/(<span class="normalprice">.+?)+(<\/span>)/i', '', $string);
?>

This results in an empty string.

<?php
$string='My String <span class="normalprice">55 PKR</span>';

$string = preg_replace('/(<span class="normalprice">.+?)+(<\/span>)/i', '', $string);
?>

This results in "My String " (with a trailing space).
jette at nerdgirl dot dk
5 years ago
I use this to prevent users from overdoing repeated text. The following function only allows 3 identical characters at a time and also takes care of repetitions with whitespace added.

This means that 'haaaaaaleluuuujaaaaa' becomes 'haaaleluuujaaa' and 'I am c o o o o o o l' becomes 'I am c o o o l'

<?php
//Example of user input
$str = "aaaaaaaaaaabbccccccccaaaaad d d d   d      d d ddde''''''''''''";

function stripRepeat($str) {
  //Do not allow repeated whitespace
  $str = preg_replace("/(\s){2,}/",'$1',$str);
  //Result: aaaaaaaaaaabbccccccccaaaaad d d d d d d ddde''''''''''''

  //Do not allow more than 3 identical characters separated by any whitespace
  $str = preg_replace('{( ?.)\1{4,}}','$1$1$1',$str);
  //Final result: aaabbcccaaad d d ddde'''

  return $str;
}
?>

To prevent any repetitions of characters, you only need this:

<?php
$str = preg_replace('{(.)\1+}','$1',$str);
//Result: abcad d d d d d d de'
?>
mtsoft at mt-soft dot com dot ar
7 years ago
This function takes a URL and returns a plain-text version of the page. It uses cURL to retrieve the page and a combination of regular expressions to strip all unwanted whitespace. This function will even strip the text from STYLE and SCRIPT tags, which are ignored by PHP functions such as strip_tags (they strip only the tags, leaving the text in the middle intact).

The regular expressions are applied in two stages: this avoids deleting single carriage returns (which \s also matches) while still removing blank lines and runs of linefeeds or spaces. The trimming is likewise done in two stages.

<?php
function webpage2txt($url)
{
$user_agent = "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)";

$ch = curl_init();    // initialize curl handle
curl_setopt($ch, CURLOPT_URL, $url); // set url to post to
curl_setopt($ch, CURLOPT_FAILONERROR, 1);              // Fail on errors
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);    // allow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1); // return into a variable
curl_setopt($ch, CURLOPT_PORT, 80);            //Set the port number
curl_setopt($ch, CURLOPT_TIMEOUT, 15); // times out after 15s

curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);

$document = curl_exec($ch);

$search = array('@<script[^>]*?>.*?</script>@si',  // Strip out javascript
'@<style[^>]*?>.*?</style>@siU',    // Strip style tags properly
'@<[\/\!]*?[^<>]*?>@si',            // Strip out HTML tags
'@<![\s\S]*?--[ \t\n\r]*>@',         // Strip multi-line comments including CDATA
'/\s{2,}/',

);

$text = preg_replace($search, "\n", html_entity_decode($document));

$pat[0] = "/^\s+/";
$pat[2] = "/\s+\$/";
$rep[0] = "";
$rep[2] = " ";

$text = preg_replace($pat, $rep, trim($text));

return $text;
}
?>

Potential uses of this function are extracting keywords from a webpage, counting words and things like that. If you find it useful, drop us a comment and let us know where you used it.
elliot dot greene at att dot net
4 months ago
preg_replace to only show alpha numeric characters

$info = "The Development of code . http://www.";

$info = preg_replace("/[^a-zA-Z0-9]+/", "", $info);

echo $info;

OUTPUTS: TheDevelopmentofcodehttpwww

This is a good workable code
http://www.sioure.com
eurosat7 at yahoo dot de
2 months ago
If you want to limit multiple occurrences of any char in a sequence you might want to use this function.
<?php
function limit_char_repeat($string,$maxrepeat){
    return preg_replace("/(.)\\1{".$maxrepeat.",}/ms",str_repeat('\1',$maxrepeat),$string);
}
?>

Example:
<?php
$string = "
---------------------
Heeeeeeeeeeeeeeeeeeeello Woooooooooooooooorld!!!!!!!!!!!!!!!!!!!!!!!!
===============================================================================================================
~~~~~~~~~~~~~~~~ ~ ~ ~
"
;
echo limit_char_repeat($string,5);
?>
Output:
-----
Heeeeello Wooooorld!!!!!
=====
~~~~~ ~ ~ ~
dhgouveia at gmail dot com
5 months ago
If you want to replace only the content of a specific div, identified by its ID, you can do this:

<?php
$id = "my-id";
$content = '<div id="my-id">Hello World</div><div  id="other-id"> My World</div>';
$replacement = "Hello Planet";

$result = preg_replace('/(<div.*?id=\"'.$id.'\"[^>]*>)(.*?)(<\/div>)/i', "$1{$replacement}$3", $content);

echo $result;
?>

should print :
<div id="my-id">Hello Planet</div><div  id="other-id"> My World</div>
Dustin
1 year ago
Matching substrings where the match can exist at the end of the string was non-intuitive to me.

I found this because:
strtotime() interprets 'mon' as 'Monday', but Postgres uses interval types that return short names by default, e.g. interval '1 month' returns as '1 mon'.

I used something like this:

$str = "mon month monday Mon Monday Month MONTH MON";
$strMonth = preg_replace('~(mon)([^\w]|$)~i', '$1th$2', $str);
echo "$str\n$strMonth\n";

//to output:
mon month monday Mon Monday Month MONTH MON
month month monday Month Monday Month MONTH MONth
nik at rolls dot cc
1 year ago
To split Pascal/CamelCase into Title Case (for example, converting descriptive class names for use in human-readable frontends), you can use the below function:

<?php
function expandCamelCase($source) {
  return preg_replace('/(?<!^)([A-Z][a-z]|(?<=[a-z])[^a-z]|(?<=[A-Z])[0-9_])/', ' $1', $source);
}
?>

Before:
  ExpandCamelCaseAPIDescriptorPHP5_3_4Version3_21Beta
After:
  Expand Camel Case API Descriptor PHP 5_3_4 Version 3_21 Beta
reigyx_x at yahoo dot co dot id
10 months ago
hy php.net
how to script preg replace for this code
>:(
i want to use this smiley, buit dont know how to regex it..
webmaster at antoinebouchard dot net
2 years ago
It may seem useless, but the font tag in Internet Explorer won't recognize shorthand (three-digit) hex color values. This is a simple function to expand such values in the font tag.

<?php
function colorfix($text) {
    return preg_replace('/"#([a-f0-9])([a-f0-9])([a-f0-9])"/i', '"#$1$1$2$2$3$3"', $text);
}
?>
kosalasl at gmail dot com
3 years ago
I wrote a small function that reformats the dates found in a string, using the format characters of the date() function. preg_replace really helped me write this tiny piece of code.

<?php
function mysql2formatDate($strn,$outformat='n/j/Y'){
    return preg_replace("/(\d{4})-(\d{2})-(\d{2})/e","Date('$outformat',strtotime('$0'))",$strn);
}
?>
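Since the /e modifier is deprecated as of PHP 5.5.0, the same idea can be sketched with preg_replace_callback() and a closure (PHP 5.3+). The function name mysql2formatDateCallback is made up for this illustration.

```php
<?php
// Same idea as mysql2formatDate(), using a callback instead of /e.
function mysql2formatDateCallback($strn, $outformat = 'n/j/Y')
{
    return preg_replace_callback(
        '/(\d{4})-(\d{2})-(\d{2})/',
        function (array $m) use ($outformat) {
            // $m[0] is the full YYYY-MM-DD match.
            return date($outformat, strtotime($m[0]));
        },
        $strn
    );
}

echo mysql2formatDateCallback('Created on 2010-05-27'); // Created on 5/27/2010
?>
```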
someuser at dot dot com
3 years ago
Replacement of line numbers, with replacement limit per line.

Solution that worked for me.
I have a file with a list of tasks, each starting with a number. Only that leading number should be removed, because the text that follows contains plenty of other numbers that must be left alone.

56 Patient A of 46 years suffering ... ...
57 Newborn of 26 weeks was ...
58 Jane, having age 18 years recollects onsets of ...
...
587 Patient of 70 years ...

etc.

<?php
// Read the file into an array, one line per element
$array = file($file);
$arr = array();

foreach ($array as $value)
{
    // Try the longest numbers first (100-999)
    //
    //    %        Delimiter
    //    ^        Match only at the beginning of the line
    //    [0-9]    Any digit
    //    {3}      Exactly three of the preceding digit class
    //
    if (preg_match('%^[0-9]{3}%', $value)) {
        // Reassign the modified copy to $value
        $value = preg_replace('%^[0-9]{3}%', '-HERE WAS XXX NUMBER-', $value, 1);
    }
    // Numbers 10-99
    elseif (preg_match('%^[0-9]{2}%', $value)) {
        $value = preg_replace('%^[0-9]{2}%', '-HERE WAS XX NUMBER-', $value, 1);
    }
    // Numbers 0-9
    elseif (preg_match('%^[0-9]%', $value)) {
        $value = preg_replace('%^[0-9]%', '-HERE WAS X NUMBER-', $value, 1);
    }

    // Build the array back
    $arr[] = $value;
}
?>
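
A shorter variant, offered as a sketch rather than a drop-in replacement: preg_replace() accepts an array as subject, and `^` can only match once per string, so a single call with a single pattern can strip a leading number of any length. Unlike the code above, it removes the number (and the whitespace after it) instead of inserting a marker. $lines is sample data standing in for the file contents.

```php
<?php
// Sample lines standing in for the file contents.
$lines = array(
    "56 Patient A of 46 years suffering ...",
    "57 Newborn of 26 weeks was ...",
    "587 Patient of 70 years ...",
);

// ^ anchors the match to the start of each string, so the ages and other
// numbers later in the line are never touched. \s* also drops the
// separating space.
$stripped = preg_replace('/^\d+\s*/', '', $lines);

print_r($stripped);
```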
up
-1
support ~ at ~ mbnad ~ dot ~ com
2 years ago
How to collapse letters that are repeated too many times in a word, and also remove runs of extra spaces:

<?php
$message_body = 'HHHHHEEEEELLLLLOOOOO             IAM COOOOOOOOOOOOOOOOOOOL      !!!!!!!!!!';
echo "\n".$message_body."\n";
// Any character repeated four or more times in a row is reduced to a single one
$message_body = preg_replace('~(.)\1{3,}~', '$1', $message_body);
// Runs of whitespace become a single space
$message_body = preg_replace('~\s+~', ' ', trim($message_body));
echo "\n".$message_body."\n\n";
?>
up
0
ude dot mpco at wotsrabt dot maps-on
3 years ago
From time to time I find it useful to show HTML form field names to the user, for example while looping over the keys of $_GET or $_POST after a submission. The only problem is that my name attributes follow common programming conventions: eventDate, eventTime, userEmail, etc. Those are not great to show to a user as-is, so I came up with this function. It simply adds a space before every uppercase letter in the string.

<?php
function caseSwitchToSpaces( $stringVariableName )
{
    $pattern = '/([A-Z])/';
    $replacement = ' ${1}';

    return preg_replace( $pattern, $replacement, $stringVariableName );
}

//ex.
echo( caseSwitchToSpaces( "helloWorld" ) );
?>

would output:

"hello World"

You could also apply title-style casing so that the first word isn't lowercase.
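
A minimal sketch of that title-style variant. The function name fieldNameToLabel is made up here, and the lookbehind is an addition that the original function does not have.

```php
<?php
function fieldNameToLabel($name)
{
    // (?<!^) skips an uppercase letter at the very start of the string,
    // so "EventDate" does not gain a leading space.
    $spaced = preg_replace('/(?<!^)([A-Z])/', ' $1', $name);

    return ucfirst($spaced);
}

echo fieldNameToLabel('eventDate'); // Event Date
```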
up
0
sergei dot garrison at gmail dot com
4 years ago
If you want to add simple rich text functionality to HTML input fields, preg_replace can be quite handy.

For example, if you want users to be able to bold text by typing *text* or italicize it by typing _text_, you can use the following function.

<?php
function encode(&$text) {
    $text = preg_replace('/\*([^\*]+)\*/', '<b>\1</b>', $text);
    $text = preg_replace('/_([^_]+)_/', '<i>\1</i>', $text);
    return $text;
}
?>

This works for nested tags, too, although it will not fix nesting mistakes.

To make this function more efficient, you could put the delimiters (* and _, in this case) and their HTML tag equivalents in an array and loop through them.
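
The array idea could look roughly like this. Instead of an explicit loop, the sketch hands both arrays to preg_replace(), which applies the rules in order. The name encode_markup and the htmlspecialchars() step are additions of this sketch, not part of the original function.

```php
<?php
function encode_markup($text)
{
    // pattern => replacement; add more delimiters here as needed
    $rules = array(
        '/\*([^*]+)\*/' => '<b>$1</b>',
        '/_([^_]+)_/'   => '<i>$1</i>',
    );

    // Escape user input first so only our own tags end up in the output.
    $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    return preg_replace(array_keys($rules), array_values($rules), $text);
}

echo encode_markup('*bold* and _italic_ & <plain>');
// <b>bold</b> and <i>italic</i> &amp; &lt;plain&gt;
```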
up
1
mdrisser at gmail dot com
5 years ago
An alternative to the method suggested by sheri: remember that by default the regex anchor '$' only matches at the end of the STRING (or just before a trailing newline), and the example given is a single string consisting of multiple lines. Instead of relying on '$', you can match the "\r\n" line ending explicitly.

Try:
<?php
// The following is 1 string containing 3 lines
$s = "Testing, testing.\r\n"
   . "Another testing line.\r\n"
   . "Testing almost done.";

// The replacement is double-quoted so that \r\n is a real line break
echo preg_replace('/\.\\r\\n/m', "@\r\n", $s);
?>

This results in the string (line breaks shown as \r\n):
Testing, testing@\r\nAnother testing line@\r\nTesting almost done.
up
1
7r6ivyeo at mail dot com
5 years ago
String to filename:

<?php
function string_to_filename($word) {
    $tmp = preg_replace('/^\W+|\W+$/', '', $word); // remove all non-alphanumeric chars at the beginning and end of the string
    $tmp = preg_replace('/\s+/', '_', $tmp); // compress internal whitespace and replace it with _
    return strtolower(preg_replace('/[^\w-]/', '', $tmp)); // remove all non-alphanumeric chars except _ and -
}
?>

Returns a usable & readable filename.
up
1
marcin at pixaltic dot com
6 years ago
<?php
    //:::replace with anything that you can do with the searched string:::
    //Marcin Majchrzak
    //pixaltic.com

    $c = "2 4 8";
    echo ($c); //display:2 4 8

    $cp = "/(\d)\s(\d)\s(\d)/e"; //pattern
    $cr = "'\\3*\\2+\\1='.(('\\3')*('\\2')+('\\1'))"; //replace
    $c = preg_replace($cp, $cr, $c);
    echo ($c); //display:8*4+2=34
?>
up
0
arie dot benichou at gmail dot com
5 years ago
<?php
// Be careful with UTF-8: even with Unicode and UTF-8 support enabled, a pretty odd bug occurs depending on your operating system
$str = "Hi, my name is Arié!<br />";
echo preg_replace('#\bArié\b#u', 'Gontran', $str);
// on Windows systems, output is "Hi, my name is Gontran!<br />"
// on Unix systems, output is "Hi, my name is Arié!<br />"
echo preg_replace('#\bArié(|\b)#u', 'Gontran', $str);
// on both Windows and Unix systems, output is "Hi, my name is Gontran!<br />"
?>
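
A possible workaround, sketched under the assumption that the source file is saved as UTF-8 and that PCRE was built with Unicode property support: replace \b with lookarounds on the letter class \pL, so the result no longer depends on whether "é" counts as a word character.

```php
<?php
// Assumes this source file is saved as UTF-8 and PCRE has Unicode property support.
$str = "Hi, my name is Arié!<br />";

// (?<!\pL) and (?!\pL): "not preceded / not followed by a letter",
// used here in place of \b, which may treat é as a non-word character.
$out = preg_replace('#(?<!\pL)Arié(?!\pL)#u', 'Gontran', $str);

echo $out; // Hi, my name is Gontran!<br />
```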
up
0
akam AT akameng DOT com
5 years ago
<?php
$converted = array(
//3 of special chars

'/(;)/ie',
'/(#)/ie',
'/(&)/ie',

//MySQL reserved words!
//Check mysql website!
'/(ACTION)/ie', '/(ADD)/ie', '/(ALL)/ie', '/(ALTER)/ie', '/(ANALYZE)/ie', '/(AND)/ie', '/(AS)/ie', '/(ASC)/ie',

//remaining of special chars
'/(<)/ie', '/(>)/ie', '/(\.)/ie', '/(,)/ie', '/(\?)/ie', '/(`)/ie', '/(!)/ie', '/(@)/ie', '/(\$)/ie', '/(%)/ie', '/(\^)/ie', '/(\*)/ie', '/(\()/ie', '/(\))/ie', '/(_)/ie', '/(-)/ie', '/(\+)/ie',
'/(=)/ie', '/(\/)/ie', '/(\|)/ie', '/(\\\)/ie', "/(')/ie", '/(")/ie', '/(:)/'
);

$input_text = preg_replace($converted, "UTF_to_Unicode('\\1')", $text);

function UTF_to_Unicode($data){
    //return $data;
}
?>
The above example is useful for filtering input data before saving it into a MySQL database. The data does not need to be decoded again; just use UTF-8 as the charset.
Please note the escaping of special characters between the delimiters.
up
0
da_pimp2004_966 at hotmail dot com
6 years ago
A simple BB like thing..

<?php
function AddBB($var) {
    $search = array(
        '/\[b\](.*?)\[\/b\]/is',
        '/\[i\](.*?)\[\/i\]/is',
        '/\[u\](.*?)\[\/u\]/is',
        '/\[img\](.*?)\[\/img\]/is',
        '/\[url\](.*?)\[\/url\]/is',
        '/\[url\=(.*?)\](.*?)\[\/url\]/is'
    );

    $replace = array(
        '<strong>$1</strong>',
        '<em>$1</em>',
        '<u>$1</u>',
        '<img src="$1" />',
        '<a href="$1">$1</a>',
        '<a href="$1">$2</a>'
    );

    $var = preg_replace($search, $replace, $var);
    return $var;
}
?>
up
0
Michael W
6 years ago
For filename tidying I prefer to ALLOW only certain characters rather than strip the particular ones we want to exclude. To this end I use:

<?php
$allowed = "/[^a-z0-9\\040\\.\\-\\_\\\\]/i";
$str = preg_replace($allowed, "", $str);
?>

This allows letters a-z, digits, space (\\040), dot (\\.), hyphen (\\-), underscore (\\_) and backslash (\\\\); everything else is removed from the string.
up
0
ulf dot reimers at tesa dot com
6 years ago
Hi,

as I wasn't able to find another way to do this, I wrote a function that converts any UTF-8 string into a valid NTFS filename (see http://en.wikipedia.org/wiki/Filename).

<?php
function strToNTFSFilename($string)
{
    $reserved = preg_quote('\/:*?"<>', '/');
    return preg_replace("/([\\x00-\\x1f{$reserved}])/", "_", $string);
}
?>

It converts all control characters and the filename characters reserved by Windows ('\/:*?"<>') into an underscore.
This way you can safely create an NTFS filename out of any UTF-8 string.
up
0
mike dot hayward at mikeyskona dot co dot uk
7 years ago
Hi.
Not sure if this will be of great help to anyone out there, but I thought I'd post it just in case.
I was having an issue with a project that relied on $_SERVER['REQUEST_URI'], which wasn't working on IIS
(I am using mod_rewrite in Apache to call up pages from a database, and IIS doesn't set REQUEST_URI). So I knocked up this simple little preg_replace to use the query string set by IIS when redirecting to a PHP error page.

<?php
//My little IIS hack :)
if(!isset($_SERVER['REQUEST_URI'])){
    $_SERVER['REQUEST_URI'] = preg_replace( '/404;([a-zA-Z]+:\/\/)(.*?)\//i', "/" , $_SERVER['QUERY_STRING'] );
}
?>

Hope this helps someone else out there trying to do the same thing :)
up
0
sternkinder at gmail dot com
7 years ago
From what I can see, the problem is that if you simply substitute all 'A's with 'T's, you can no longer tell which 'T's to substitute with 'A's afterwards. This can be solved by first replacing all 'A's with a placeholder character (for instance '_' or whatever you like), then replacing all 'T's with 'A's, and finally replacing all '_'s (or whatever character you chose) with 'T's:

<?php
$dna = "AGTCTGCCCTAG";
echo str_replace(array("A","G","C","T","_","-"), array("_","-","G","A","T","C"), $dna); //output will be TCAGACGGGATC
?>

Although I don't know how transliteration in Perl works (though I remember that it is similar to the UNIX command "tr"), I would suggest the following function for "switching" single chars:

<?php
function switch_chars($subject,$switch_table,$unused_char="_") {
    foreach ($switch_table as $_1 => $_2) {
        $subject = str_replace($_1,$unused_char,$subject);
        $subject = str_replace($_2,$_1,$subject);
        $subject = str_replace($unused_char,$_2,$subject);
    }
    return $subject;
}

echo switch_chars("AGTCTGCCCTAG", array("A"=>"T","G"=>"C")); //output will be TCAGACGGGATC
?>
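
For single-character swaps like this one, PHP's built-in strtr() may already be the "tr" equivalent being looked for. A short sketch, with $complement as an illustrative variable name:

```php
<?php
$dna = "AGTCTGCCCTAG";

// strtr() maps each character in the second argument to the character at the
// same position in the third, in a single pass, so no placeholder is needed.
$complement = strtr($dna, "AGCT", "TCGA");

echo $complement; // TCAGACGGGATC
```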
up
0
rob at ubrio dot us
7 years ago
Also worth noting is that you can use array_keys()/array_values() with preg_replace like:

<?php
$subs = array(
    '/\[b\](.+)\[\/b\]/Ui' => '<strong>$1</strong>',
    '/_(.+)_/Ui' => '<em>$1</em>'
    // ...
);

$raw_text = '[b]this is bold[/b] and this is _italic!_';

$bb_text = preg_replace(array_keys($subs), array_values($subs), $raw_text);
?>
up
1
Alexey Lebedev
8 years ago
I wasted several hours because of this:

<?php
$str = 'It&#039;s a string with HTML entities';
preg_replace('~&#(\d+);~e', 'code2utf($1)', $str);
?>

This code is meant to convert numeric HTML entities to UTF-8, and it does, with one exception: it mishandles codes starting with &#0.

The reason is that code2utf will be called with the leading zero, exactly as the pattern matched it: code2utf(039).
And it does matter! PHP treats a number literal with a leading zero, such as 039, as octal.
Try <?php print(011); ?> (it prints 9).

Solution:
<?php preg_replace('~&#0*(\d+);~e', 'code2utf($1)', $str); ?>
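
Another way around the octal trap, sketched with preg_replace_callback() (PHP 5.3+ for the closure): the captured digits reach the callback as a string, and an (int) cast always reads them as decimal. code2utf() is not shown in this note, so code_to_utf8() below is a stand-in written for this example, and decode_numeric_entities is likewise a made-up name.

```php
<?php
// Stand-in for code2utf(): encode a Unicode code point as UTF-8 bytes.
function code_to_utf8($code)
{
    if ($code < 0x80) {
        return chr($code);
    }
    if ($code < 0x800) {
        return chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));
    }
    if ($code < 0x10000) {
        return chr(0xE0 | ($code >> 12))
             . chr(0x80 | (($code >> 6) & 0x3F))
             . chr(0x80 | ($code & 0x3F));
    }
    return chr(0xF0 | ($code >> 18))
         . chr(0x80 | (($code >> 12) & 0x3F))
         . chr(0x80 | (($code >> 6) & 0x3F))
         . chr(0x80 | ($code & 0x3F));
}

function decode_numeric_entities($str)
{
    return preg_replace_callback('~&#(\d+);~', function ($m) {
        // (int) "039" is 39: string-to-int casts are decimal, never octal.
        return code_to_utf8((int) $m[1]);
    }, $str);
}

echo decode_numeric_entities('It&#039;s a string with HTML entities');
// It's a string with HTML entities
```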
up
-2
tal at ashkenazi dot co dot il
6 years ago
After a long time trying to get rid of \n\r and <BR> stuff, I came up with this...
(I made some changes to the clicklein() function...)

<?php
function clickable($url){
    $url = str_replace("\\r","\r",$url);
    $url = str_replace("\\n","\n<BR>",$url);
    $url = str_replace("\\n\\r","\n\r",$url);

    $in=array(
        '`((?:https?|ftp)://\S+[[:alnum:]]/?)`si',
        '`((?<!//)(www\.\S+[[:alnum:]]/?))`si'
    );
    $out=array(
        '<a href="$1"  rel=nofollow>$1</a> ',
        '<a href="http://$1" rel=\'nofollow\'>$1</a>'
    );
    return preg_replace($in,$out,$url);
}
?>
up
0
ismith at nojunk dot motorola dot com
7 years ago
Be aware that when using the "/u" modifier, if your input text contains any invalid UTF-8 byte sequences, preg_replace will return NULL (which prints as an empty string), regardless of whether there were any matches.

This is due to the PCRE library returning an error code if the string contains invalid UTF-8.
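
A small sketch of how this failure can be detected with preg_last_error(). The subject string is deliberately malformed sample data; how to recover is left to the caller.

```php
<?php
// "\xff" can never appear in valid UTF-8, so this subject is malformed.
$subject = "abc\xff";

$result = preg_replace('/b/u', 'x', $subject);

if ($result === null && preg_last_error() === PREG_BAD_UTF8_ERROR) {
    // Decide here how to recover: reject the input, re-encode it, etc.
    echo "Subject is not valid UTF-8\n";
}
```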
up
-2
halityesil [ at at] globya [ dot dot] net
6 years ago
<?PHP

function strip_tags_attributes($sSource, $aAllowedTags = FALSE, $aDisabledAttributes = FALSE, $aAllowedProperties = 'font|font-size|font-weight|color' . '|text-align|text-decoration|margin|margin-left' . '|margin-top|margin-bottom|margin-right|padding' . '|padding-top|padding-left|padding-right|padding-bottom' . '|width|height'){

   if( !is_array( $aDisabledAttributes ) ){
     
$aDisabledAttributes = array('onabort', 'onactivate', 'onafterprint', 'onafterupdate', 'onbeforeactivate', 'onbeforecopy', 'onbeforecut', 'onbeforedeactivate', 'onbeforeeditfocus', 'onbeforepaste', 'onbeforeprint', 'onbeforeunload', 'onbeforeupdate', 'onblur', 'onbounce', 'oncellchange', 'onchange', 'onclick', 'oncontextmenu', 'oncontrolselect', 'oncopy', 'oncut', 'ondataavaible', 'ondatasetchanged', 'ondatasetcomplete', 'ondblclick', 'ondeactivate', 'ondrag', 'ondragdrop', 'ondragend', 'ondragenter', 'ondragleave', 'ondragover', 'ondragstart', 'ondrop', 'onerror', 'onerrorupdate', 'onfilterupdate', 'onfinish', 'onfocus', 'onfocusin', 'onfocusout', 'onhelp', 'onkeydown', 'onkeypress', 'onkeyup', 'onlayoutcomplete', 'onload', 'onlosecapture', 'onmousedown', 'onmouseenter', 'onmouseleave', 'onmousemove', 'onmoveout', 'onmouseover', 'onmouseup', 'onmousewheel', 'onmove', 'onmoveend', 'onmovestart', 'onpaste', 'onpropertychange', 'onreadystatechange', 'onreset', 'onresize', 'onresizeend', 'onresizestart', 'onrowexit', 'onrowsdelete', 'onrowsinserted', 'onscroll', 'onselect', 'onselectionchange', 'onselectstart', 'onstart', 'onstop', 'onsubmit', 'onunload');
   }
  
  
   $sSource = stripcslashes( $sSource );

   $sSource = strip_tags( $sSource, $aAllowedTags );

   if( empty($aDisabledAttributes) ){
      return $sSource;
   }

   $aDisabledAttributes = @ implode('|', $aDisabledAttributes);

   $sSource = preg_replace('/<(.*?)>/ie', "'<' . preg_replace(array('/javascript:[^\"\']*/i', '/(" . $aDisabledAttributes . ")[ \\t\\n]*=[ \\t\\n]*[\"\'][^\"\']*[\"\']/i', '/\s+/'), array('', '', ' '), stripslashes('\\1')) . '>'", $sSource );
   $sSource = preg_replace('/\s(' . $aDisabledAttributes . ').*?([\s\>])/', '\\2', $sSource);

   $regexp = '@([^;"]+)?(?<!'. $aAllowedProperties .'):(?!\/\/(.+?)\/)((.*?)[^;"]+)(;)?@is';
   $sSource = preg_replace($regexp, '', $sSource);
   $sSource = preg_replace('@[a-z]*=""@is', '', $sSource);

   return $sSource;
}

?>

Online resource help skype name : globya

good luck !
up
0
dani dot church at gmail dot youshouldknowthisone
7 years ago
Note that it is in most cases much more efficient to use preg_replace_callback(), with a named function or an anonymous function created with create_function(), instead of the /e modifier.  When preg_replace() is called with the /e modifier, the interpreter must parse the replacement string into PHP code once for every replacement made, while preg_replace_callback() uses a function that only needs to be parsed once.
up
-1
lehongviet at gmail dot com
7 years ago
I had a problem echoing text that contains double quotes into a text field, because the quotes confuse the value attribute. I use the function below to match each pair of double quotes and replace it with smart quotes. A leftover unpaired quote is replaced by a hyphen (-).

It works for me.

<?php
function smart_quotes($text) {
    $pattern = '/"((.)*?)"/i';
    $text = preg_replace($pattern,"“\\1”",stripslashes($text));
    $text = str_replace("\"","-",$text);
    $text = addslashes($text);
    return $text;
}
?>
up
-1
131 dot php at cloudyks dot org
7 years ago
Based on the previous comment, I suggest the following
(this function already exists in PHP 6):

<?php
function unicode_decode($str){
    return preg_replace(
        '#\\\u([0-9a-f]{4})#e',
        "unicode_value('\\1')",
        $str);
}

function unicode_value($code) {
    $value=hexdec($code);
    if($value<0x0080)
        return chr($value);
    elseif($value<0x0800)
        return chr((($value&0x07c0)>>6)|0xc0)
            . chr(($value&0x3f)|0x80);
    else
        return chr((($value&0xf000)>>12)|0xe0)
            . chr((($value&0x0fc0)>>6)|0x80)
            . chr(($value&0x3f)|0x80);
}
?>

[EDIT BY danbrown AT php DOT net:  This function originally written by mrozenoer AT overstream DOT net.]
up
-2
robvdl at gmail dot com
8 years ago
For those of you who have ever had the problem where clients paste text from MS Word into a CMS, and Word has placed all those fancy quotes throughout the text, breaking the XHTML validator: I have created a pair of regular expressions that replace high UTF-8 characters with numeric HTML entities, such as &#8217;.

Note that most user examples I have read on php.net only replace selected characters, such as single and double quotes. This replaces all two- and three-byte UTF-8 characters, including Greek characters, Arabic characters, smilies, whatever.

It took me ages to get it down to just two regular expressions, but it handles these high characters properly.

<?php
$text = preg_replace('/([\xc0-\xdf].)/se', "'&#' . ((ord(substr('$1', 0, 1)) - 192) * 64 + (ord(substr('$1', 1, 1)) - 128)) . ';'", $text);
$text = preg_replace('/([\xe0-\xef]..)/se', "'&#' . ((ord(substr('$1', 0, 1)) - 224) * 4096 + (ord(substr('$1', 1, 1)) - 128) * 64 + (ord(substr('$1', 2, 1)) - 128)) . ';'", $text);
?>
up
0
thewolf at pixelcarnage dot com
11 years ago
I got sick of trying to replace just a word, so I decided to write my own string replacement code. When that code became far too big and a little faulty, I decided to use a simple preg_replace instead:

<?php
/**
* Written by Rowan Lewis
* $search(string), the string to be searched for
* $replace(string), the string to replace $search
* $subject(string), the string to be searched in
*/
function word_replace($search, $replace, $subject) {
    return preg_replace('/[a-zA-Z]+/e', '\'\0\' == \'' . $search . '\' ? \'' . $replace . '\': \'\0\';', $subject);
}
?>

I hope that this code helps someone!
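
A variant without the /e modifier, as a sketch only: it relies on the \b word-boundary assertion and preg_quote(). Note that \b also treats digits and underscores as word characters, so it is not identical to the [a-zA-Z]+ test above. The name word_replace_safe is made up for this example.

```php
<?php
function word_replace_safe($search, $replace, $subject)
{
    // preg_quote() neutralises regex metacharacters in $search;
    // \b restricts the match to whole words.
    $pattern = '/\b' . preg_quote($search, '/') . '\b/';

    // Note: "$1" or "\1" inside $replace would still be read as a backreference.
    return preg_replace($pattern, $replace, $subject);
}

echo word_replace_safe('cat', 'dog', 'cat concat cat.'); // dog concat dog.
```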
up
-1
steven -a-t- acko dot net
10 years ago
People using the /e modifier with preg_replace should be aware of the following weird behaviour. It is not a bug per se, but can cause bugs if you don't know it's there.

The example in the docs for /e suffers from this mistake in fact.

With /e, the replacement string is a PHP expression. So when you use a backreference in the replacement expression, you need to put the backreference inside quotes, or otherwise it would be interpreted as PHP code. Like the example from the manual for preg_replace:

preg_replace("/(<\/?)(\w+)([^>]*>)/e",
             "'\\1'.strtoupper('\\2').'\\3'",
             $html_body);

To make this easier, the data in a backreference with /e is run through addslashes() before being inserted in your replacement expression. So if you have the string

He said: "You're here"

It would become:

He said: \"You\'re here\"

...and be inserted into the expression.
However, if you put this inside a set of single quotes, PHP will not strip away all the slashes correctly! Try this:

print ' He said: \"You\'re here\" ';
Output: He said: \"You're here\"

This is because the sequence \" inside single quotes is not recognized as anything special, and it is output literally.

Using double-quotes to surround the string/backreference will not help either, because inside double-quotes, the sequence \' is not recognized and also output literally. And in fact, if you have any dollar signs in your data, they would be interpreted as PHP variables. So double-quotes are not an option.

The 'solution' is to manually fix it in your expression. It is easiest to use a separate processing function, and do the replacing there (i.e. use "my_processing_function('\\1')" or something similar as replacement expression, and do the fixing in that function).

If you surrounded your backreference by single-quotes, the double-quotes are corrupt:
$text = str_replace('\"', '"', $text);

People using preg_replace with /e should at least be aware of this.

I'm not sure how it would be best fixed in preg_replace. Because double-quotes are a really bad idea anyway (due to the variable expansion), I would suggest that preg_replace's auto-escaping is modified to suit the placement of backreferences inside single-quotes (which seemed to be the intention from the start, but was incorrectly applied).
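
For comparison, a sketch of the same kind of replacement with preg_replace_callback() (closure syntax assumes PHP 5.3+). The callback receives the matched text as plain data, so the quoting problem described above does not come up. The sample sentence is the one used earlier in this note.

```php
<?php
$text = 'He said: "You\'re here"';

// The callback receives the matched text exactly as it appears in the
// subject; no addslashes() step is involved.
$result = preg_replace_callback(
    '/"[^"]*"/',
    function ($m) {
        return strtoupper($m[0]);
    },
    $text
);

echo $result; // He said: "YOU'RE HERE"
```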
up
-1
sreekanth at outsource-online dot net
3 years ago
If your intention is to encode and decode mod_rewrite URLs and handle them with PHP and MySQL, this should work.

To convert a string to a URL:
$url = preg_replace('/[^A-Za-z0-9_-]+/', '-', $string);

To look up the URL value in MySQL, use the same expression, excluding the '-'.
First transform the URL value in PHP with preg_replace, then use the result with MySQL REGEXP:

$sql = "select * from table where fieldname_to_check REGEXP '".preg_replace("/-+/",'[^A-Za-z0-9_]+',$url)."'";
up
-1
jas at rephunter dot net
1 year ago
It is recommended that str_replace should be used instead of preg_replace when regex is not needed. In particular, at http://php.net/manual/en/function.str-replace.php it says "If you don't need fancy replacing rules (like regular expressions), you should always use this function instead of preg_replace()."

While this is usually true, I have found a significant exception to this guideline and benchmarked to show that str_replace is in fact slower in the case when the pattern to be replaced is an array. In particular, when the number of elements in the pattern gets to be around 7, then the preg_replace is faster. Of course this would depend on the specifics of the regex.

In my case I had pattern arrays with about 40 elements. This becomes far slower with str_replace when there is a simple regex that will do the job.

Below is the benchmark with some examples of arrays for patterns and an equivalent regex.

<?php

/**
* Title:    Test Harness
* Author:    **********
* Date:    27-May-13
* Project:    *********
* Purpose:    Test timer comparing str_replace and preg_replace
*
* Results (the empty-loop figure is the total for all iterations, in seconds;
* the other figures are microseconds per iteration):
*
* 1. preg_replace is a lot faster!
* empty loop takes 0.12380504608154 seconds.
* Array size: 34
*  *** str_replace:  6.0163598060608 microseconds.
*  *** preg_replace: 2.1114869117737 microseconds.
*
* 2. str_replace is faster:
* empty loop takes 0.11837291717529 seconds.
* Array size: 3
*  *** str_replace:  1.7525472640991 microseconds.
*  *** preg_replace: 2.0717389583588 microseconds.
*
* 3. preg_replace is faster:
* empty loop takes 0.11982989311218 seconds.
* Array size: 10
*  *** str_replace:  2.6692891120911 microseconds.
*  *** preg_replace: 2.2716360092163 microseconds.
*
* 4. about the same: 6 element array is breakeven point to switch to preg_replace
* empty loop takes 0.12036299705505 seconds.
* Array size: 6
*  *** str_replace:  2.0874700546265 microseconds.
*  *** preg_replace: 2.1840009689331 microseconds.
*
*/

$iterations = 1000000;
$str1 = 'this is a - test';

// empty loop
$begin = microtime(true);
for ($i = 0; $i < $iterations; $i++)
{
}
$end = microtime(true);
$empty_loop_time = $end - $begin;
// microtime(true) returns seconds, so this is the total time in seconds
echo "empty loop takes $empty_loop_time seconds.\n";

// test1 loop
// alternate array variations
//$aStr = array('\r\n','\r','\n','%','.','(',')',':',';','&',"\x07","\x15",'!','\'','"',
//'#','^','*','_','=','+','\\','?','|','<','>','{','}','’','`','~','“','?','$');
//$aStr = array('\r\n','\r','\n');
//$aStr = array('0','1','2','3','4','5','6','7','8','9');
$aStr = array('0','1','2','3','4','5','6');
$cnt = count($aStr);
echo 'Array size: ' . $cnt . "\n";
$begin = microtime(true);
for ($i = 0; $i < $iterations; $i++)
{
    $str2 = str_replace($aStr,'',$str1);
}
$end = microtime(true);
$test1_loop_time = $end - $begin;

echo ' *** str_replace:  ';
// subtract the loop overhead, then convert to microseconds per iteration
$test_time = ($test1_loop_time - $empty_loop_time) * 1000000 / $iterations;
echo "$test_time microseconds.\n";

// test2 loop
$begin = microtime(true);
for ($i = 0; $i < $iterations; $i++)
{
    $str2 = preg_replace('/[^a-zA-Z\s]/','',$str1);
}
$end = microtime(true);
$test2_loop_time = $end - $begin;

echo ' *** preg_replace: ';
$test_time = ($test2_loop_time - $empty_loop_time) * 1000000 / $iterations;
echo "$test_time microseconds.\n";

?>
up
-1
henke at henke37 dot cjb dot net
2 years ago
Warning: not all strings are from a regular language, some are from context free grammars.

Here is a pair of strings that are impossible to correctly parse with regular expressions:

a*(b+(c*d)-e)/(f-(g*h)+i)-j

a[i]b[i]c[/i]d[/i]e[i]f[i]g[/i]h[/i]
up
-1
erik dot stetina at gmail dot com
3 years ago
simple function to remove comments from string

<?php
function remove_comments( & $string )
{
    $string = preg_replace("%(#|;|(//)).*%","",$string);
    $string = preg_replace("%/\*(?:(?!\*/).)*\*/%s","",$string); // google for negative lookahead
    return $string;
}
?>

USAGE:
<?php
$config = file_get_contents("config.cfg");
print "before:".$config;
remove_comments($config);
print "after:".$config;
?>

OUTPUT:
before:
/*
*  this is config file
*/
; logdir
LOGDIR ./log/
// logfile
LOGFILE main.log
# loglevel
LOGLEVEL 3
after:

LOGDIR ./log/

LOGFILE main.log

LOGLEVEL 3
up
-2
David
6 years ago
Take care when you try to collapse whitespace in UTF-8 text. Using something like:

<?php
$text = preg_replace( "{\s+}", ' ', $text );
?>

broke, in my case, the letter à, which is hex c3a0 in UTF-8: the byte a0 on its own can be treated as whitespace. So use

<?php
$text = preg_replace( "{[ \t]+}", ' ', $text );
?>

to collapse all spaces and tabs, or better, use a multibyte function like mb_ereg_replace.
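
Another option that may be worth trying, as a sketch: add the /u modifier so that the subject is treated as UTF-8 rather than as single bytes. The sample text below is made up and writes "à" with explicit bytes so that it does not depend on the encoding of the source file.

```php
<?php
// "voilà" and "là" written with explicit UTF-8 bytes (à = \xc3\xa0)
$text = "voil\xc3\xa0  \t l\xc3\xa0";

// With /u the subject is decoded as UTF-8, so \s is tested against whole
// characters and never against the lone byte \xa0 inside "à".
$clean = preg_replace('/\s+/u', ' ', $text);

var_dump($clean === "voil\xc3\xa0 l\xc3\xa0"); // bool(true)
```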
up
-2
zzapper
2 years ago
Yes, you can use different pattern delimiters, which is useful when working on a URL:

<?php
$logo = preg_replace('#\\.\\./images#','/images',$logo);
?>
up
-4
Svoop
5 years ago
I have written a short introduction and a colorful cheat sheet for Perl Compatible Regular Expressions (PCRE):

http://www.bitcetera.com/en/techblog/2008/04/01/regex-in-a-nutshell/
up
-6
roscoe-p-coltrane
4 years ago
Reading arguments against variables in CSS got me to thinking.

Process your CSS files in a fashion similar to the following. This particular routine is certainly not the most efficient way, but is what I came up with on the spur of the moment. The prefix token is entirely arbitrary - everything between the leading colon and terminating semicolon is the target. In this way, default values can be put in place, and the constant identifiers simply left as comments, should the stylesheet be used without processing; this would also inhibit your editor from emitting errors about your odd syntax. The declaration pattern at the top assumes something like this:

/*@css_const
[
     bgc_a=#ccccee,
        fc_a=#000099,
     bgc_b=#5555cc,
     bgc_c=#eeeeff,
     bgc_d=#599fee
]
*/

...within the target CSS file.

Usage like so:

.Element {
     font-size:10pt;
     color:#000/*fc_a*/;
     background-color:#fff/*bgc_a*/;
}

And then...

<?php
$dec_pat = '/^\/\*\@css_const\s+\[(.*)\]\s+\*\//Ums';
preg_match_all($dec_pat,$css,$m);
$lhs = array();
$rhs = array();
foreach($m[1] as &$p) {
    $p = explode(",",$p);
    foreach($p as &$q) {
        list($k,$v) = explode("=",trim($q));
        $lhs[] = '/(\w+\:).*\/\*' . $k . '\*\/;$/Um';
        $rhs[] = '\1' . $v . ';';
    }
}
$css = preg_replace($lhs,$rhs,$css);
// spit it out or return it; whatever
?>

...resulting, of course, in:

.Element {
     font-size:10pt;
     color:#000099;
     background-color:#ccccee;
}

Again, efficiency was not the immediate goal, so please don't slay me...
up
-3
randall dot reynolds at rightnow dot com
3 years ago
The instructions say, use \\\\ (four backslashes) to represent a backslash.  You can shorthand this and use \\\ (three backslashes) to represent a backslash, because of the way the parsers read it.

It appears the double escaping is required because the string is parsed twice, once by PHP and once by the regular expression generator. Thus it is expected that the PHP parser turns both instances above into \\ (two backslashes) for the PCRE parser.
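
A quick way to see this for yourself, as a small sketch with a made-up Windows-style path. It uses single-quoted strings, where only \\ and \' are escape sequences.

```php
<?php
// In a single-quoted PHP string, "\\" yields one backslash, and a backslash
// before any other character (here "/") is kept as-is. Both literals
// therefore produce the same four characters: / \ \ /
var_dump('/\\\\/' === '/\\\/'); // bool(true)

// The regex engine then reads \\ as "one literal backslash".
$path = preg_replace('/\\\\/', '/', 'C:\\dir\\file.txt');

echo $path; // C:/dir/file.txt
```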