I have a UTF-8 string in which I want to find all occurrences of img_(\d+). I have tried the standard preg_match_all:
$pattern = '/img_(\d+)/u';
preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE);
but it gives me the wrong offsets for the matches.
I have also tried:
mb_internal_encoding('UTF-8');
$pattern = 'img_(\d+)';
mb_ereg_search_init($content, $pattern);
$matches = [];
while ($result = mb_ereg_search_regs()) {
    $matches[] = [
        'match' => $result[0],
        'offset' => mb_ereg_search_getpos() - mb_strlen($result[0]),
    ];
}
but it gives me the same result as preg_match_all.
However, when I search manually with this:
$pos = mb_strpos($content, "img_1", 0);
I get the correct offset.
Example code:
$str = "příliš žluťoučký img_1 kůn úpěl ďábelskéódy";
$pattern = '/img_(\d+)/u';
preg_match_all($pattern, $str, $matches, PREG_OFFSET_CAPTURE);
print_r($matches); // gives 24 (wrong)
echo mb_strpos($str, "img_1", 0); // gives 17 (correct)
How can I fix this?
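For context, the offsets that PREG_OFFSET_CAPTURE reports are byte offsets into the subject string, while mb_strpos counts characters, which would explain the 24 versus 17 discrepancy for a string with multibyte characters. One possible workaround, shown here only as a minimal sketch, is to cut the subject at the byte offset with substr and count the characters of that prefix with mb_strlen. The helper name preg_match_all_char_offsets is hypothetical (it is not a PHP built-in), and the sketch assumes the subject is valid UTF-8.

```php
<?php
// Hypothetical helper, not part of PHP: runs preg_match_all() and converts
// the byte offsets from PREG_OFFSET_CAPTURE into character offsets.
// Assumes $subject is valid UTF-8.
function preg_match_all_char_offsets(string $pattern, string $subject): array
{
    $found = [];
    if (!preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE)) {
        return $found; // no matches, or the pattern failed
    }
    foreach ($matches[0] as [$text, $byteOffset]) {
        // substr() works on bytes, so this is exactly the prefix before the
        // match; mb_strlen() then counts that prefix in characters.
        $found[] = [
            'match'  => $text,
            'offset' => mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8'),
        ];
    }
    return $found;
}

$str = "příliš žluťoučký img_1 kůn úpěl ďábelskéódy";
$result = preg_match_all_char_offsets('/img_(\d+)/u', $str);
echo $result[0]['offset']; // same character offset that mb_strpos($str, "img_1") reports
```

Each conversion rescans the prefix, so for subjects with very many matches it may be worth counting characters incrementally from the previous match instead.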