FUDforum
Fast Uncompromising Discussions. FUDforum will get your users talking.

Re: no results when a search includes a numerical character - FIXED [message #41036 is a reply to message #41018] Sun, 11 May 2008 10:32
srchild (United Kingdom)
Ilia wrote on Thu, 08 May 2008 00:47

Set locale to C.


I found the locale was already being detected as C.

However, stepping through the code, I found the cause: a bug in the function text_to_worda.

function text_to_worda($text)
{
 $a = array();

 /* if no good locale, default to splitting by spaces */
 if (!$GLOBALS['good_locale']) {
   $GLOBALS['usr']->lang = 'latvian';
 }

 $text = strip_tags(reverse_fmt($text));
 while (1) {
  switch ($GLOBALS['usr']->lang) {
   case 'chinese_big5':
   case 'chinese':
   case 'japanese':
   case 'korean':
    return mb_word_split($text, $GLOBALS['usr']->lang);
    break;

   case 'latvian':
   case 'russian-1251':
    $t1 = array_unique(preg_split('![\x00-\x40]+!', $text, -1, PREG_SPLIT_NO_EMPTY));
    break;

   default:
    $t1 = array_unique(str_word_count(strtolower($text),1,'1234567890'));
    if ($text && !$t1) { /* fall through to split by special chars */
     $GLOBALS['usr']->lang = 'latvian';
     continue;
    }
    break;
  }


The first time through, it finds the locale as C and the language as English, and so, as desired, goes to 'default':

array_unique(str_word_count(strtolower($text),1,'1234567890'));


However, if any message produces no words there and so triggers this fallback:

if ($text && !$t1) { /* fall through to split by special chars */
     $GLOBALS['usr']->lang = 'latvian';
     continue;
    }


then $GLOBALS['usr']->lang is set to Latvian and this persists for the rest of the reindex, affecting parsing of every subsequent message.

When indexing a single message it wouldn't matter that $GLOBALS['usr']->lang gets set to Latvian, since the next message would be a fresh start with it set to English once more. But because the reindex runs through all messages in one script, every subsequent message is processed as though the language were Latvian.
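
The post does not spell out why the 'latvian' branch drops numbers from the index. A likely reason, going by the two expressions quoted above, is that the separator class [\x00-\x40] covers the ASCII digits (\x30-\x39), so digits are treated as word breaks. A minimal sketch comparing the two branches (the sample text is made up for illustration):

```php
<?php
// Sample text is an assumption, not taken from the forum data.
$text = 'Upgrade to PHP5 on port 8080';

// 'default' branch: the digits are passed as extra word characters,
// so numbers survive as words or as parts of words.
$default = array_unique(str_word_count(strtolower($text), 1, '1234567890'));

// 'latvian' branch: every byte from \x00 to \x40 is a separator,
// and that range includes the digits 0-9 (\x30-\x39).
$latvian = array_unique(preg_split('![\x00-\x40]+!', $text, -1, PREG_SPLIT_NO_EMPTY));

print_r($default); // contains 'php5' and '8080'
print_r($latvian); // contains 'PHP' but no digits
```

Once the language has been switched to 'latvian' for the rest of the reindex, every later message goes through the second expression, which would explain why searches containing digits find nothing.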

So I just changed three lines like this:

function text_to_worda($text)
{
 $a = array();

 /* if no good locale, default to splitting by spaces */
 if (!$GLOBALS['good_locale']) {
   $GLOBALS['usr']->lang = 'latvian';
 }

// use local variable for message language
$thismessagelang = $GLOBALS['usr']->lang;

 $text = strip_tags(reverse_fmt($text));
 while (1) {

//  switch ($GLOBALS['usr']->lang) {
//  switch on message language
  switch ($thismessagelang) {

   case 'chinese_big5':
   case 'chinese':
   case 'japanese':
   case 'korean':
    return mb_word_split($text, $GLOBALS['usr']->lang);
    break;

   case 'latvian':
   case 'russian-1251':
    $t1 = array_unique(preg_split('![\x00-\x40]+!', $text, -1, PREG_SPLIT_NO_EMPTY));
    break;

   default:
    $t1 = array_unique(str_word_count(strtolower($text),1,'1234567890'));
    if ($text && !$t1) { /* fall through to split by special chars */

//   if resetting language, do it locally not globally
//   $GLOBALS['usr']->lang = 'latvian';
     $thismessagelang = 'latvian';

     continue;
    }
    break;
                }


This seems to have fixed it for me; my index now includes numbers as required.
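
For anyone who wants to check the idea in isolation, here is a small self-contained sketch of the same approach. It is not the FUDforum function: split_words_sketch and the stdClass stand-in for $GLOBALS['usr'] are hypothetical, and only the two relevant branches are kept. One deliberate difference: the sketch uses "continue 2", because in PHP a bare "continue" inside a switch behaves like "break"; whether that matters in the real function depends on code after the switch that is not shown in this post.

```php
<?php
// Hypothetical stand-in for the forum's global user object.
$GLOBALS['usr'] = new stdClass();
$GLOBALS['usr']->lang = 'english';

// Simplified tokenizer mirroring the patched control flow (illustrative only).
function split_words_sketch($text)
{
    // Copy the language into a local so the fallback cannot leak
    // into the processing of later messages.
    $lang = $GLOBALS['usr']->lang;

    while (1) {
        switch ($lang) {
            case 'latvian':
            case 'russian-1251':
                $words = preg_split('![\x00-\x40]+!', $text, -1, PREG_SPLIT_NO_EMPTY);
                break;

            default:
                $words = str_word_count(strtolower($text), 1, '1234567890');
                if ($text && !$words) {
                    $lang = 'latvian'; // local change only
                    continue 2;        // restart the while loop, not just the switch
                }
                break;
        }
        return array_values(array_unique($words));
    }
}

// '~~~' has no letters or digits, so it triggers the fallback branch.
$first  = split_words_sketch('~~~');
// The next message is still handled by the 'default' branch.
$second = split_words_sketch('Room 101');

print_r($second);
echo $GLOBALS['usr']->lang, "\n"; // the global is left untouched
```

The second call still keeps the number because the fallback during the first call only changed the local copy of the language.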

Thanks


Simon Child