An extremely large hash lookup mechanism [message #172232]
Mon, 07 February 2011 12:06
ram
I have a MySQL table with customerid (BIGINT) and customer_unique_key (VARCHAR).
I have a PHP script that needs to upload customers into groups. The customer_unique_key values will be uploaded, and the matching customerids should be entered into a new table.
Initially my script ran a per-record query to extract each customerid and print it, but this turned out to be too slow, since the upload file may have up to a million records.
I then modified the script to read all customer_unique_key -> customerid pairs into an array. This works fine and fast, but hogs memory and crashes whenever the number of records crosses around 3-4 million.
What is the best way to implement a hash lookup? Should I use a CDB library?
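A sketch of the middle ground between the two approaches described above: look the keys up in chunks, rather than one query per record or one giant in-memory array. Python with an in-memory SQLite table stands in for the PHP/MySQL setup here; the table and column names come from the post, everything else is illustrative.

```python
import sqlite3

# Build a demo table resembling the one described in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid INTEGER, customer_unique_key TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"key-{i}") for i in range(10_000)],
)

def lookup_ids(keys, chunk_size=500):
    """Resolve unique keys to customerids in chunks, so neither a
    per-record query nor a whole-table in-memory array is needed."""
    ids = []
    for start in range(0, len(keys), chunk_size):
        chunk = keys[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        rows = conn.execute(
            f"SELECT customerid FROM customers "
            f"WHERE customer_unique_key IN ({placeholders})",
            chunk,
        )
        ids.extend(r[0] for r in rows)
    return ids

# One round trip per 500 keys instead of one per key.
uploaded = [f"key-{i}" for i in range(0, 10_000, 7)]
print(len(lookup_ids(uploaded)))
```

Memory stays bounded by the chunk size, and the database's index does the hashing.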
Re: An extremely large hash lookup mechanism [message #172233 is a reply to message #172232]
Mon, 07 February 2011 13:15
Peter H. Coffin
On Mon, 7 Feb 2011 04:06:30 -0800 (PST), ram wrote:
> I have Mysql table with customerid (big int) & customer_unique_key
> (varchar )
>
> I have a php script that needs to upload customers into groups.
> The customer_unique_key will be uploaded and all the customerids
> should be entered to a new table.
>
> Initially My script was doing a per-record query to extract
> customerid and print.
> But this turned out to be toooo slow , since the upload file may have
> upto a million records.
>
> Now I modified the script to read all customer_unique_key ->
> customerid as key value pairs into an array
> This works fine and fast , but hogs the memory and crashes whenever
> the number of records crosses around 3-4 million.
>
>
> What is the best way I can implement a hash lookup ? Should I use a
> CDB library ?
No, you should use a real database. Unless you plan on essentially never
adding a customer. But the way you've framed the question (including
babbling about hashes) makes it sound like you've already "solved" your
problem, and have decided you want to use cdb. So, you might as well
have at it, find out that it's not fixing your problem, then go back to
comp.databases.mysql again, and this time start with talking about your
problem instead of asking how to implement your solution.
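Concretely, "use a real database" here could mean staging the uploaded keys in a temporary table and letting one set-based statement resolve every id, with no client-side lookup structure at all. A minimal sketch, using Python with SQLite in place of PHP/MySQL; all names are illustrative, not the poster's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid INTEGER, customer_unique_key TEXT)")
conn.execute("CREATE INDEX idx_key ON customers (customer_unique_key)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(i, f"key-{i}") for i in range(1000)])

# Stage the uploaded keys in a temporary table...
conn.execute("CREATE TEMP TABLE upload (customer_unique_key TEXT)")
conn.executemany("INSERT INTO upload VALUES (?)",
                 [("key-1",), ("key-42",), ("missing",)])

# ...then let one set-based statement resolve every id at once.
conn.execute("CREATE TABLE grouped (customerid INTEGER)")
conn.execute("""
    INSERT INTO grouped
    SELECT c.customerid
    FROM upload u JOIN customers c USING (customer_unique_key)
""")
print([r[0] for r in conn.execute("SELECT customerid FROM grouped ORDER BY customerid")])
```

The join uses the index on customer_unique_key, so the lookup scales with the database, not with client memory.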
--
5. The artifact which is the source of my power will not be kept on the
Mountain of Despair beyond the River of Fire guarded by the Dragons
of Eternity. It will be in my safe-deposit box. The same applies to
the object which is my one weakness. --Peter Anspach "Evil Overlord"
Re: An extremely large hash lookup mechanism [message #172234 is a reply to message #172232]
Mon, 07 February 2011 13:29
Erwin Moller
On 2/7/2011 1:06 PM, ram wrote:
> I have Mysql table with customerid (big int)& customer_unique_key
> (varchar )
>
> I have a php script that needs to upload customers into groups.
> The customer_unique_key will be uploaded and all the customerids
> should be entered to a new table.
>
> Initially My script was doing a per-record query to extract
> customerid and print.
> But this turned out to be toooo slow , since the upload file may have
> upto a million records.
>
> Now I modified the script to read all customer_unique_key ->
> customerid as key value pairs into an array
> This works fine and fast , but hogs the memory and crashes whenever
> the number of records crosses around 3-4 million.
>
>
> What is the best way I can implement a hash lookup ? Should I use a
> CDB library ?
A few general approaches:
1) Increase the memory available to PHP for this script.
2) Start working in batches: simply cut your original file into smaller
parts and run them sequentially through your "inserter".
But it isn't clear to me what your problem is, hence the above advice. ;-)
Please describe in more detail what your source data looks like and what your
table looks like, and also give an example of a single insert you want to do.
(Right now I don't understand why you use the array approach at all, let alone
why it is faster.)
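The batching idea in point 2 can be sketched as a small chunking helper; this is illustrative Python, not the poster's PHP.

```python
from itertools import islice

def batches(iterable, size):
    """Yield successive lists of at most `size` items, so an upload of
    millions of lines never has to sit in memory all at once."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

# Pretend these lines came from the uploaded file.
lines = (f"key-{i}\n" for i in range(10))
for chunk in batches(lines, 4):
    print(len(chunk))  # each chunk goes through the "inserter" in turn
```

The same loop works over a file handle, since files iterate line by line.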
Regards,
Erwin Moller
--
"That which can be asserted without evidence, can be dismissed without
evidence."
-- Christopher Hitchens
Re: An extremely large hash lookup mechanism [message #172237 is a reply to message #172233]
Mon, 07 February 2011 14:22
ram
Sorry about the hashes; I am a Perl programmer, so I'm used to using hashes for everything.
Yes, the CDB lookup is working, but I am not sure I am doing the right thing by selecting all the rows from MySQL into a CDB file and then looking them up there.
Re: An extremely large hash lookup mechanism [message #172239 is a reply to message #172237]
Mon, 07 February 2011 15:43
Erwin Moller
On 2/7/2011 3:22 PM, Ram wrote:
> Sorry about hashes , I am a perl programmer so used to hashes for everything.
>
> Yes the CDB lookup is working , but I am not sure I am doing the right thing by selecting all rows from mysql to cdb and looking up the cdb.
>
Please, please, please, get a working newsreader.
I was expecting more of a Perl programmer than using that Google mess as
an interface to Usenet.
It isn't even working: my newsreader has already lost track of this
conversation and is displaying it in two different threads.
Do you use Firefox?
If you like that, have a look at Thunderbird; it was born in the same nest.
http://www.mozillamessaging.com/en-US/thunderbird/
Erwin Moller
--
"That which can be asserted without evidence, can be dismissed without
evidence."
-- Christopher Hitchens
Re: An extremely large hash lookup mechanism [message #172241 is a reply to message #172239]
Mon, 07 February 2011 17:18
Jerry Stuckle
On 2/7/2011 10:43 AM, Erwin Moller wrote:
>
> Please please please, get a working newsreader.
>
> I was expecting more of a Perl programmer than using that google-mess as
> an interface to usenet.
> It isn't even working. My newsreader already lost track of this
> conversation and is displaying it in two different threads.
>
Not just you, Erwin - he didn't respond to the thread so the link was
lost.
> Erwin Moller
>
>
--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex(at)attglobal(dot)net
==================
Re: An extremely large hash lookup mechanism [message #172246 is a reply to message #172232]
Tue, 08 February 2011 00:28
Jo Schulze
ram wrote:
> Now I modified the script to read all customer_unique_key ->
> customerid as key value pairs into an array
> This works fine and fast , but hogs the memory and crashes whenever
> the number of records crosses around 3-4 million.
The penalty of PHP arrays is that they eat up far more memory than Perl
hashes do. That isn't a problem until you build up structures on the
order of hundreds of thousands of entries, or in your case millions.
Basic answer: don't do it. As suggested, don't fetch a result set with
millions of hits into memory; process the rows one by one.
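A minimal sketch of the one-by-one processing suggested here, in Python with SQLite for self-containment: iterate the cursor instead of materializing the whole result. In PHP the analogous idea would be an unbuffered query, e.g. mysqli's MYSQLI_USE_RESULT mode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customerid INTEGER, customer_unique_key TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(i, f"key-{i}") for i in range(100_000)])

# Iterate the cursor instead of calling fetchall(): rows stream through
# one at a time, so memory use stays flat regardless of result size.
total = 0
for customerid, key in conn.execute(
        "SELECT customerid, customer_unique_key FROM customers"):
    total += 1  # process the row here, then let it go
print(total)
```

No array of 3-4 million entries ever exists; only one row is live at a time.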
Re: An extremely large hash lookup mechanism [message #172248 is a reply to message #172239]
Tue, 08 February 2011 05:38
ram
Sorry.
I sure can use an NNTP client, but I don't know which server would allow me to post.
Google is quite a convenient interface. I know it may still be messing up Usenet style, but I believe the folks at Google are working hard to make it a standard.
Re: An extremely large hash lookup mechanism [message #172250 is a reply to message #172248]
Tue, 08 February 2011 07:06
Erwin Moller
On 2/8/2011 6:38 AM, Ram wrote:
> Sorry
> I sure can use a nntp client ... but I dont know which server would allow me to post.
Call your ISP?
Or maybe ask somewhere, e.g. in here.
I am sure many will be happy to help you with such a problem, even when
it is totally off-topic, if it results in fewer useless threads.
> Google is quiet a convenient interface .. I know it may still be messing up the usenet style , but I believe folks at google are working hard to make it a standard.
>
If you think that Google Groups gives you a convenient interface, I think
our tastes differ.
You seem to be unfamiliar with even the most basic Usenet conventions,
like quoting what it is you are responding to. So I wonder if you have
ever actually used a serious newsreader.
About Google working hard on their usenet interface: Whahahaha!
Excuse me. That was funny.
If the good folks at Google really cared they would have produced
something that worked, long ago.
And since I don't believe they lack competent programmers, I can only
conclude they must lack the will to improve it.
Regards,
Erwin Moller
PS: Did you notice you started yet another fresh (top)thread?
--
"That which can be asserted without evidence, can be dismissed without
evidence."
-- Christopher Hitchens
Re: An extremely large hash lookup mechanism [message #172251 is a reply to message #172250]
Tue, 08 February 2011 07:47
Michael Fesser
.oO(Erwin Moller)
> About Google working hard on their usenet interface: Whahahaha!
> Excuse me. That was funny.
> If the good folks at Google really cared they would have produced
> something that worked, long ago.
> And since I don't believe they lack competent programmers, I can only
> conclude they must lack the will to improve it.
ACK
They bought the old DejaNews archive some years ago, but since then they
haven't really taken care of that treasure. Instead, their own "Google
Groups" just sucks, and every now and then the system completely breaks
down, e.g. the archive search returns wrong results or none at all - for
several months! Not to mention the crappy Web interface itself.
Conclusion: Usenet is not really important to Google, so better use
a real newsserver, for example <http://individual.net> (10€/year).
Micha
Re: An extremely large hash lookup mechanism [message #172254 is a reply to message #172248]
Tue, 08 February 2011 13:14
alvaro.NOSPAMTHANX
On 08/02/2011 6:38, Ram wrote:
> I sure can use a nntp client ... but I dont know which server would allow me to post.
There are several services available, ranging from free to cheap. I
particularly enjoy http://www.eternal-september.org
> Google is quiet a convenient interface .. I know it may still be
> messing up the usenet style , but I believe folks at google are
> working hard to make it a standard.
That's exactly the problem: they'd love their system to become the
de facto standard, just like IE6 became the de facto standard for HTML
(and managed to slow down the world wide web for years).
--
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- My site about web programming: http://borrame.com
-- My glossy humor site: http://www.demogracia.com
--