Re: insert PDF table in database [message #181560 is a reply to message #181555] |
Tue, 21 May 2013 19:48 |
J.O. Aho
Messages: 194 Registered: September 2010
On 21/05/13 18:27, Michael Vilain wrote:
> In article <b6c1cfb3-1f8b-48c5-8822-25d10402d896(at)googlegroups(dot)com>,
> sarika <sarikasoni12(at)gmail(dot)com> wrote:
>
>> Hi All
>>
>> What I want is to read the content of a PDF table and convert it into either XML or
>> an associative array to be inserted into a database on the fly.
>>
>> I have gone through many libraries on the net that extract text from PDFs
>> and convert it to an array, but that array does not seem useful, as it is not
>> an associative array and the array indexing is not right either.
>>
>> Thanks in advance for the replies; I am really stuck on this major
>> issue.
>> My project manager wants me to implement this as soon as possible.
>
> I ran across this problem with various bank statements that I downloaded
> via my bank's personal web site. The PDFs were encrypted and set with
> certain properties that didn't allow scanning of the text layer. Unless
> you are able to decrypt and do OCR on the PDFs, you're wasting your time
> here. The problem isn't as simple as your manager would think. At
> best, you could offer a partial solution of being able to scan "some"
> PDF files but without libraries to decrypt and OCR the text, that's all
> you can do.
>
> Those libraries are probably on-line somewhere for a fee. Buy the
> solution if you're in a time crunch. Beating the fastest horse on your
> team is poor project management skills and won't get him the code any
> faster.
>
Most likely the company told their Western customer that they can do this,
then a manager gets the task of seeing to it that his team solves the problem,
and the work is then pushed to a "shadow resource" who looks for solutions
online. If that person doesn't manage to solve the issue, there are always
hundreds of others to replace them with. At least that is my experience of how
things work in India.
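
As for the original question: if the text layer is readable and the extraction library already hands back each table row as a plain indexed list, turning that into associative arrays is the easy part. Below is a rough sketch in Python (a dict is the equivalent of an associative array), assuming the first extracted row is the header. The function names, the `pdf_rows` table and the `name`/`amount` columns are made up for illustration; none of this deals with encrypted or scanned PDFs.

```python
import sqlite3


def rows_to_records(rows):
    """Turn [header, row, row, ...] into a list of dicts keyed by the header."""
    header = [cell.strip() for cell in rows[0]]
    records = []
    for row in rows[1:]:
        # Skip rows whose cell count does not match the header; these are
        # usually wrapped lines or page headers picked up by the extractor.
        if len(row) != len(header):
            continue
        records.append(dict(zip(header, (cell.strip() for cell in row))))
    return records


def insert_records(conn, records):
    """Insert the dicts into a hypothetical two-column table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_rows (name TEXT, amount TEXT)"
    )
    # Named placeholders are filled from the dict keys of each record.
    conn.executemany(
        "INSERT INTO pdf_rows (name, amount) VALUES (:name, :amount)",
        records,
    )
    conn.commit()


if __name__ == "__main__":
    # Hypothetical extractor output: a header row followed by data rows.
    extracted = [["name", "amount"], ["Alice", " 10 "], ["Bob", "20"]]
    conn = sqlite3.connect(":memory:")
    insert_records(conn, rows_to_records(extracted))
    print(conn.execute("SELECT name, amount FROM pdf_rows").fetchall())
```

The same header-plus-rows mapping can be done in PHP with array_combine() on each row.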
--
//Aho