PDF Database when going paperless

Raizinman

Platinum Member
Sep 7, 2007
2,353
74
91
meettomy.site
Our office has just decided to go paperless. We have about 10,000 reports (about 25 pages each report) that we are scanning into the computer. We do need to reference these reports.

What would you recommend as a way to build a database of these 10,000 reports?

We are naming each scanned report with the company name and date, such as AcmeCompany101517.pdf or BakerCompany121516.pdf.

Any thoughts on how to handle this data?
 

dclive

Elite Member
Oct 23, 2003
5,626
2
81
Much of your answer depends on how you plan to use the software, and you haven't provided that information. For example, if it's two people, and the Windows guys give you a server with a \\Server\PDFStuff share where you can make your own subdirectories to organize things, is that sufficient, or do you need more?

I know nothing about the software I'm about to mention, but I thought your question was interesting and did some googling for a few minutes.

https://www.foxitsoftware.com/products/pdf-ifilter/
https://blogs.msdn.microsoft.com/opal/2010/02/09/pdf-ifilter-test-with-sharepoint-2010/

Please let us know what you find.
 

Raizinman

Our company is relatively small (10 employees), and we do not use SharePoint. Judging by the demo on their website, the Foxit software (version 3.0) appears to do the quickest and best searches. It's a little pricey at around $600 or so, but I think the company will go for it if the 30-day free trial goes well. Thanks, everyone.
 

Kaido

Elite Member & Kitchen Overlord
Feb 14, 2004
48,518
5,340
136
Google Drive for Business (G Suite) is awesome, if you don't mind a (reasonable) recurring monthly fee:

https://gsuite.google.com/products/drive/

They can do OCR on PDF files & their search is, of course, unparalleled:

https://drive.googleblog.com/2010/06/optical-character-recognition-ocr-in.html

If you really want to get crazy about it, I've done some projects with Square9 & it is pretty dang awesome, although it is big-budget stuff:

https://www.square-9.com

On a tangent, Snagit just updated to version 2018 & added OCR back in, so you can do text capture from images (e.g., to snag text from pictures & PDFs that haven't been OCR'd yet):

https://www.techsmith.com/snagit-2018-press-release.html
 

sdifox

No Lifer
Sep 30, 2005
96,217
15,787
126
Are these your reports or someone else's? You don't have the original Word docs?

Also, do you guys have a filing system set up for your paper originals? Just because you digitise them doesn't mean you have to scrap the filing system.
 

Malogeek

Golden Member
Mar 5, 2017
1,390
778
136
yaktribe.org
Do you need to find them purely by folder and file name, or do you need to be able to search the contents of the PDF files as well? That decision will give you a clearer idea of how complex the filing needs to be.
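
If folder and file name turn out to be enough, here is a minimal Python sketch of pulling the company and date back out of names like AcmeCompany101517.pdf. It assumes the six digits are MMDDYY (so 101517 would be Oct 15, 2017), which is only a guess from the two examples given. `parse_report_name` and `ReportName` are made-up names for illustration, not part of any product mentioned in this thread.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Assumed pattern: company name, then six digits read as MMDDYY, then ".pdf".
FILENAME_RE = re.compile(r"^(?P<company>.+?)(?P<stamp>\d{6})\.pdf$", re.IGNORECASE)


class ReportName(NamedTuple):
    company: str
    report_date: date


def parse_report_name(filename: str) -> Optional[ReportName]:
    """Return (company, date) for a matching file name, or None if it does not fit."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    try:
        # %m%d%y is the assumed MMDDYY layout; an impossible date raises ValueError.
        parsed = datetime.strptime(match.group("stamp"), "%m%d%y").date()
    except ValueError:
        return None
    return ReportName(match.group("company"), parsed)
```

Running this over a directory listing would let you sort or filter by company and date, and any name that comes back as None is a file that was named inconsistently and needs a second look.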
 

Raizinman

Part of the reason we are going digital is that we have 22 four-drawer file cabinets and are out of space. Also, too many files have been put in the wrong place or destroyed. Going digital eliminates all these problems. We need to search the reports by word, which is why we have been OCR'ing all the reports so that we can search inside each one. We tried the demo of Foxit PhantomPDF and it appears to suit our needs; we just need to get over the learning curve to work the software.
 

Mike64

Platinum Member
Apr 22, 2011
2,108
101
91
OK, you hadn't actually mentioned that you've been OCRing them, just that they'd been "scanned".

If you need to be able to search for literally any word(s) that appear anywhere in one or more of the docs, be sure whatever software you end up with can/will index the entire collection, preferably in the background and preferably as documents are added to the database/file structure, rather than literally searching through every document word-by-word every time someone needs to look for something. Otherwise a lot of those searches will (more or less randomly) take forever to run with that many reports to go through. If it's feasible in your circumstances, also consider using frequently used "tags" or "keywords", leaving "search entire file contents" as a last-resort option for when the title/filename and/or keywords aren't enough...
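
To show what "index the entire collection" means in practice, here is a rough Python sketch of an inverted index: each word maps to the set of reports that contain it, so a search is a lookup instead of a scan of every file. It assumes the OCR text of each report is already available as a plain string; `ReportIndex` is a made-up name for illustration, and this is not how Foxit or any other product mentioned here actually works.

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Words are runs of letters and digits, compared case-insensitively.
WORD_RE = re.compile(r"[a-z0-9]+")


class ReportIndex:
    """Toy inverted index: word -> set of report names containing it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, name: str, text: str) -> None:
        # Index a report once, when it is added, rather than at search time.
        for word in WORD_RE.findall(text.lower()):
            self._postings[word].add(name)

    def search(self, query: str) -> Set[str]:
        # Return the reports that contain every word in the query.
        words = WORD_RE.findall(query.lower())
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result
```

Calling `add` as each report is scanned keeps the index current, and `search` then costs about the same whether there are 100 reports or 10,000. A real product would also store the index on disk and handle phrases and misspellings from OCR, which this sketch leaves out.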
 