Just out of curiosity...
There are 6 messages totalling 301 lines in this issue.
Topics of the day:
1. OT: Text scanning (6)
----------------------------------------------------------------------
Date: Mon, 9 May 2011 07:44:45 +0100
From: Doug Browne
Subject: Re: OT: Text scanning
I use an Epson multi-function with VelOCRaptor and find it works well.
As Brian says, the paper MUST be flat on the scanner. This is no problem for individual sheets, but tricky if half a book is hanging out of the scanner. I try to make the page flat by propping the book up with a pile of paperbacks.
Doug
On 9 May 2011, at 04:36, Brian Ferguson wrote:
> The answer is 'Yes'.
>
> I have used VueScan at various times despite having scanning applications with an Epson TX710W Printer/Scanner.
> It is extremely simple to use and not too expensive. I don't think there is an available scanner missing from VueScan's lists, although sometimes the name of your scanner may differ from VueScan's names because names vary between countries. E.g. my Epson printer/scanner is known as Artisan.
>
> You can scan to TIFF or PDF; if the latter, then PDFpen Pro will help for OCR and corrections. In PDF Pro you can create multi-page documents by using the 'Input | Multi page' option. You can preview multiple pages, rearrange the order, and then save the pages to a TIFF or PDF file.
>
> For VueScan, go to <www.hamrick.com>
>
> It is updated weekly.
>
> Having the paper flat on the scanner table is essential.
>
> ======
>
> On 06/05/2011, at 11:56 PM, THDW wrote:
>
>> Listers
>>
>> I am in the process of scanning old books.
>>
>> I am using a five-year-old Canon printer scanner and then, once the MP Navigator has scanned, I use VelOCRaptor to do the OCR work.
>>
>> I sometimes find that entire pages need revision. I am therefore wondering whether there has been an improvement in scanners for OCR work over the last few years. Or is my problem quite simply that I am not getting the text flat enough against the scanner window?
>>
>> Would be grateful for any ideas for making text scanning less tedious.
------------------------------
Date: Mon, 9 May 2011 08:05:36 +0100
From: Alan & Elaine
Subject: Re: OT: Text scanning
Hi Tim,
Any scanner with 200 dpi B&W or greyscale resolution is adequate for OCR scanning. That means just about every scanner on the market. All readers should cope with imperfectly angled scans, and a good OCR program will also cope with imperfectly flattened pages (e.g. thick book scanning), but if you are getting misreads from the well-flattened areas, you have lousy software. There are some programs out there that use wild contextual word-guessing to compensate for poor actual character recognition. If you find your software guessing – for instance, turning 'bard' into 'bird' – dump it; a good reader will call for your intervention on ALL glyphs it has problems reading and not fake its claimed reading score with contextual guesswork.
At the time I was involved in the OCR business, Omnipage was the clear leader. It probably still is, but the bells and whistles of its current iteration exceed most people's needs, and it charges for its quality. I have found the ABBYY FineReader software, provided FREE of charge with my Epson scanner, totally adequate for my current needs, with only badly inked characters being queried. I presume Canon reading software must be on a par with the Epson offering to stay competitive. Do you perhaps need to download an update to your scanner's drivers? I've not tried VelOCRaptor.
All the best,
Alan
On 6 May 2011, at 14:56, THDW wrote:
> Listers
>
> I am in the process of scanning old books.
>
> I am using a five-year-old Canon printer scanner and then, once the MP Navigator has scanned, I use VelOCRaptor to do the OCR work.
>
> I sometimes find that entire pages need revision. I am therefore wondering whether there has been an improvement in scanners for OCR work over the last few years. Or is my problem quite simply that I am not getting the text flat enough against the scanner window?
>
> Would be grateful for any ideas for making text scanning less tedious.
>
> T
------------------------------
Date: Mon, 9 May 2011 08:54:36 +0100
From: Alan & Elaine
Subject: Re: OT: Text scanning
On 6 May 2011, at 14:56, THDW wrote:
> once the MP Navigator has scanned, I use
Does the above mean that you are using a two-stage process (scanning to a TIFF or similar document and only then bringing the OCR program into play)? Each stage will introduce some loss of definition. You might do better doing everything from the OCR interface, i.e., reading directly off the scanner glass.
Cheers
Alan
------------------------------
Date: Mon, 9 May 2011 10:04:21 -0400
From: THDW
Subject: Re: OT: Text scanning
On 9 May 2011, at 03:54, Alan & Elaine wrote:
> Does the above mean that you are using a two-stage process (scanning to a TIFF or similar document and only then bringing the OCR program into play)? Each stage will introduce some loss of definition. You might do better doing everything from the OCR interface, i.e., reading directly off the scanner glass.
I scan with the Canon programme, then save multiple pages as a .pdf.
I then drop the .pdf file on VelOCRaptor.
Can't help thinking that putting a pun into a programme's name along with the four letters C R A P is not the best of marketing policies.
T
THDW
------------------------------
Date: Mon, 9 May 2011 15:58:04 +0100
From: Alan & Elaine
Subject: Re: OT: Text scanning
On 9 May 2011, at 15:04, THDW wrote:
> On 9 May 2011, at 03:54, Alan & Elaine wrote:
>
>> Does the above mean that you are using a two-stage process (scanning to a TIFF or similar document and only then bringing the OCR program into play)? Each stage will introduce some loss of definition. You might do better doing everything from the OCR interface, i.e., reading directly off the scanner glass.
>
> I scan with the Canon programme, then save multiple pages as a .pdf.
>
> I then drop the .pdf file on VelOCRaptor.
>
> Can't help thinking that putting a pun into a programme's name along with the four letters C R A P is not the best of marketing policies.
Perhaps we should all welcome the marketeer's honesty. ;)
Passing through a PDF phase might be the problem. Don't you have a means to do the character recognition off the screen before saving it in a printable format? I always scan from within the OCR program.
Alan
------------------------------
Date: Mon, 9 May 2011 16:16:25 +0100
From: Doug Browne
Subject: Re: OT: Text scanning
I scan to .tiff and drop the file onto VelOCRaptor, which works fine.
I always have good results from this program and certainly do NOT consider it crap.
By the way, the program Alan said he uses is currently $99.
Doug
On 9 May 2011, at 15:58, Alan & Elaine wrote:
> On 9 May 2011, at 15:04, THDW wrote:
>
>> On 9 May 2011, at 03:54, Alan & Elaine wrote:
>>
>>> Does the above mean that you are using a two-stage process (scanning to a TIFF or similar document and only then bringing the OCR program into play)? Each stage will introduce some loss of definition. You might do better doing everything from the OCR interface, i.e., reading directly off the scanner glass.
>>
>> I scan with the Canon programme, then save multiple pages as a .pdf.
>>
>> I then drop the .pdf file on VelOCRaptor.
>>
>> Can't help thinking that putting a pun into a programme's name along with the four letters C R A P is not the best of marketing policies.
>
> Perhaps we should all welcome the marketeer's honesty. ;)
>
> Passing through a PDF phase might be the problem. Don't you have a means to do the character recognition off the screen before saving it in a printable format? I always scan from within the OCR program.
>
> Alan
------------------------------
End of NISUS Digest - 8 May 2011 to 9 May 2011 (#2011-4)
********************************************************