Number mangling reportedly not a Xerox-only issue

The question of whether the number mangling in scanned documents only occurs on Xerox devices seems to have been answered. I got an email from a Brother customer telling me he's able to replicate the issue using my test number sheets on a Brother MFC-9140CDN. He attached the resulting scan and indeed, there is at least one 6 substituted by a nice, clean 8.

Edit: Brother replied and stated that their devices are not affected by number mangling. At first glance, the Brother MFC seems to substitute far fewer numbers than the Xerox machines. Of course, I haven't been able to reproduce the error on the machine myself, and the device ID is not shown in the PDF (it only reads “Paperport 12”), so I can't really tell where and how the data has been processed. Please take the information as hearsay. Also, we can never be sure that the mails I get are not part of some spin doctoring campaign, so be suspicious. I'll be adding potentially affected devices to my “hear-say list of affected devices” though. 8-)

What's more, until now it looked like the newer Xerox devices (WorkCentre 78xx) would not be affected. Unfortunately, I may have to destroy this illusion, as I got an email from someone familiar with the topic telling me that one of his customers is able to replicate the problem on eighteen 78xx machines. However, this is no real confirmation at all, as I haven't seen any scans from these. I'm also investigating which particular 78xx devices are affected.

Next thing: Xerox seems to be informing their support staff right now. From someone who says he is a Xerox support employee (I can't verify this), I got a list of potentially affected devices, as Xerox seems to have investigated the matter themselves. Let me cite the list:

  • ColorQube 87XX / 89XX
  • WorkCentre 57XX
  • WorkCentre 76XX
  • ColorQube 92XX / 93XX
  • WorkCentre 58XX
  • WorkCentre 77XX
  • WorkCentre 5030/5050
  • WorkCentre 6400
  • WorkCentre 78XX (there is something interesting here: I'm getting emails that the 7855 model, unlike other models, does NOT give any warning about character substitution when you select “Normal” on the control panel of the machine.)
  • WorkCentre 51XX
  • WorkCentre 7220/7225
  • WorkCentrePro 2XX / BookMark 40/55
  • WorkCentre 56XX
  • WorkCentre 75XX

Again: take it as hearsay. Imagine that Xerox could have scanned the list on one of their own devices. LOL

Edit: Xerox has been preparing some Q&A sheets, have a look:

Comments

Due to heavy spam, I need to block the comment feature for some time.

I have tested this with the WorkCentre 7655 with various scan settings (200dpi, 300dpi, normal quality, high quality, OCR on/off). Even using the settings recommended by Xerox in their “ScanningAppendixB.PDF”, the problem continues to exist. Default settings of 300dpi/higher quality still produce number switching, which contradicts the statement in Xerox's Scanning QA document: “You will not see a character substitution issue when scanning with the factory default settings.” (from Question 6 in their document)

I am about to run tests with scanners from two other brands to see if I can replicate the problem there.

1 | J.Colbert | 2013/08/08 18:26

I have experimented with the open source jbig2enc library available at http://github.com/agl/jbig2enc, which has an encoding parameter called the “threshold”, described like this:

“sets the fraction of pixels which have to match in order for two symbols to be classed the same. This isn't strictly true, as there are other tests as well, but increasing this will generally increase the number of symbol classes”

The included command-line tool accepts values for this parameter between 0.4 and 0.9, with 0.85 as the default.

I have found replaced digits in single-page numerical tables encoded with this parameter set as high as 0.82. As with the other examples you have found, the errors are not in any way obvious to the eye, which is, of course, the real problem.
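To illustrate why a matching threshold of this kind can merge two different digits, here is a toy sketch in Python. It is not jbig2enc's actual classifier (which, as the quoted description says, applies other tests as well); the 5x7 glyphs, the function names and the naive "fraction of identical pixels" measure are all assumptions made up for this example.

```python
# Toy illustration only: hypothetical 5x7 bitmaps and a naive similarity
# measure. This is NOT the algorithm used by jbig2enc.

SIX = [
    ".###.",
    "#....",
    "#....",
    "####.",
    "#...#",
    "#...#",
    ".###.",
]

EIGHT = [
    ".###.",
    "#...#",
    "#...#",
    ".###.",
    "#...#",
    "#...#",
    ".###.",
]


def match_fraction(a, b):
    """Return the fraction of pixels that are identical in two equally sized bitmaps."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def same_class(a, b, threshold):
    """Treat two symbols as the same class if enough of their pixels match."""
    return match_fraction(a, b) >= threshold


if __name__ == "__main__":
    fraction = match_fraction(SIX, EIGHT)
    print(f"matching pixels: {fraction:.3f}")
    print("merged at 0.85:", same_class(SIX, EIGHT, 0.85))
    print("merged at 0.95:", same_class(SIX, EIGHT, 0.95))
```

In these made-up glyphs only 3 of the 35 pixels differ, so a purely pixel-based comparison at 0.85 would store both digits as one symbol, and the decoder would then draw the same clean shape in both places. The 0.95 value is only there to show the effect of a stricter cutoff in this toy model; it is outside the range the real tool accepts.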

Since JBIG2 has been supported in PDF since 2001, it would be surprising if only Xerox have fallen into this trap.

2 | Hans Liss | 2013/08/09 09:12