Xerox announces software patch

One day after their first statement, which in essence contained information already known at the time, Xerox has now published a second one.

In this statement, Xerox announces that a software patch will be developed which appears to eliminate the compression mode in question from their scan copiers entirely. Pretty nice! Even though it is the most radical option, it is the one I wished for, because, as I have pointed out several times, patch-based lossy compression seems uncalled for in document scan copiers. It's also a big leap for Xerox, as their glossy brochures have been praising the compression mode in question as a top-notch feature.

Almost nobody is short of disk space any more. Aside from this, customers having no legal certainty about anything they scanned in this mode, or people fearing they might relieve the pension office of its monthly payments to ol' grandma just by scanning her medicine prescriptions, is probably not what Xerox originally intended at all.

The day after

… seems not to be today.

I was somewhat expecting the Xerox number-mangling issue to lose momentum today. That would have been okay, since it is not my goal to inflict any damage on Xerox. I really appreciated how they got in touch and listened, and this is why I tried to help as far as I could.

Only, the internet doesn't forget. Read the sometimes harsh comments below the Xerox statement! People do not seem to agree that a small notice in the web interface, shown only when changing the compression level to “normal” (sic!), makes up for what may be years of subtly incorrect documents produced at possibly thousands of enterprises worldwide. As a result, and because the mass media have now picked up the story as well, this website has been getting up to 160 hits a minute all day. A friend of mine condensed the issue in a wonderful way I do not want to withhold from you:

Conference call with Xerox

This evening, I had a conference call of about half an hour with

  • Rick Dastin, Corporate Vice President Office and Solutions, and
  • Francis Tse, Imaging System Architect at Xerox Corporation.

First, I'd like to point out that the atmosphere of the call was very relaxed and easy-going. Above all, both sides were listening to each other, or at least that is how I felt (Mr. Dastin, Mr. Tse, feel free to object ;-) ). I highly appreciate the way Xerox is dealing with the issue, as not all enterprises would handle it in such a friendly way. We all know the stories of enterprises shooting the messenger over a blog post like this.

Facts first:

  1. The suggested workaround is indeed a workaround, as it switches off JBIG2.
  2. The main problem was, and still is, a support problem, which would not have occurred if Xerox support had known their machines.

Now for the finer-grained facts. The Xerox design for scanning modes has three levels: two standard levels (“high” and “higher”) and one that gives us small file sizes but deliberately neglects data integrity (named “normal”). The “normal” setting uses JBIG2 (as suspected) and may therefore indeed mangle characters. The “higher” and “high” levels use a different compression, which also explains why the image quality may actually decrease when switching from “higher” to “high”. That is another counter-intuitive thing, which we also discussed.
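To illustrate why a patch-based lossy mode can swap characters at all, here is a toy sketch in Python. It is not the JBIG2 codec and not Xerox's implementation. The 5x5 glyph bitmaps, the function names and the `max_diff` threshold are all made up for this example. The idea: every patch is compared with the patches already stored in a dictionary, and if it is "similar enough" to one of them, only a reference to the stored patch is kept.

```python
# Toy model of patch-based lossy compression (hypothetical, NOT real JBIG2).
# Glyphs are tiny bitmaps; '#' is a black pixel, '.' is a white pixel.

SIX = (".###.",
       "#....",
       "####.",
       "#...#",
       ".###.")

EIGHT = (".###.",
         "#...#",
         ".###.",
         "#...#",
         ".###.")

ONE = ("..#..",
       ".##..",
       "..#..",
       "..#..",
       ".###.")


def hamming(a, b):
    """Count the pixels in which two equally sized patches differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode(patches, max_diff):
    """Store each patch once; reuse a stored patch if it is close enough.

    max_diff is a made-up similarity threshold. With max_diff=0 only
    identical patches are merged, so nothing is lost.
    """
    dictionary = []
    indices = []
    for patch in patches:
        for i, known in enumerate(dictionary):
            if hamming(patch, known) <= max_diff:
                indices.append(i)  # reuse: this is where a swap can happen
                break
        else:
            dictionary.append(patch)
            indices.append(len(dictionary) - 1)
    return dictionary, indices


def decode(dictionary, indices):
    """Rebuild the page from dictionary references."""
    return [dictionary[i] for i in indices]


page = [SIX, EIGHT, ONE]

lossy_dict, lossy_idx = encode(page, max_diff=3)
lossy_page = decode(lossy_dict, lossy_idx)

exact_dict, exact_idx = encode(page, max_diff=0)
exact_page = decode(exact_dict, exact_idx)
```

In this toy setup the "6" and the "8" differ in only two pixels, so with `max_diff=3` the "8" is stored as a reference to the "6" and comes back as a perfectly crisp "6". The output looks clean, which is exactly why such errors are so hard to spot.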

Whether one needs a compression level in scanners that neglects data integrity is open to debate. You need to form your own opinion on this. The key exchange concerning JBIG2 from the conference call was the following (quoted from memory, but I think these were the words):

David Kriesel: “If you give me a document encoded with JBIG2, and I claim it's incorrect, you can't prove me wrong.”

Francis Tse: “Yes, you're right, it's a probabilistic thing.”

To be perfectly fair: the “normal” setting is not the default setting (in particular, in the company where I first encountered the error, somebody must have set scanning to “normal”), and there is a warning message in the web interface, albeit a small one; see also the screenshot in the workaround blog post.

Only, I don't think this lets Xerox off the hook. Personally, I would never ever implement patch-based image compression algorithms for text data that might need legal certainty, which I also stated during the call. We did agree, however, that the names of the compression levels may be misleading. Somebody might always think “hey, the normal setting is enough for me”, overlook the small warning in the web interface about character substitutions, and go on with business (in fact, this is exactly what lots and lots of people seemingly did, but more on this later, when I also propose the only two solutions I can think of).