DataGlyphs® Frequently Asked Questions (FAQ)
Time-stamp: "Mon 02/27/06 10:09:37"
Definitions
Conceptual
Integration
Programming
Parameters
Q. What are DataGlyphs?
A. DataGlyphs are a technology for embedding digital data in printed
documents by printing robust and aesthetically pleasing symbol patterns ("glyphs").
Q. What is DG500?
A. DG500 is a proprietary technical specification for DataGlyphs.
Q. What are glyphtones?
A. DataGlyphs can be aesthetically modulated to resemble a pictorial
image. The technique is called a pictorial glyphtone.

Q. What are serpentones?
A. Serpentones are an alternative markshape for DataGlyphs. Compare these
glyphtone and
serpentone images.
This document
demonstrates the upper limits of color serpentone technology alongside other glyphtone
technologies.

Q. What is partial dotting?
A. To produce a pictorial glyphtone, a graphic image is required as input. When partial
dotting is turned off, each pixel in the input image aesthetically represents a cell in the
generated DataGlyph. When partial dotting is turned on, each pixel in the input image
aesthetically represents a pixel in the generated DataGlyph. Typically people will use
larger input images with partial dotting, and smaller input images when not using partial
dotting. Glyphtones with partial dotting can look slightly better
than their counterparts without partial dotting. This comes
at a slight cost in decoding robustness.

Q. What is a Yellow DataGlyph?
A. Yellow DataGlyphs are a nearly invisible way to embed data
in a printed document. Instead of using traditional techniques such as
ultraviolet or infrared ink, this DataGlyph variant uses yellow
glyphmarks that are small and widely spread out. The resulting DataGlyph
is so unobtrusive that most people won't notice it at all. A color
scanner is able to see the Yellow DataGlyph in the blue channel, and
decode the embedded information. The tradeoff for near-invisibility is
lowered data density, so Yellow DataGlyphs are typically used to store
smaller quantities of information. 
Q. What are address carpets?
A. Address carpets are a special type of DataGlyph which
provide pure positional information, and do not
carry message data at all. Even with a small
capture window, one can determine exactly which portion of the
DataGlyph is imaged. 
Q. What is tiling?
A. Tiling is a technique for repeating a DataGlyph over a large surface
area. Any small portion of the surface may be imaged to extract the data. 
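The idea behind tiling can be illustrated with a small Python sketch. It is a toy model, not the toolkit's method: it treats a DataGlyph as a grid of bits, repeats it over a larger surface, and shows that a window the size of one tile, taken at any offset, contains every cell of the original. The sketch assumes the window's offset is already known; how a real decoder synchronizes is not covered here, and all function names are illustrative.

```python
def tile(pattern, rows, cols):
    """Repeat a small 2-D pattern over a rows x cols surface."""
    h, w = len(pattern), len(pattern[0])
    return [[pattern[r % h][c % w] for c in range(cols)] for r in range(rows)]

def window(surface, top, left, h, w):
    """Capture an h x w window whose top-left corner is at (top, left)."""
    return [row[left:left + w] for row in surface[top:top + h]]

def recover(win, top, left):
    """Rebuild the original tile from a tile-sized window.

    Assumption: the window offset (top, left) is known to the caller.
    """
    h, w = len(win), len(win[0])
    out = [[None] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Undo the cyclic shift introduced by the window offset.
            out[(top + r) % h][(left + c) % w] = win[r][c]
    return out

pattern = [[0, 1, 1],
           [1, 0, 0]]
surface = tile(pattern, 8, 9)
captured = window(surface, 3, 4, 2, 3)
print(recover(captured, 3, 4) == pattern)
```

The window straddles several tiles, yet no cell of the pattern is missing; it is only cyclically shifted.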
Q. What is an MVP score?
A. The MVP score is a quality metric for DataGlyph decoding.
It helps answer questions like "Which of these two DataGlyphs is more
decodable?" and is useful to people who are tuning equipment or
software settings. The highest possible MVP score is 1.0, and anything
below 0.5 will not decode at all. 
Q. How do I create
a DataGlyph?
A. DataGlyphs are created with software contained in the
DataGlyph Toolkit. The input is typically the digital information
being stored, along with parameters that describe how that information is
stored. The output is a DataGlyph, which is typically printed on
a substrate such as paper. This process is called encoding.

Q. How do I read
a DataGlyph?
A. DataGlyphs are read with software. First an image of a DataGlyph
is captured with a scanner, camera, or other 2-D imaging device. That image is given
to software contained in the DataGlyph Toolkit. The software extracts the
information contained in the DataGlyph. This process is called decoding.

Q. Can I read a
DataGlyph with a simple 1-D barcode scanner?
A. No. Many 1-D barcode scanners, such as those
found in supermarket checkout lines to read UPC symbols, are very
simple devices consisting of a light source and a photodiode.
They are not capable of recording a two dimensional image, and
thus fundamentally incapable of decoding a DataGlyph.

Q. What type of
data can I store?
A. DataGlyphs store arbitrary binary data. Whatever bits you put in are
what you'll get back out.

Q. Are DataGlyphs
a type of encryption?
A. No. However, DataGlyphs may store information that has been digitally
signed or encrypted using standard techniques. 
Q. Do DataGlyphs
have built-in compression?
A. No. However, DataGlyphs may store information that has been already
compressed using any technique. 
Q. Can DataGlyphs be
copied?
A. Yes, although some photocopiers will not reproduce very fine DataGlyph
patterns. 
Q. How fast is the
codec?
A. Fast. Processing times are generally proportional to the area of the
DataGlyph being processed. The following benchmarks are informal and apply to
black and white DataGlyphs. Note that working with grayscale or color images
requires significantly more computer memory and processing time than
black and white images.

| Amount of data stored | Encode time (bitmap) | Encode time (font) |
| 20 bytes | 5 ms | 5 ms |
| 1 kilobyte | 13 ms | 9 ms |
| 16 kilobytes | 153 ms | 107 ms |
Benchmark #1: Worst case scenario. Encode a
DataGlyph, on a 733 MHz Pentium III running Linux. These numbers were
generated using the dgencode executable and include the overhead of
the OS creating a new process to handle every encode
operation. Process creation overhead dominates for the small
DataGlyphs.
Average time per encode/decode cycle
| Amount of data stored | 733MHz PIII | 2.8GHz PIV | 1.8GHz Dual Opteron |
| 20 bytes | 5.6 ms | 2.9 ms | 1.5 ms |
| 1 kilobyte | 47 ms | 19 ms | 6.9 ms |
| 16 kilobytes | 750 ms | 400 ms | 93 ms |
Benchmark #2: Encode/decode a DataGlyph image
using the dgthreadtest benchmarking program under Linux, using one thread
per processor. Approximately 90% of the cycle time is used for
decoding, and 10% for encoding.
DataGlyphs encoded per second
| Amount of data stored | 733MHz PIII | 2.8GHz PIV |
| 20 bytes | 6,450 | 23,500 |
| 56 bytes | 4,500 | 16,750 |
Benchmark #3: Encode a font-based DataGlyph
repeatedly from a custom application under Linux.
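The effect of process creation overhead can be estimated by comparing Benchmark #1 with Benchmark #3 for a 20 byte, font-based DataGlyph on the 733 MHz Pentium III. The Python sketch below does that arithmetic. It assumes the two informal benchmarks are otherwise comparable, which the figures suggest but do not guarantee.

```python
def ms_per_encode(encodes_per_second):
    """Convert a throughput figure into milliseconds per encode."""
    return 1000.0 / encodes_per_second

# Benchmark #3: 6,450 in-process encodes per second (20 bytes, 733 MHz PIII).
in_process_ms = ms_per_encode(6450)

# Benchmark #1: 5 ms per encode when dgencode runs as a new process each time.
per_process_ms = 5.0

# Share of the per-process time that is not spent encoding.
overhead_share = (per_process_ms - in_process_ms) / per_process_ms

print(f"{in_process_ms:.3f} ms per in-process encode")
print(f"{overhead_share:.0%} of the per-process time is overhead")
```

Under these assumptions, nearly all of the 5 ms is spent starting the process rather than encoding, which is why calling the library directly is much faster for small DataGlyphs.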
Q. What is the largest
DataGlyph I can make?
A. 769x769 glyph marks. This is easily large enough to cover an entire
sheet of paper using typical printing and imaging resolutions. Such DataGlyphs
can typically store 50 kilobytes.
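A rough feel for these numbers can be had from the Python sketch below. It computes the printed side length of a 769x769 DataGlyph for a given cell size and printer resolution, plus a raw upper bound on capacity. The one-bit-per-mark figure is an assumption made for illustration, as are the function names; the usable capacity quoted above is lower because error correction and other overhead take their share.

```python
MAX_MARKS = 769  # maximum glyph marks per side, from the answer above

def side_inches(cell_px, dpi, marks=MAX_MARKS):
    """Printed side length in inches for square cells of cell_px printer dots."""
    return marks * cell_px / dpi

def raw_capacity_bytes(marks=MAX_MARKS):
    """Upper bound on capacity, assuming one bit per glyph mark (an assumption)."""
    return marks * marks // 8

print(side_inches(6, 600))   # side length at 600 dpi with 6-dot cells
print(side_inches(7, 300))   # side length at 300 dpi with 7-dot cells
print(raw_capacity_bytes())  # raw bytes before error correction and overhead
```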
Q. Can DataGlyphs
be read upside down?
A. Yes. DataGlyphs can be read from any orientation, including upside
down, sideways, and tilted. See the -rotated command line option
if you are experimenting with the dgdecode command line utility.

Q. What is the failure
mode for decoding?
A. All or nothing. DataGlyphs are designed to correctly decode the embedded
information or, in the face of unrecoverable damage, report a decode failure.
DataGlyphs will not return partial information or incorrect information. This
behaviour is enforced by integrity-checking mathematics utilized by the symbology.
Q. What is the difference between tiling and high ECC?
A.
Tiling is good when you only image a small portion of a dataglyph. For
example, let's say you have a dataglyph tiled all over the surface of a
shipping crate. If someone has a handheld dataglyph decoder
(imagine something similar to a traditional 1D or 2D barcode
scanner) they can point the handheld reader at any unobscured
place on the surface and get a read -- no precision aiming
required. If the entire dataglyph is being imaged, and the goal is
high decodability despite significant damage, real estate is usually
better spent using a high ECC (error correction level) rather than
tiling. Our error correction codes are good at handling arbitrary
burst errors (as opposed to tile shaped burst errors!) and thus make
for more flexible protection against arbitrary damage.

Q. How do DataGlyphs resist noise?
A.
DataGlyphs are particularly good at resisting salt and pepper noise,
which is a very common type of damage. Salt and pepper noise is
characterized by "flipped pixels" and can be caused by imperfections
in the printing process, imaging device, or environmental wear and
tear. Many 2D barcodes are composed of arrays of on-or-off dots, which
can be hard to distinguish from similar-looking noise patterns.
DataGlyphs transmit information through angles, which are relatively
easy to distinguish even in the presence of common types of noise. For
example, our decoding software (and hopefully your eyes as well) can
distinguish the angle of all but 8 of the data bearing glyphmarks in
this image. The small number of
missing or erroneously read glyphmarks are easily compensated for with
error correction coding. In many respects, this is analogous to the
situation with AM and FM radio. In AM radio, both the noise and the
signal are modulations in the amplitude. FM radio (like DataGlyphs) is
more robust in part because the signal is transmitted through phase or
angle changes, which are more easily distinguishable from common noise
sources.

Q. Can I add application specific labels to DataGlyphs?
A.
Yes. This feature is automatically available. Each DataGlyph contains
a short fingerprint derived from the license key used to create it. This
fingerprint information is programmatically accessible by DataGlyph
decoders. This allows a DataGlyph created for application A to be
automatically distinguished from a DataGlyph created for application B.

Q. How can I improve the number of intensity levels in glyphtones?
A.
Advanced users can incorporate error diffusion techniques with glyphtones. This
can improve the number of intensity levels for glyphtones with a small cell size.

Q. Are there many varieties of glyphtones?
A. Yes, glyphtones come in several variations. By default, the DataGlyph Toolkit modulates individual glyph marks as
halftone marks to reproduce intensity changes. However, one can also
"overlay" glyph marks over an image. For a color image, one may use either technique on all channels or just one channel (typically the blue/yellow channel).
This document
demonstrates the upper limits of the "halftoning" flavor of the color glyphtone technology.
And this document demonstrates the "overlay" flavor.

Q. How do I license the DataGlyph Toolkit?
A. Start by filling out the business survey. This
provides contact information and a starting point to discuss pricing, evaluation licenses, and everything
else required to move forwards from a business perspective.

Q. How do application developers typically use the toolkit?
A. A typical scenario starts with the application developer
experimenting with the command line executables included with the
toolkit and perusing the included documentation. The developer then
examines the source code to those examples and trims the example code down
to the exact functionality required for the application. The result is
then embedded inside the application.

Q. Do DataGlyphs
require a specialized printer or imager?
A. No.
Q. Can I read a
DataGlyph with a video camera or webcam?
A. Yes. A webcam is used for the Glyphsaw
Puzzle. See also this camera example
and this technical paper describing
advanced camera techniques.

Q. Can I read a
DataGlyph with a multifunction device?
A. Yes. For example, in
GlyphChess the raw image is scanned by a Xerox
multifunction device, then transmitted over a computer network for processing. Multifunction
devices are typically used for multi-page, document oriented applications.

Q. Does
the printing surface matter?
A. Yes, and you should
test this yourself. While most dataglyphs are
printed with ink and toner on paper, many other printing techniques and
printing surfaces can also be used. When
using a novel substrate or printing technique, do a system level
test which includes the entire printing and imaging cycle. In
general, if the marks look sane after imaging (i.e. clearly
distinguishable as forward and backward slashes by a human) there is
an excellent chance our decoding software can handle it. However, the
majority of dataglyph testing has been with ink printing on paper,
and imaging with a flatbed scanner. If your images look wildly different from what
we traditionally work with, then custom image processing (either inside
or before decoding) might be necessary for best results.
Q. Can I print a DataGlyph
using a font?
A. Yes. The DataGlyph Toolkit can produce character output meant to be
rendered using a DataGlyph font. See the "-format" command line argument of
the dgencode sample program. Fonts for a variety of printer types are
bundled with the toolkit.
Q. Can I print a DataGlyph
using a bitmap image?
A. Yes.
Q. What resolutions
are required?
A. After a complete printing and imaging cycle, glyph marks must have
a minimum cell size of 5x5 pixels. For best results, we suggest a 7x7 pixel
cell size or higher.
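To estimate the cell size after a complete printing and imaging cycle, scale the printed cell size by the ratio of imaging resolution to printing resolution. The Python sketch below applies that scaling and compares the result against the 5 pixel minimum and 7 pixel suggestion given above. The function names and the three labels are illustrative, and the sketch ignores blur, dot gain, and other real-world effects, so it is no substitute for testing with actual equipment.

```python
def imaged_cell_size(print_cell_px, print_dpi, scan_dpi):
    """Cell size in imaging pixels after printing and then scanning."""
    return print_cell_px * scan_dpi / print_dpi

def check_cell(print_cell_px, print_dpi, scan_dpi, minimum=5.0, recommended=7.0):
    """Classify the imaged cell size against the FAQ's guidance."""
    size = imaged_cell_size(print_cell_px, print_dpi, scan_dpi)
    if size < minimum:
        return "too small"
    if size < recommended:
        return "marginal"
    return "ok"

# A cell of 14 printer dots at 600 dpi, scanned at 300 dpi.
print(imaged_cell_size(14, 600, 300), check_cell(14, 600, 300))
# A cell of 10 printer dots at 600 dpi, scanned at 300 dpi.
print(imaged_cell_size(10, 600, 300), check_cell(10, 600, 300))
```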
Q. What print quality is
required?
A. It is difficult to concisely quantify the exact interaction between
error correction level, failure rates, and printing and imaging quality. Typically
we use large regression suites of real DataGlyphs to test certain parameters
and printing/imaging conditions. Qualitatively, after a complete print and imaging
cycle, most of the glyph marks must be visually distinguishable from each other.
Here is an example of acceptable marks and
here is an example of unacceptable marks.
Q. Do you have a
print testpage?
A. Yes. This PostScript document is designed for
a good quality 600dpi printer. The dataglyphs contained within
should come out as a smooth grey. If you get a striped pattern
(aliasing) in the dataglyph, then you likely have a resampling
operation somewhere in your print process that needs to be
addressed. This PDF document
demonstrates the upper limits of color glyphtone technology. These
test images require good quality scanning and printing equipment.

Q. Do Xerox printers
or VIPP have built-in support for DataGlyphs?
A. The DocuPrint NPS family of Xerox printers optionally supports an embedded
DataGlyph encoder. This eliminates the need for inserting a DataGlyph encoder
somewhere else in the printing pipeline. The DataGlyph can be activated either
via a custom PostScript operator, or through VIPP (Variable-data Intelligent
PostScript PrintWare). VIPP users (you'll know it if you are one) should refer
to the SHGLYPH operator on page 5-141 of the 2001 VIPP Reference Manual. These
interfaces expose a subset of parameters available in the DataGlyph toolkit.
Q. Do end users
need to obtain a toolkit or license key?
A. No. Only the developer of a DataGlyph-enabled product or service needs
the toolkit. For example, assume a developer is making a fax machine that is
capable of inserting and reading DataGlyphs on fax cover pages. The fax machine
developer needs the toolkit and a license key to activate it. The toolkit libraries
are incorporated (linked) into the fax machine software and are distributed
as part of the fax machine. The developer embeds their activating license key
into the fax machine software as well. The resulting fax machine is sold to
end users, who in general don't have to care about the inner workings of the
product. They certainly don't need to know about DataGlyph toolkits or obtain
one. This situation holds true for any sort of DataGlyph enabled product or
service, including pure software products.
Q. Does the toolkit
come with source code?
A. No. We do include source code for a few command line sample applications
to help application developers understand the library calls. However, the toolkit
libraries themselves are precompiled and are not shipped with source code.

Q. Do application developers
need libtiff?
A. The toolkit as shipped supports TIFF images on all platforms. Win32 developers
do not need libtiff. All other application
developers will need libtiff
installed at link time, even if TIFF functionality is not used. Libtiff may
be obtained from libtiff.org, and in some
cases may already be included with your operating system. The toolkit has been
successfully linked with libtiff versions 3.4 and higher
and is typically linked with the latest stable release.
Q. Can the toolkit
decode from multi-image TIFF?
A. No. Split your multi-image TIFF into single-image TIFF files before
decoding. Also, see the libtiff question above.
Q. How can I locate
a DataGlyph?
A. By default, the toolkit decoding routines are designed to look for a single
DataGlyph roughly in or near the center of the image or specified region. Locating
DataGlyphs in a large image can be very domain specific. For example, images from
a flatbed scanner might require a very different locator algorithm than a camera
image. Locator algorithms can vary widely due to particular imaging equipment,
imaging equipment settings, and application specific constraints. We provide an example
for finding a DataGlyph in a large scanned image -- see the "dgdetect" program included
with the toolkit. Application developers can adapt that example to their needs, or
utilize their own domain specific locator algorithms.

Q. Does the toolkit
come with dynamic or static libraries?
A. For win32, both static (.lib) and dynamic (.dll) libraries are included.
For most, but not all, Unix platforms we supply dynamic (.so) as well as static (.a) libraries.
Q. Is the toolkit
threadsafe?
A. Yes.
Q. Can the toolkit
be ported to additional platforms?
A. Yes, with a custom (business) arrangement. The toolkit is written
in clean, well documented ANSI and POSIX compliant C, and thus is technically
capable of being ported to a large number of platforms. The toolkit is known to be
portable to platforms as diverse as OS/390, MacOS X, AIX, and many others
including embedded platforms.
Q. Is the evaluation
toolkit full-featured?
A. Yes.
Q. How are program errors reported?
A. In the included sample applications such as dgencode and dgdecode,
only very minimal error reporting information is returned to the user. Those
applications serve as calling examples, so they are designed to be short
and simple. To get more detailed information about an error condition, try running
with the logging command line option enabled, or trace the sample application with a debugger.
For application programmers, all toolkit library functions have a return value that may
be checked for error conditions. See the DgErrors.h include file for details.

Q. Is there additional toolkit programming documentation?
A. All official toolkit documentation is shipped with the toolkit itself,
in the directory labeled doc. The DataGlyph website provides some
supplementary information, mostly about the general technology. Other than that,
no additional toolkit documentation is available.
Q. What languages
can I call the toolkit from?
A. Any computer language that can call a C library can call the toolkit.
DataGlyph applications have been written in a variety of languages, including
but not limited to C, C++, Python, Java, Visual Basic, PHP, and COBOL. In some cases,
this required a thin language specific wrapper around the raw C library calls.
Any such wrappers are the responsibility of the application developer; none
are shipped with the toolkit itself.
Q. What are recommended
parameters for error correction?
A. Error correction allows a DataGlyph to be decoded even if some of
the glyph marks are damaged or destroyed. Typically we devote 10% to 25% of
the data capacity to error correction. This range is a reasonable sweet spot
in providing good protection against damage while still allowing good data density.
For example, using the dgencode sample executable, the following command illustrates
a 25% error correction level (64 is about 25% of 255).
dgencode -id <licensekey> -word 255 -ecc 64 -format bmp -out
Note that the calculation is different for very short messages.
If the entire message is smaller than a single codeword size, then the error
correction level effectively increases. For example, if the command above was
used to encode a 20-byte message, we would have 64 bytes of correction for
the 20 bytes of data; about 75% of the bytes would be devoted to error correction,
making for a robust DataGlyph with low data density.
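The arithmetic above can be checked with a short Python sketch. It assumes, as the example implies, that each codeword of "-word" bytes reserves "-ecc" bytes for correction, and that a message shorter than one codeword still carries the full "-ecc" bytes. The function name is illustrative and is not part of the toolkit.

```python
import math

def ecc_fraction(message_bytes, word=255, ecc=64):
    """Approximate fraction of encoded bytes devoted to error correction.

    Assumption (not taken from the toolkit documentation): each codeword
    holds `word` bytes, `ecc` of which are correction bytes, and a short
    final codeword still carries all `ecc` correction bytes.
    """
    data_per_word = word - ecc
    codewords = max(1, math.ceil(message_bytes / data_per_word))
    ecc_bytes = codewords * ecc
    return ecc_bytes / (message_bytes + ecc_bytes)

print(round(ecc_fraction(191), 2))  # one full codeword: 64 of 255 bytes
print(round(ecc_fraction(20), 2))   # short message: 64 of 84 bytes
```

The second figure corresponds to the 20-byte example above: 64 correction bytes out of 84 total.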
Q. What are recommended
parameters for glyph cell size?
A. When you are deciding how large to make the glyph marks, you need
to take into account the entire system, including both the printing and imaging
process, as mentioned above. We highly recommend experimenting with your actual
printing and imaging equipment, as their characteristics can vary. For example,
some printers print wider glyph marks than others, and the sensitivity of imaging
equipment can vary greatly.
Note that the different output methods provide different
levels of control over glyph mark size. For example, when generating a bitmap
with the sample encoder program, one may specify the cellsize, graylevel,
and various glyphtone parameters. Whereas in a purely font based DataGlyph, the
graylevel and visual characteristics of the glyph marks are completely
determined by the font and printing
process. Thus at generation time, there is no need or benefit for specifying such parameters,
as they will be ignored. Examples follow.
dgencode -id <licensekey> -format bmp -cell 7 -gray 0.5 -out
dgencode -id <licensekey> -format font -out
Q. What are recommended
parameters for faxes?
A. Some applications require a DataGlyph to survive two cycles through
a fax machine. That can introduce a lot of degradation, but several strategies
can help. First, try to make the marks themselves look as good as possible after
transmission. If possible, use an electronic source for the first transmission,
and match the DataGlyph image resolution against the fax transmission resolution.
Also, make the glyphmarks large to overcome distortion by poor quality fax printers
and to avoid descreening algorithms from fax machines that try to be too clever.
For best results, meet or exceed a 7x7 pixel default cell size in the final
image. Second, make sure you use a generous quantity of ECC, so that the DataGlyph
can decode even if many marks are destroyed. A 25% ECC is very generous for
most applications, but can be raised for extremely noisy conditions. Both cellsize
and ECC parameters trade off reliability against data density, so
find a balance that works for your application. (As with all applications, test,
test, test!) Finally, be aware that standard mode fax transmission can potentially
deliver DataGlyphs with non-symmetric resolutions like 200x100 dpi. Definitely
make sure resolution parameters of the image are correct before requesting a
decode from the toolkit.
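The effect of a non-symmetric resolution on glyph cells can be seen with a little arithmetic, sketched below in Python. A cell that spans 7 pixels horizontally at 200 dpi spans only half as many pixels vertically at 100 dpi. The function name is illustrative, and whether a given cell shape decodes reliably is something only testing can confirm.

```python
def cell_size_per_axis(cell_px_x, x_dpi, y_dpi):
    """Return (width, height) in pixels of a physically square glyph cell.

    cell_px_x is the cell width in pixels along the x axis; the height
    follows from the ratio of the vertical to horizontal resolution.
    """
    return cell_px_x, cell_px_x * y_dpi / x_dpi

# Standard mode fax example from the text: 200x100 dpi.
width, height = cell_size_per_axis(7, 200, 100)
print(width, height)

# The same cell at a symmetric 200x200 dpi resolution.
print(cell_size_per_axis(7, 200, 200))
```

In the 200x100 dpi case the vertical extent falls below the 5 pixel minimum mentioned earlier, which is one more reason to use large glyphmarks for fax applications.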
Q. How much text
can be overlaid on a DataGlyph?
A. DataGlyphs contain error correction that can allow a successful decode
despite damage. One popular type of "damage" is overlaid text or images. To
the first order, the amount of text we can overlay is directly proportional
to the error correction level. Here are some examples showing text overlay,
which were produced by adding more and more text until we had just about exhausted
the ability of error correction. Hence these are maximum scenarios that assume
a clean printing and imaging process. If you expect additional damage (from
dirt, scan noise, or other sources) you should increase the error correction
beyond what is illustrated in these examples.
Q. How can I
try out different parameters?
A. Use this DataGlyph Capacity spreadsheet
to experiment with various parameters. You can also directly experiment by creating
DataGlyphs using the toolkit.