
DataGlyphs® Frequently Asked Questions (FAQ)

Time-stamp: "Mon 02/27/06 10:09:37"

Definitions

Conceptual

Integration

Programming

Parameters


Q. What are DataGlyphs?
A. DataGlyphs are a technology for embedding digital data in printed documents by printing robust and aesthetically pleasing symbol patterns ("glyphs").

Q. What is DG500?
A. DG500 is a proprietary technical specification for DataGlyphs.

Q. What are glyphtones?
A. DataGlyphs can be aesthetically modulated to resemble a pictorial image. The technique is called a pictorial glyphtone.

Q. What are serpentones?
A. Serpentones are an alternative markshape for DataGlyphs. Compare these glyphtone and serpentone images. This document demonstrates the upper limits of color serpentone technology alongside other glyphtone technologies.

Q. What is partial dotting?
A. To produce a pictorial glyphtone, a graphic image is required as input. When partial dotting is turned off, each pixel in the input image aesthetically represents a cell in the generated DataGlyph. When partial dotting is turned on, each pixel in the input image aesthetically represents a pixel in the generated DataGlyph. Typically people use larger input images with partial dotting, and smaller input images without it. Glyphtones with partial dotting can look slightly better than their counterparts without partial dotting. This comes at a slight cost in decoding robustness.

Q. What is a Yellow DataGlyph?
A. Yellow DataGlyphs are a nearly invisible way to embed data in a printed document. Instead of using traditional techniques such as ultraviolet or infrared ink, this DataGlyph variant uses yellow glyphmarks that are small and widely spread out. The resulting DataGlyph is so unobtrusive that most people won't notice it at all. A color scanner is able to see the Yellow DataGlyph in the blue channel and decode the embedded information. The tradeoff for near-invisibility is lowered data density, so Yellow DataGlyphs are typically used to store smaller quantities of information.

Q. What are address carpets?
A. Address carpets are a special type of DataGlyph that provides pure positional information and does not carry message data at all. Even with a small capture window, one can determine exactly which portion of the DataGlyph is imaged.

Q. What is tiling?
A. Tiling is a technique for repeating a DataGlyph over a large surface area. Any small portion of the surface may be imaged to extract the data.
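
As a toy illustration of why any small portion of a tiled surface is enough, the sketch below repeats a small grid of labelled cells and checks that every tile-sized window contains each cell exactly once (only cyclically shifted). This is not the toolkit's algorithm; the grid model and the names tile_surface and window_cells are assumptions made for this example.

```python
def tile_surface(tile, rows, cols):
    """Repeat a 2-D tile (a list of rows) to fill a rows x cols surface."""
    h, w = len(tile), len(tile[0])
    return [[tile[r % h][c % w] for c in range(cols)] for r in range(rows)]


def window_cells(surface, top, left, h, w):
    """Return the cells seen by an h x w capture window at (top, left)."""
    return [surface[r][c]
            for r in range(top, top + h)
            for c in range(left, left + w)]


# A hypothetical 2 x 3 tile whose cells carry unique labels 0..5.
tile = [[0, 1, 2],
        [3, 4, 5]]
surface = tile_surface(tile, rows=8, cols=9)

# Slide a tile-sized window over every position that fits on the surface.
every_window_complete = all(
    sorted(window_cells(surface, top, left, 2, 3)) == [0, 1, 2, 3, 4, 5]
    for top in range(8 - 2 + 1)
    for left in range(9 - 3 + 1)
)
print(every_window_complete)
```

In this model the window never needs to be aligned with a tile boundary, which matches the "no precision aiming" behavior described for tiling elsewhere in this FAQ.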

Q. What is an MVP score?
A. The MVP score is a quality metric for DataGlyph decoding. It helps answer questions like "Which of these two DataGlyphs is more decodable?" and is useful to people who are tuning equipment or software settings. The highest possible MVP score is 1.0, and anything below 0.5 will not decode at all.

Q. How do I create a DataGlyph?
A. DataGlyphs are created with software contained in the DataGlyph Toolkit. The input is typically the digital information being stored, along with parameters that describe how that information is stored. The output is a DataGlyph, which is typically printed on a substrate such as paper. This process is called encoding.

Q. How do I read a DataGlyph?
A. DataGlyphs are read with software. First an image of a DataGlyph is captured with a scanner, camera, or other 2-D imaging device. That image is given to software contained in the DataGlyph Toolkit. The software extracts the information contained in the DataGlyph. This process is called decoding.

Q. Can I read a DataGlyph with a simple 1-D barcode scanner?
A. No. Many 1-D barcode scanners, such as those found in supermarket checkout lines to read UPC symbols, are very simple devices consisting of a light source and a photodiode. They are not capable of recording a two-dimensional image, and are thus fundamentally incapable of decoding a DataGlyph.

Q. What type of data can I store?
A. DataGlyphs store arbitrary binary data. Whatever bits you put in are what you'll get back out.

Q. Are DataGlyphs a type of encryption?
A. No. However, DataGlyphs may store information that has been digitally signed or encrypted using standard techniques.

Q. Do DataGlyphs have built-in compression?
A. No. However, DataGlyphs may store information that has already been compressed using any technique.

Q. Can DataGlyphs be copied?
A. Yes, although some photocopiers will not reproduce very fine DataGlyph patterns.

Q. How fast is the codec?
A. Fast. Processing times are generally proportional to the area of the DataGlyph being processed. The following benchmarks are informal and apply to black and white DataGlyphs. Note that working with grayscale or color images requires significantly more computer memory and processing time than black and white images.

Amount of data stored    Encode time (bitmap)    Encode time (font)
20 bytes                 5 ms                    5 ms
1 kilobyte               13 ms                   9 ms
16 kilobytes             153 ms                  107 ms

Benchmark #1: Worst case scenario. Encode a DataGlyph on a 733 MHz Pentium III running Linux. These numbers were generated using the dgencode executable and include the overhead of the OS creating a new process to handle every encode operation. Process creation overhead dominates for the small DataGlyphs.

Average time per encode/decode cycle
Amount of data stored    733 MHz PIII    2.8 GHz PIV    1.8 GHz Dual Opteron
20 bytes                 5.6 ms          2.9 ms         1.5 ms
1 kilobyte               47 ms           19 ms          6.9 ms
16 kilobytes             750 ms          400 ms         93 ms

Benchmark #2: Encode/decode a DataGlyph image using the dgthreadtest benchmarking program under Linux, using one thread per processor. Approximately 90% of the cycle time is used for decoding, and 10% for encoding.

DataGlyphs encoded per second
Amount of data stored    733 MHz PIII    2.8 GHz PIV
20 bytes                 6,450           23,500
56 bytes                 4,500           16,750

Benchmark #3: Encode a font-based DataGlyph repeatedly from a custom application under Linux.

Q. What is the largest DataGlyph I can make?
A. 769x769 glyph marks. This is easily large enough to cover an entire sheet of paper using typical printing and imaging resolutions. Such DataGlyphs can typically store 50 kilobytes.
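
To see why 769 marks per side is enough for a full page, the short sketch below converts a mark count into a printed side length. The function name glyph_side_inches is made up for this example, and the 7-pixel cell and 600 dpi values are only example inputs borrowed from other answers in this FAQ, not limits of the toolkit.

```python
MAX_MARKS_PER_SIDE = 769  # largest DataGlyph dimension stated in this FAQ


def glyph_side_inches(marks_per_side, cell_pixels, dpi):
    """Printed side length, in inches, of a square DataGlyph.

    marks_per_side: glyph marks along one edge
    cell_pixels:    printed pixels (device dots) per glyph cell edge
    dpi:            printer resolution in dots per inch
    """
    return marks_per_side * cell_pixels / dpi


side = glyph_side_inches(MAX_MARKS_PER_SIDE, cell_pixels=7, dpi=600)
print(f"{side:.2f} in")  # 769 * 7 / 600
```

With these example values the result is 8.97 in, slightly wider than a US Letter sheet (8.5 in).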

Q. Can DataGlyphs be read upside down?
A. Yes. DataGlyphs can be read from any orientation, including upside down, sideways, and tilted. See the -rotated command line option if you are experimenting with the dgdecode command line utility.

Q. What is the failure mode for decoding?
A. All or nothing. DataGlyphs are designed to correctly decode the embedded information or, in the face of unrecoverable damage, report a decode failure. DataGlyphs will not return partial information or incorrect information. This behaviour is enforced by integrity-checking mathematics utilized by the symbology.

Q. What is the difference between tiling and high ECC?
A. Tiling is good when you only image a small portion of a DataGlyph. For example, let's say you have a DataGlyph tiled all over the surface of a shipping crate. If someone has a handheld DataGlyph decoder (imagine something similar to a traditional 1D or 2D barcode scanner) they can point the handheld reader at any unobscured place on the surface and get a read -- no precision aiming required. If the entire DataGlyph is being imaged, and the goal is high decodability despite significant damage, real estate is usually better spent on a high ECC (error correction level) rather than tiling. Our error correction codes are good at handling arbitrary burst errors (as opposed to tile shaped burst errors!) and thus make for more flexible protection against arbitrary damage.

Q. How do DataGlyphs resist noise?
A. DataGlyphs are particularly good at resisting salt and pepper noise, which is a very common type of damage. Salt and pepper noise is characterized by "flipped pixels" and can be caused by imperfections in the printing process, imaging device, or environmental wear and tear. Many 2D barcodes are composed of arrays of on-or-off dots, which can be hard to distinguish from similar-looking noise patterns. DataGlyphs transmit information through angles, which are relatively easy to distinguish even in the presence of common types of noise. For example, our decoding software (and hopefully your eyes as well) can distinguish the angle of all but 8 of the data bearing glyphmarks in this image. The small number of missing or erroneously read glyphmarks is easily compensated for with error correction coding. In many respects, this is analogous to the situation with AM and FM radio. In AM radio, both the noise and the signal are modulations in the amplitude. FM radio (like DataGlyphs) is more robust in part because the signal is transmitted through phase or angle changes, which are more easily distinguishable from common noise sources.

Q. Can I add application specific labels to DataGlyphs?
A. Yes. This feature is automatically available. Each DataGlyph contains a short fingerprint derived from the license key used to create it. This fingerprint information is programmatically accessible by DataGlyph decoders. This allows a DataGlyph created for application A to be automatically distinguished from a DataGlyph created for application B.

Q. How can I improve the number of intensity levels in glyphtones?
A. Advanced users can incorporate error diffusion techniques with glyphtones. This can improve the number of intensity levels for glyphtones with a small cell size.

Q. Are there many varieties of glyphtones?
A. Yes, glyphtones have several variations. By default, the DataGlyph Toolkit modulates individual glyph marks as halftone marks to reproduce intensity changes. However, one can also "overlay" glyph marks over an image. For a color image, one may use either technique on all channels or just one channel (typically the blue/yellow channel). This document demonstrates the upper limits of the "halftoning" flavor of the color glyphtone technology. And this document demonstrates the "overlay" flavor.

Q. How do I license the DataGlyph Toolkit?
A. Start by filling out the business survey. This provides contact information and a starting point to discuss pricing, evaluation licenses, and everything else required to move forward from a business perspective.

Q. How do application developers typically use the toolkit?
A. A typical scenario starts with the application developer experimenting with the command line executables included with the toolkit and perusing the included documentation. The developer then examines the source code to those examples and trims the example code down to the exact functionality required for the application. The result is then embedded inside the application.

Q. Do DataGlyphs require a specialized printer or imager?
A. No.

Q. Can I read a DataGlyph with a video camera or webcam?
A. Yes. A webcam is used for the Glyphsaw Puzzle. See also this camera example and this technical paper describing advanced camera techniques.

Q. Can I read a DataGlyph with a multifunction device?
A. Yes. For example, in GlyphChess the raw image is scanned by a Xerox multifunction device, then transmitted over a computer network for processing. Multifunction devices are typically used for multi-page, document-oriented applications.

Q. Does the printing surface matter?
A. Yes, and you should test this yourself. While most DataGlyphs are printed with ink and toner on paper, many other printing techniques and printing surfaces can also be used. When using a novel substrate or printing technique, do a system-level test that includes the entire printing and imaging cycle. In general, if the marks look sane after imaging (i.e. clearly distinguishable as forward and backward slashes by a human), there is an excellent chance our decoding software can handle them. However, the majority of DataGlyph testing has been with ink printing on paper and imaging with a flatbed scanner. If your images look wildly different from what we traditionally work with, then custom image processing (either inside or before decoding) might be necessary for best results.

Q. Can I print a DataGlyph using a font?
A. Yes. The DataGlyph Toolkit can produce character output meant to be rendered using a DataGlyph font. See the "-format" command line argument of the dgencode sample program. Fonts for a variety of printer types are bundled with the toolkit.

Q. Can I print a DataGlyph using a bitmap image?
A. Yes.

Q. What resolutions are required?
A. After a complete printing and imaging cycle, glyph marks must have a minimum cell size of 5x5 pixels. For best results, we suggest a 7x7 pixel cell size or higher.
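
The check below is a minimal sketch of how one might estimate the cell size that survives a print and scan cycle and compare it against the 5x5 minimum and 7x7 suggestion above. The function names and the simple dots-to-pixels model (no blur, no resampling artifacts) are assumptions for illustration; real equipment should still be tested as described elsewhere in this FAQ.

```python
def scanned_cell_pixels(cell_printer_dots, print_dpi, scan_dpi):
    """Estimate pixels per glyph cell edge in the scanned image.

    cell_printer_dots: printed cell edge, in printer dots
    print_dpi:         printer resolution (dots per inch)
    scan_dpi:          scanner resolution (pixels per inch)
    """
    return cell_printer_dots * scan_dpi / print_dpi


def rate_cell_size(pixels, minimum=5.0, recommended=7.0):
    """Classify a cell size against the FAQ's 5x5 minimum and 7x7 suggestion."""
    if pixels < minimum:
        return "too small"
    if pixels < recommended:
        return "acceptable"
    return "recommended"


# Example inputs only: cells printed at 600 dpi, scanned at 300 dpi.
for dots in (7, 10, 14):
    px = scanned_cell_pixels(dots, print_dpi=600, scan_dpi=300)
    print(dots, px, rate_cell_size(px))
```

The same function can be called once per axis when horizontal and vertical resolutions differ, as with standard mode fax images.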

Q. What print quality is required?
A. It is difficult to concisely quantify the exact interaction between error correction level, failure rates, and printing and imaging quality. Typically we use large regression suites of real DataGlyphs to test certain parameters and printing/imaging conditions. Qualitatively, after a complete print and imaging cycle, most of the glyph marks must be visually distinguishable from each other. Here is an example of acceptable marks and here is an example of unacceptable marks.

Q. Do you have a print testpage?
A. Yes. This PostScript document is designed for a good quality 600dpi printer. The DataGlyphs contained within should come out as a smooth grey. If you get a striped pattern (aliasing) in the DataGlyph, then you likely have a resampling operation somewhere in your print process that needs to be addressed. This PDF document demonstrates the upper limits of color glyphtone technology. These test images require good quality scanning and printing equipment.

Q. Do Xerox printers or VIPP have built-in support for DataGlyphs?
A. The DocuPrint NPS family of Xerox printers optionally supports an embedded DataGlyph encoder. This eliminates the need for inserting a DataGlyph encoder somewhere else in the printing pipeline. The DataGlyph can be activated either via a custom PostScript operator, or through VIPP (Variable-data Intelligent Postscript PrintWare). VIPP users (you'll know it if you are one) should refer to the SHGLYPH operator on page 5-141 of the 2001 VIPP Reference Manual. These interfaces expose a subset of the parameters available in the DataGlyph toolkit.

Q. Do end users need to obtain a toolkit or license key?
A. No. Only the developer of a DataGlyph-enabled product or service needs the toolkit. For example, assume a developer is making a fax machine that is capable of inserting and reading DataGlyphs on fax cover pages. The fax machine developer needs the toolkit and a license key to activate it. The toolkit libraries are incorporated (linked) into the fax machine software and are distributed as part of the fax machine. The developer embeds their activating license key into the fax machine software as well. The resulting fax machine is sold to end users, who in general don't have to care about the inner workings of the product. They certainly don't need to know about DataGlyph toolkits or obtain one. This situation holds true for any sort of DataGlyph-enabled product or service, including pure software products.

Q. Does the toolkit come with source code?
A. No. We do include source code for a few command line sample applications to help application developers understand the library calls. However, the toolkit libraries themselves are precompiled and are not shipped with source code.

Q. Do application developers need libtiff?
A. The toolkit as shipped supports TIFF images on all platforms. Win32 developers do not need libtiff. All other application developers will need libtiff installed at link time, even if TIFF functionality is not used. Libtiff may be obtained from libtiff.org, and in some cases may already be included with your operating system. The toolkit has been successfully linked with libtiff versions 3.4 and higher and is typically linked with the latest stable release.

Q. Can the toolkit decode from multi-image TIFF?
A. No. Split your multi-image TIFF into single-image TIFF files before decoding. Also, see the libtiff question above.

Q. How can I locate a DataGlyph?
A. By default, the toolkit decoding routines are designed to look for a single DataGlyph roughly in or near the center of the image or specified region. Locating DataGlyphs in a large image can be very domain specific. For example, images from a flatbed scanner might require a very different locator algorithm than a camera image. Locator algorithms can vary widely due to particular imaging equipment, imaging equipment settings, and application specific constraints. We provide an example for finding a DataGlyph in a large scanned image -- see the "dgdetect" program included with the toolkit. Application developers can adapt that example to their needs, or utilize their own domain specific locator algorithms.

Q. Does the toolkit come with dynamic or static libraries?
A. For Win32, both static (.lib) and dynamic (.dll) libraries are included. For most, but not all, Unix platforms we supply dynamic (.so) as well as static (.a) libraries.

Q. Is the toolkit threadsafe?
A. Yes.

Q. Can the toolkit be ported to additional platforms?
A. Yes, with a custom (business) arrangement. The toolkit is written in clean, well documented ANSI and POSIX compliant C, and thus is technically capable of being ported to a large number of platforms. The toolkit is known to be portable to platforms as diverse as OS/390, MacOS X, AIX, and many others, including embedded platforms.

Q. Is the evaluation toolkit full-featured?
A. Yes.

Q. How are program errors reported?
A. In the included sample applications such as dgencode and dgdecode, only very minimal error reporting information is returned to the user. Those applications serve as calling examples, so they are designed to be short and simple. To get more detailed information about an error condition, try running with the logging command line option enabled, or trace the sample application with a debugger. For application programmers, all toolkit library functions have a return value that may be checked for error conditions. See the DgErrors.h include file for details.

Q. Is there additional toolkit programming documentation?
A. All official toolkit documentation is shipped with the toolkit itself, in the directory labeled doc. The DataGlyph website provides some supplementary information, mostly about the general technology. Other than that, no additional toolkit documentation is available.

Q. What languages can I call the toolkit from?
A. Any computer language that can call a C library can call the toolkit. DataGlyph applications have been written in a variety of languages, including but not limited to C, C++, Python, Java, Visual Basic, PHP, and COBOL. In some cases, this required a thin language-specific wrapper around the raw C library calls. Any such wrappers are the responsibility of the application developer; none are shipped with the toolkit itself.

Q. What are recommended parameters for error correction?
A. Error correction allows a DataGlyph to be decoded even if some of the glyph marks are damaged or destroyed. Typically we devote 10% to 25% of the data capacity to error correction. This range is a reasonable sweet spot in providing good protection against damage while still allowing good data density. For example, using the dgencode sample executable, the following command illustrates a 25% error correction level (64 is about 25% of 255).

dgencode -id <licensekey> -word 255 -ecc 64 -format bmp -out

Note that the calculation is different for very short messages. If the entire message is smaller than a single codeword size, then the error correction level effectively increases. For example, if the command above were used to encode a 20-byte message, we would have 64 bytes of correction for the 20 bytes of data; about 75% of the bytes would be devoted to error correction, making for a robust DataGlyph with low data density.
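
The arithmetic behind these percentages can be sketched as follows. This is an illustrative model only, not the toolkit's code: it assumes each codeword holds up to (word - ecc) data bytes plus ecc correction bytes, and that a partly filled codeword still carries the full ecc bytes. The function name ecc_fraction is made up for this example; the default values mirror the -word 255 and -ecc 64 arguments shown above.

```python
def ecc_fraction(message_bytes, word=255, ecc=64):
    """Fraction of all stored bytes that are error correction bytes.

    Assumed model: every codeword, including a partly filled one,
    carries exactly `ecc` correction bytes.
    """
    if message_bytes <= 0:
        raise ValueError("message_bytes must be positive")
    if not 0 < ecc < word:
        raise ValueError("ecc must be greater than 0 and less than word")
    data_per_word = word - ecc                      # data bytes per full codeword
    codewords = -(-message_bytes // data_per_word)  # ceiling division
    total_ecc = codewords * ecc
    return total_ecc / (message_bytes + total_ecc)


print(f"{ecc_fraction(191):.1%}")  # one full codeword: 64 of 255 bytes
print(f"{ecc_fraction(20):.1%}")   # short message: 64 of 84 bytes
```

Under this model a full codeword gives 25.1% and the 20-byte message gives 76.2%, in line with the "about 25%" and "about 75%" figures quoted above.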

Q. What are recommended parameters for glyph cell size?
A. When you are deciding how large to make the glyph marks, you need to take into account the entire system, including both the printing and imaging process, as mentioned above. We highly recommend experimenting with your actual printing and imaging equipment, as their characteristics can vary. For example, some printers print wider glyph marks than others, and the sensitivity of imaging equipment can vary greatly.

Note that the different output methods provide different levels of control over glyph mark size. For example, when generating a bitmap with the sample encoder program, one may specify the cellsize, graylevel, and various glyphtone parameters. In a purely font based DataGlyph, by contrast, the graylevel and other visual characteristics of the glyph marks are completely determined by the font and printing process. Thus at generation time there is no need or benefit for specifying such parameters, as they will be ignored. Examples follow.

dgencode -id <licensekey> -format bmp -cell 7 -gray 0.5 -out

dgencode -id <licensekey> -format font -out

Q. What are recommended parameters for faxes?
A. Some applications require a DataGlyph to survive two cycles through a fax machine. That can introduce a lot of degradation, but several strategies can help. First, try to make the marks themselves look as good as possible after transmission. If possible, use an electronic source for the first transmission, and match the DataGlyph image resolution against the fax transmission resolution. Also, make the glyphmarks large to overcome distortion by poor quality fax printers and to avoid descreening algorithms from fax machines that try to be too clever. For best results, meet or exceed a 7x7 pixel default cell size in the final image. Second, make sure you use a generous quantity of ECC, so that the DataGlyph can decode even if many marks are destroyed. A 25% ECC is very generous for most applications, but can be raised for extremely noisy conditions. Both cellsize and ECC parameters trade reliability against data density, so find a balance that works for your application. (As with all applications, test, test, test!) Finally, be aware that standard mode fax transmission can potentially deliver DataGlyphs with non-symmetric resolutions like 200x100 dpi. Definitely make sure the resolution parameters of the image are correct before requesting a decode from the toolkit.

Q. How much text can be overlaid on a DataGlyph?
A. DataGlyphs contain error correction that can allow a successful decode despite damage. One popular type of "damage" is overlaid text or images. To the first order, the amount of text we can overlay is directly proportional to the error correction level. Here are some examples showing text overlay, which were produced by adding more and more text until we had just about exhausted the ability of error correction. Hence these are maximum scenarios that assume a clean printing and imaging process. If you expect additional damage (from dirt, scan noise, or other sources) you should increase the error correction beyond what is illustrated in these examples.

Examples of max text overlay
noise free            10% ECC    25% ECC    50% ECC
1 scan/print cycle    10% ECC    25% ECC    50% ECC

Q. How can I try out different parameters?
A. Use this DataGlyph Capacity spreadsheet to experiment with various parameters. You can also directly experiment by creating DataGlyphs using the toolkit.

 

DataGlyphs is a registered trademark of Palo Alto Research Center Incorporated

 

BUSINESS CONTACT
Hermann Calabria
Technology Commercialization Manager
650-812-4751
DEMONSTRATION

GlyphServer Demo allows you to create and decode your own DataGlyph blocks. (For demonstration purposes only.)


Copyright © 2002-2007 Palo Alto Research Center Incorporated. All Rights Reserved.
PARC, the PARC Logo, AspectJ, DataGlyph, Obje, Silx, StressedMetal, and ClawConnect
are trademarks or registered trademarks of Palo Alto Research Center Incorporated.