
Document Conversion Issues

PCQ Bureau

Depending on the type of document, you may need to convert it from one form to another (paper to digital, for instance) and from one file format to another. Over the life cycle of a document, a document management system is concerned not just with paper and digital forms in general, but also with the specific format requirements of each application that handles the document. For example, when you have something on paper and want to work on it on your computer, you need to scan it, run OCR on it, correct the OCR output and then use the resulting file in a word processor or image manipulation software. Or you could put the material into a PDF file and send it to someone. At each stage, the document's format changes. An intelligent document management system will manage all these stages.
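The paper-to-word-processor example can be pictured as a chain of steps, each taking the document in one form and handing it on in another. The following is only a minimal sketch of that idea: the step names, the form labels and the `trace_forms` helper are assumptions made for illustration and do not come from any particular product.

```python
# Each step is (step name, form it expects, form it produces).
# Names and labels are illustrative assumptions.
PIPELINE = [
    ("scan", "paper", "image"),
    ("ocr", "image", "raw text"),
    ("correct", "raw text", "clean text"),
    ("import", "clean text", "word processor file"),
]


def trace_forms(pipeline, start):
    """Return the list of forms a document passes through, starting at `start`.

    Raises ValueError if a step receives a form it does not expect.
    """
    forms = [start]
    for name, expected, produced in pipeline:
        if forms[-1] != expected:
            raise ValueError(
                f"step {name!r} expects {expected!r}, got {forms[-1]!r}"
            )
        forms.append(produced)
    return forms


if __name__ == "__main__":
    print(" -> ".join(trace_forms(PIPELINE, "paper")))
```

Writing the chain down this way makes the point of the paragraph concrete: a step cannot be skipped, because each one depends on the form produced by the step before it.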


Scanning



Everyone knows that when you scan a piece of paper, you are essentially digitizing it. However, the process does not end there. If the scan or the input document is not of sufficient quality, the result will contain errors, and these need to be fixed. We shall cover the issues relating to this correction in a later segment, but suffice it to say that simply saving the scanned image as a picture file will not do. Modern productivity suites feature their own built-in capture tools that take the input from the scanner and convert it to their specific formats. So, if you are working in, say, Photoshop and want to use a scanned image, scan it in from within Photoshop and you will get a ready PSD file. Similarly, you can scan directly to word processor and PDF files.

Stage conversion



As a digital document flows through your process, its format changes and new capabilities are expected of it. For example, this document is being composed in a word processor. Before it gets printed, its workflow demands a conversion to a DTP software format and then into photographic plates for printing. A document management system would keep track of these requirements and handle each step appropriately. For example, when this document is uploaded into such a system, it is a word processor file. But when the next-stage worker accesses it, they would get it already converted to a publishing software format (for simplicity, let's say a PDF file). Because some formats are proprietary, it would be difficult for every solution to support every format, but most would provide the commonly expected filters (like DOC, PDF, text and HTML).
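One way to picture this stage-based behavior is a lookup of the format each stage expects, combined with a registry of conversion filters. The sketch below is a hypothetical illustration, not the design of any real document management system: the stage names, the `Document` class, the filter functions and `fetch_for_stage` are all assumptions, and the "conversions" are placeholders that only relabel or lightly wrap the content.

```python
from dataclasses import dataclass

# Hypothetical stages and the format each one expects (assumed for illustration).
STAGE_FORMATS = {
    "authoring": "doc",
    "layout": "pdf",
    "web": "html",
}


@dataclass(frozen=True)
class Document:
    name: str
    fmt: str
    content: str


def doc_to_pdf(doc):
    # Placeholder: a real filter would invoke an actual converter here.
    return Document(doc.name, "pdf", doc.content)


def doc_to_html(doc):
    # Placeholder: wraps the text in a paragraph tag to stand in for real HTML export.
    return Document(doc.name, "html", "<p>" + doc.content + "</p>")


# Available filters, keyed by (source format, target format).
FILTERS = {
    ("doc", "pdf"): doc_to_pdf,
    ("doc", "html"): doc_to_html,
}


def fetch_for_stage(doc, stage):
    """Return the document in the format the given stage expects."""
    target = STAGE_FORMATS[stage]
    if doc.fmt == target:
        return doc  # already in the right format, no conversion needed
    convert = FILTERS.get((doc.fmt, target))
    if convert is None:
        # Mirrors the point about unsupported or proprietary formats.
        raise ValueError(f"no filter from {doc.fmt!r} to {target!r}")
    return convert(doc)
```

With this layout, the author uploads a `doc` file once, and a call such as `fetch_for_stage(article, "layout")` hands the next-stage worker a `pdf` version. A missing entry in `FILTERS` corresponds to a format the system has no filter for.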

A 'document' need not be just something printed or written on a piece of paper. It can be on cloth, on a piece of stone or on a leaf. For example, the Archaeological Survey of India (ASI) maintains digital copies of ancient scripts and notations, which are often found on rocks, leaves and cloth. Converting these from one form to another is yet another challenge. Similarly, the medical industry finds its source documents in the form of human body parts, body fluids (blood, urine and stool), photographic images (x-ray and MRI scans) and electrical signals (ECG and EEG). Some of these are readily digitized. For example, a DNA analysis computer provides the pattern as a digital readout, which is usually transcribed onto paper. ECG and EEG signals are similarly graphed on paper. X-ray and MRI scans are essentially digital images. These can be captured directly in their raw digital forms and filed away or processed easily.
