Article Title	 	: Boot Monitor CRC Checking
Creation Date		: 11/09/93
Message ID		: EGRNWQ
Last Update		: 
Expiration Rules	: unknown
Location		: NCD-Articles/NCDware
=================================================================

NCD X Terminal Server Image CRC Checking
========================================

Description
-----------
The boot monitor performs CRC checking mainly to catch cases
where a user has attempted to "patch" an NCD server image and
may thereby have damaged the server.  CRC checking is also a
useful way to discover a valid server image that has been
corrupted on disk, or while being transferred from tape to disk
during an installation or backup restoration.  Such cases are
quite rare, because I/O performed under operating system control
is generally reliable.

When the NCD server image is built at the factory, a CRC
checksum is calculated and then inserted in a data structure in
the first "section" of the server image.

Later, when the boot monitor boots the server image, it reads
the stored checksum and saves it.  Then, as blocks of the server
image are loaded from the source (e.g. network, PROM, or flash
memory) into RAM, the boot monitor updates a running CRC
checksum until the entire server is loaded.  The resulting
checksum is compared to the one stored in the first "section" of
the server.  If they match, the boot monitor assumes that a
valid server image has been loaded and transfers control of the
terminal to the server.

If the CRC checksums do not match, the boot monitor reports the
failure with the error message "File Corrupted CRC Error".
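
The build-time and boot-time steps described above can be sketched
in Python.  This is only an illustration under stated assumptions:
the article does not give the CRC polynomial or the layout of the
first "section", so the sketch assumes the standard CRC-32 from
zlib and a hypothetical 4-byte big-endian header holding the
checksum of everything that follows it.  The names build_image,
load_image, and split_blocks are invented for this example.

```python
import struct
import zlib

# Assumption: the stored checksum is one big-endian 32-bit field
# at the very start of the image.  The real NCD layout is unknown.
HEADER_FMT = ">I"
HEADER_LEN = struct.calcsize(HEADER_FMT)


def build_image(payload: bytes) -> bytes:
    """Factory step: compute the CRC and store it ahead of the payload."""
    return struct.pack(HEADER_FMT, zlib.crc32(payload)) + payload


def split_blocks(data: bytes, size: int = 512) -> list:
    """Cut an image into fixed-size blocks, as a loader might receive it."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def load_image(blocks) -> bytes:
    """Boot step: read the stored CRC, then update a running CRC per block."""
    pending = bytearray()   # bytes seen before the header is complete
    stored = None           # checksum read from the header
    running = 0             # running CRC over the payload so far
    payload = bytearray()
    for block in blocks:
        if stored is None:
            pending += block
            if len(pending) < HEADER_LEN:
                continue
            (stored,) = struct.unpack_from(HEADER_FMT, pending)
            block = bytes(pending[HEADER_LEN:])
        # zlib.crc32 accepts the previous value, so the checksum can be
        # extended block by block without holding the whole image first.
        running = zlib.crc32(block, running)
        payload += block
    if stored is None or running != stored:
        raise ValueError("File Corrupted CRC Error")
    return bytes(payload)
```

A loader built this way accepts an untouched image and rejects one
in which even a single bit of the payload has changed, which is the
behaviour that makes "patched" images fail to boot.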

Intermittent versus Consistent CRC Errors
-----------------------------------------
A consistent CRC error, occurring on all NCD terminals which
boot from a host, is nearly always an indication that the
server image file on disk is corrupt.

A consistent CRC error occurring on only one NCD terminal indicates
that the network, the host, and the server image are not at fault.
The problem lies in either the terminal's base or its Network
Interface Card (NIC), probably the latter.  Swapping the NIC into
another terminal should cause the symptom to move with it.

An intermittent CRC error is much harder to debug.  If a CRC
error occurs intermittently on only one terminal among a group
booting from the same host, this indicates that the RAM on that
terminal is failing.

If CRC errors occur intermittently on several terminals booting
from the same host, the server image data may be being corrupted
either while it is copied from disk to memory or while it is
copied from memory to the Ethernet device.  This is extremely
unlikely, since the host operating system nearly always flags
such errors, but low-level booting protocols such as MOP might
not handle them correctly.

It is generally assumed, since Ethernet itself provides CRC
checking of each packet, that data on the Ethernet is reliable.
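
The four cases in this section can be summarized as a small lookup
table.  The Python sketch below only restates the article's
reasoning; the function name diagnose and the scope labels "one",
"several", and "all" are invented for this example.

```python
# (error is consistent?, which terminals) -> likely cause,
# paraphrased from the cases described in this section.
LIKELY_CAUSE = {
    (True, "all"): "server image file on the host's disk is corrupt",
    (True, "one"): "that terminal's base or NIC (probably the NIC)",
    (False, "one"): "failing RAM in that terminal",
    (False, "several"): "host corrupting data between disk, memory, "
                        "and the Ethernet device (extremely unlikely)",
}


def diagnose(consistent: bool, scope: str) -> str:
    """Return the likely cause for an observed pattern of CRC errors."""
    try:
        return LIKELY_CAUSE[(consistent, scope)]
    except KeyError:
        # Combinations the article does not discuss are not guessed at.
        raise ValueError("pattern not covered by this article") from None
```

For example, diagnose(True, "one") points at the terminal's base or
NIC, while diagnose(False, "one") points at its RAM.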

