Checksums and their calculations in C

Cesky: , English:

Checksums

and their calculations in C

Checksums are an inevitable part of data transfer and device control. They are used to detect errors. An important characteristic of checksums is their resistance to multiple errors. The effect of one error on the resulting checksum should not offset the effect of another error. Here we will show some of the most commonly used checksums and their calculations, including a statistical comparison of success rates. Most of the functions here can be found in both fast and slow versions - with or without a table. Of particular note for use in the MCU is the CCITT XModem, which is available in both fast and tableless versions.

CRC-32

CRC-32 is used as one of the strongest checksum types, with high sensitivity to errors. The result is a 32-bit number. It uses the polynomial x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + x^0 to calculate. The default value is the number 0xFFFFFFFF. It is used e.g. in a PNG graphic file. In the fast table version it requires a pre-generated table of 1 KB. In the library source code you can also find a variant for the hardware CRC32 of the STM32 processor. Check pattern: "123456789" -> 0xCBF43926, 0xFC 0x05 0x4A -> 0xA8E10F6D. Calculation of 1 byte in slow version:

#define CRC32_POLY 0xEDB88320 // reversed 0x04C11DB7
u32 crc32_slow_1(u32 crc, u8 data)
{
	int i;
	crc ^= data;
	for (i = 8; i > 0; i--)
	{
		if ((crc & 1) == 0)
			crc >>= 1;
		else
			crc = (crc >> 1) ^ CRC32_POLY;
	}
	return crc;
}

As a special variant, the CRC32 calculation for Raspberry Pico is also included in the library. Maybe I just misunderstood the code, but I think there is an error in the end of their function and that's why the calculation gives different results than similar calculations (it's a variant with reversed bit order and inversion, but still...). That's why I simulated the function for the calculation by mimicking the operations with the ARM registers. Control sample: "123456789" -> 0x0376E6E7, 0xFC 0x05 0x4A -> 0x8E10D720.

CRC-16 (IBM, Modbus)

CRC-16 is the second strongest checksum type. Although its result is only 16-bit, it still has excellent sensitivity to errors. The computation follows the polynomial x^16 + x^15 + x^2 + x^0. It is used e.g. in disk controllers for sector checking, on the Modbus or on the Dallas Maxim 1-wire bus. The table version requires a table of 512 bytes. There are several versions depending on the default value. Disk controllers and Dallas Maxim use a default value of 0. Check pattern: "123456789" -> 0xBB3D, 0xFC 0x05 0x4A -> 0x9742. The Modbus uses the default value 0xFFFF. Check pattern: "123456789" -> 0x4B37, 0xFC 0x05 0x4A -> 0x5733. Calculation of 1 byte in the slow version:

#define CRC16_POLY 0xA001 // reversed 0x8005
u16 crc16_slow_1(u16 crc, u8 data)
{
	int i;
	crc ^= data;
	for (i = 8; i > 0; i--)
	{
		if ((crc & 1) == 0)
			crc >>= 1;
		else
			crc = (crc >> 1) ^ CRC16_POLY;
	}
	return crc;
}

Kermit

The Kermit transmission protocol also uses a 16-bit checksum, but the calculation is based on the polynomial x^16 + x^12 + x^5 + x^0. This checksum is sometimes referred to as the original CRC-CCITT. The tabular version requires a table of 512 bytes. Check pattern: "123456789" -> 0x8921, 0xFC 0x05 0x4A -> 0x71BA. Calculation of 1 byte in the slow version:

#define CRCKERMIT_POLY 0x8408 // reversed 0x1021
u16 crc_kermit_slow_1(u16 crc, u8 data)
{
	int i;
	crc ^= data;
	for (i = 8; i > 0; i--)
	{
		if ((crc & 1) == 0)
			crc >>= 1;
		else
			crc = (crc >> 1) ^ CRCKERMIT_POLY;
	}
	return crc; // swap bytes of result
}

DNP

DNP is a communication protocol for communication between computers on a network. The computation follows the polynomial x^16 + x^13 + x^12 + x^11 + x^10 + x^8 + x^6 + x^5 + x^2 + x^0. The spreadsheet version requires a table of 512 bytes. Check pattern: "123456789" -> 0x82EA, 0xFC 0x05 0x4A -> 0x8AE7. Calculation of 1 byte in the slow version:

#define CRCDNP_POLY 0xA6BC // reversed 0x3D65
u16 crc_dnp_slow_1(u16 crc, u8 data)
{
	int i;
	crc ^= data;
	for (i = 8; i > 0; i--)
	{
		if ((crc & 1) == 0)
			crc >>= 1;
		else
			crc = (crc >> 1) ^ CRCDNP_POLY;
	}
	return crc; // invert and swap bytes of result
}

CCITT XModem

The 16-bit CCITT XModem checksum is used in the XModem transmission protocol. The calculation follows the polynomial x^16 + x^12 + x^5 + x^0, similar to crc Kermit, but the order of the bits is reversed. The spreadsheet version requires a table of 512 bytes. The default value of crc XModem is the number 0. Control pattern: "123456789" -> 0x31C3, 0xFC 0x05 0x4A -> 0x8048. The library also provides a CRC-CCITT variant with a default value of 0xFFFF. Check pattern: "123456789" -> 0x29B1, 0xFC 0x05 0x4A -> 0x4CD4. And then a variant labeled CRC-CCITTB, with a default value of 0x1D0F. Check pattern: "123456789" -> 0xE5CC, 0xFC 0x05 0x4A -> 0x9144. Calculation of 1 byte in slow version:

#define CRCCCITT_POLY 0x1021
u16 crc_ccitt_slow_1(u16 crc, u8 data)
{
	int i;
	crc ^= ((u16)data << 8);
	for (i = 8; i > 0; i--)
	{
		if ((crc & 0x8000) == 0)
			crc <<= 1;
		else
			crc = (u16)((crc << 1) ^ CRCCCITT_POLY);
	}
	return crc;
}

I especially like the XModem checksum, and the reason is that there is a code for it that is easy and fast to calculate, comparable to the spreadsheet version, and yet does not require a spreadsheet. This makes it suitable for use in small processors with little memory and little power (it works well even in the Z80). I also include assembly code for the ATmega processor in the library, where I use it for example to check program Flash memory in calculators. Calculation of 1 byte:

u16 crc_ccitt_fast_1(u16 crc, u8 data) // without table
{
	crc = (crc >> 8) | (crc << 8);
	crc ^= data;
	crc ^= (crc & 0xff) >> 4;
	crc ^= crc << 12;
	crc ^= (crc & 0xff) << 5;
	return crc;
}

Other implementations of the fast algorithm can be found on the net, e.g.: https://www.ccsinfo.com/forum/viewtopic.php?t=24977, https://www.oldcomp.cz/viewtopic.php?p=109443#p109443 .

Sick

The Sick checksum is used in Sick devices for industrial measurement and detection. This checksum is presented here as a point of interest, not for real use. The libcrc library lists the following calculation, which is similar to the 1-bit calculation in CRC-16 code, not the full byte calculation. Apparently this is an old version of the CRC calculation in Sick devices, as they currently specify using the CRC-CCITT checksum for Sick devices. Author of the library states that the following calculation should be applied for each byte 2x. 1 byte calculation:

#define CRCSICKPOLY 0x8005
u16 crc_sick_1(u16 crc, u8 data, u8 prev)
{
	if (crc & 0x8000)
		crc = (crc << 1) ^ CRCSICKPOLY;
	else
		crc <<= 1;
	crc ^= (data | (prev << 8));
	return crc;
}

CRC-8 Dallas

The CRC-8 Dallas checksum is 8-bit. It is used on the Dallas Maxim 1-Wire bus. The calculation is based on the polynomial x^8 + x^5 + x^4 + x^0. Its output is 8-bit and therefore has a lower resistance to multiple errors (there is an increased chance of error compensation). For lower transmission requirements, it is used to secure short data with low error rates. The table version requires a table size of 256 bytes. Check pattern: "123456789" -> 0xA1, 0xFC 0x05 0x4A -> 0xF1. 1 byte calculation in the slow version:

u8 crc8_dallas_slow_1(u8 crc, u8 data)
{
	int i;
	crc ^= data;

	for (i = 8; i > 0; i--)
	{
		if ((crc & 1) != 0)
			crc = (crc >> 1) ^ CRC8DALLAS_POLY;
		else
			crc >>= 1;
	}
	return crc;
}

CRC-8 RDallas

CRC-8 RDallas is the inverse of the Dallas checksum, used in some serial link devices. Check pattern: "123456789" -> 0x7C, 0xFC 0x05 0x4A -> 0x85. Calculation of 1 byte in the slow version:

u8 crc8_rdallas_slow_1(u8 crc, u8 data)
{
	int i;
	u8 carry;

	for (i = 8; i > 0; i--)
	{
		carry = (crc & 0x80);
		crc <<= 1;
		if ((data & 1) != 0) crc |= 1;
		data >>= 1;
		if (carry != 0) crc ^= CRC8RDALLAS_POLY;
	}

	return crc;
}

CRC-8 Sum

CRC-8 Sum is the first in a series of simple and fast checksums. It is simply the sum of bytes of data. Compared to polynomial calculations, it has lower robustness against multiple errors, but still higher robustness than simple XOR of data (where errors at the same bit position compensate each other). Checksum: "123456789" -> 0xDD, 0xFC 0x05 0x4A -> 0x4B. 1 byte calculation:

u8 crc8_sum_1(u8 crc, u8 data)
{
	return (u8)(crc + data);
}

CRC-16 Sum

CRC-16 Sum is a sum similar to the previous case, but the result is a 16-bit number. Check pattern: "123456789" -> 0x01DD, 0xFC 0x05 0x4A -> 0x014B. 1 byte calculation:

u16 crc16_sum_1(u16 crc, u8 data)
{
	return (u16)(crc + data);
}

CRC-32 Sum

CRC-32 Sum is again a sum of data as the previous 2 cases, but with a 32-bit result. Check pattern: "123456789" -> 0x000001DD, 0xFC 0x05 0x4A -> 0x0000014B. 1 byte calculation:

u32 crc32_sum_1(u32 crc, u8 data)
{
	return (u32)(crc + data);
}

CRC-SHT

The CRC-SHT is an 8-bit checksum used in the SHT1x and SHT7x series heat and humidity sensors. Only a spreadsheet version is available. The calculation is supposedly by the polynomial x^8 + x^5 + x^4 + x^0, which should be the same polynomial as Dallas, but it's different, so I don't know. Check pattern: "123456789" -> 0xA2, 0xFC 0x05 0x4A -> 0x10. 1 byte calculation:

u8 crc_sht_1(u8 crc, u8 data)
{
	return crc_sht_tab[data ^ crc];
}

CRC-16 XOR

I first encountered the CRC-16 XOR checksum in Robotron devices, where it was used to check the contents of EEPROMs. I no longer remember the order of operations and whether the rotation was left or right, but that shouldn't affect the robustness of the code. I suspect that the fault tolerance will be a bit higher than for simple sum. The computation is simple and fast - you do an XOR with the data byte and rotate the result by 1 bit. Check pattern: "123456789" -> 0x406A, 0xFC 0x05 0x4A -> 0x0760. 1 byte calculation:

u16 crc_xor_1(u16 crc, u8 data)
{
	crc ^= data;
	return rol16(crc, 1);
}

Statistical comparison of checksums

I don't know if I've searched wrong, but I haven't come across a statistical evaluation of the success of checksums on the web. I added a statistical test of checksums to the library - I presented them with a sample containing an error and watched to see if the checksum evaluated the sample as correct. Here I have compiled a ranking of the success rate of checksums, after running 1000 million tests:

CRC check: OK

Checked samples: 1000000K

CRC32 err=0 (0.00000%)
Pico err=0 (0.00000%)
IBM err=16353 (0.00164%)
Modbus err=16353 (0.00164%)
Kermit err=14846 (0.00148%)
DNP err=48016 (0.00480%)
XModem err=13216 (0.00132%)
CCITT err=13216 (0.00132%)
CCITTB err=13216 (0.00132%)
Sick err=325582 (0.03256%)
Dallas err=3883375 (0.38834%)
RDallas err=3883375 (0.38834%)
Sum8 err=4790763 (0.47908%)
Sum16 err=2208416 (0.22084%)
Sum32 err=2208416 (0.22084%)
SHT err=3876875 (0.38769%)
XOR err=752766 (0.07528%)
XOR8 err=5694701 (0.56947%)
XOR0 err=5661974 (0.56620%)
CCITT0 err=13216 (0.00132%)

1) CRC-32 (error count=0, i.e. 0.00000%)

As expected, CRC-32 clearly leads in success rate. It even did not make a single error during the whole testing period. And this includes the Raspberry Pico variant, so apparently my suspicion of a bug in the code was incorrect and Pico only generates CRC-32 in reverse.

2) XModem, CCITT, CCITTB (number of errors=13216, i.e. 0.00132%)

In second place is the CCITT checksum and it is in all its 3 variants (XModem, CCITT, CCITTB). Thanks to the fast variant of the calculation, we have a very high quality and fast checksum, not demanding on memory (not requiring tables). On the net I have come across the opinion that the default value is important for the correctness of the checksum. Therefore, I added a variant of CCITT0, where I used a randomly chosen number 12345 as the default value. As you can see from the statistical results, the success rate is the same as the other CCITT variants, the default value does not affect the success rate.

3) Kermit (number of errors=14846, i.e. 0.00148%)

Kermit checksum is close behind CCITT. Its success rate is only slightly worse.

4) IBM, Modbus (error count=16353, i.e. 0.00164%)

In 4th place, CRC-16 is closely followed by IBM and Modbus variants. All these 3 checksums are 16-bit polynomials (with 4 polynomial terms) and have a reasonably good success rate.

5) DNP (number of errors=48016, i.e. 0.00480%)

In 5th place is the DNP checksum. Although it is a 16-bit checksum, its success rate is significantly worse than the previous 3 checksums. This is probably because its polynomial has 10 terms and, as it seems, the higher polynomial complexity is rather detrimental and thus cannot be recommended. Apparently the authors did not check the success rate in practice.

6) Sick (number of errors=325582, i.e. 0.03256%)

Sick's checksum is in 6th place. Compared to the previous crc its success rate has already dropped considerably, but it has not yet reached the level of the following crc. The success rate corresponds to the fact that it is a sort of hybrid between bit and byte checksums. A compromise in terms of success rate and in terms of speed, with rather poor results in both. This is probably why it was later abandoned and replaced by full CCITT.

7) XOR (number of errors=752766, i.e. 0.07528%)

The XOR checksum was a pleasant surprise. Although the output is 16-bit and cannot compete in quality with 16-bit polynomial sums, its great advantage is speed and simplicity. It is almost as complex as a simple sum and yet has a much higher success rate. So if you want an easy and fast checksum that doesn't require complex polynomial calculations, simply XOR the data and rotate by 1 bit and you will get a relatively high quality method quickly and easily.

8) Sum16, Sum32 (error count=2208416, i.e. 0.22084%)

Simple sums of 16 and 32 bits came out better than the following 8-bit sums. This is because the result is 16 and 32-bit number, while the 8-bit only 8 bits. Sum32 and Sum16 come out the same because the sample is too short (32 bytes) and so the sum does not overflow 16 bits. The quality of the sum is 3x worse than XORing the data with rotation.

9) SHT (number of errors=3876875, i.e. 0.38769%)

The 8-bit checksum of SHT is at the 9th position. Although the computation should supposedly follow the same polynomial as Dallas, the error rate is slightly better.

10) Dallas, RDallas (error count=3883375, i.e. 0.38834%)

The checksums of Dallas and RDallas are only slightly worse than SHT. RDallas is as successful as Dallas, although the bit order is reversed - so apparently the bit order is not important (the bit order of checksums is otherwise commonly distinguished by whether the computer is a Big endian or Little endian).

11) Sum8 (error count=4790763, i.e. 0.47908%)

The simple byte sum of 8 bits is almost at the bottom of the rankings. Still, we can say that its success rate is not much worse than 8-bit polynomial calculations. It has the advantage of simplicity and speed and therefore can be a serious competitor to 8-bit polynomial checksums.

12) XOR0 (number of errors=5661974, i.e. 0.56620%)

As expected, the simple XOR of the data (without rotation) came out worse than the simple byte sum. I added this calculation just to test and compare the success rate. The output is an 8-bit CRC. Errors at the same bit positions are compensated for and therefore the result is worse than the simple sum.

13) XOR8 (number of errors=5694701, i.e. 0.56947%)

XOR8 is again an add-on to the "out of order" test. Again, XOR and bit rotation is done with the data, but only within 8 bits. The result is worse than a simple 8-bit sum, which is surprising. Given the success rate of the 16-bit XOR version, I would have expected a better result than the simple sum.

Conclusion

The following conclusion can be drawn from these results:

If a really strong comparison is needed, revealing virtually any error, use CRC-32 to do it.
In common technical practice (e.g. packet checksums), a 16-bit checksum is good enough, preferably in the CCITT (XModem) variant, which is reliable and has a fast calculation variant that does not require a table.
If you want a very easy and fast checksum that you can just run "on the fly", use the 16-bit XOR method with rotation. It only takes 2 machine code instructions and the result is quite good.
If you are constrained by data size and must use an 8-bit CRC, the polynomial calculation is not that much help and you can just add the data into a 1-byte checksum, you won't get much worse off.

Download

You can check the checksum calculations in the online calculator https://www.lammertbies.nl/comm/info/crc-calculation .

Here in the library there is a compilation available for MS Visual Studio 2005 and for Raspberry Pico - both in the UART port and USB virtual COM port versions (including the full compilation environment, just add GCC ARM). In the library the correctness of the checksum calculation is verified and the statistical evaluation of its success is performed. The CRC library itself is located in the cpp.* files.

Link to download the library: CheckSum_src.zip

Appendix

I have been advised of the possibility to use a truncated table for only 4 bits (Small Table CRC). The calculation is only slightly slower than with the full table, but the table only takes up 1/16 of the original size. It is possible to compute CRCs quickly this way even on processors with small memory. https://wiki.wxwidgets.org/Development:_Small_Table_CRC

Miroslav Nemecek

<< Back