The hard drive storage swindle

Have you bought a hard drive and wondered why you don’t get the storage you paid for? According to my research, you are being short changed.

For example, when you buy a 1 Terabyte hard drive, you are short changed to the tune of 92.7GB in terms of file storage.

For me to explain fully, I need to go into some detail on how files are stored, and how file sizes are calculated.

Bits, bytes and kilobytes

When you think of a kilobyte of computer data, do you think it is 1000 bytes?

If you have been thinking that, then you may be surprised to find out that you are not quite correct. But, you could be forgiven for thinking that.

It is all to do with number systems and the way data is counted.

Binary (Base 2), Decimal (Base 10) and Hexadecimal (Base 16)

Base 10 – also known as decimal or DEC on a programmer’s calculator (counting in 10s)

When we were taught mathematics in primary school, we were taught using the base 10 number system – the decimal system. Everyone knows this system, in which numbers are grouped in tens.

0 to 9, 10 to 19, 20 to 29 and so on.

When you are asked how many metres are in a kilometre, you easily say it is 1,000 and you would be correct. So why would a kilobyte not be 1,000 bytes?

That is because computers count in binary (base 2) – BIN on a programmer’s calculator.

To fully see the way this works, you need to know the binary numbering system (base 2) as that is how computers work after all — using binary bits of data.

Base 2 – also known as binary or BIN on a programmer’s calculator

Computers work on a basis of 0s and 1s, which is the make up of binary numbers and binary data. 0 for off, 1 for on, 0 for no, 1 for yes…

So how is counting done in binary, and how does that relate to kilobytes etc.?

Well, in binary, data is written as a row of 0s and 1s, with each 0 or 1 being called a binary bit. Counting goes 0, then 1, then 10, then 11, then 100…

If you look at the binary table below, each column represents the value of each binary bit. The top row represents the value of each binary bit and the bottom row is the row of binary bits.

If you add the values above each binary 1, you will get the total value in base 10 (decimal).

| 8 | 4 | 2 | 1 | Base 10 value |
|---|---|---|---|---------------|
| 1 | 1 | 1 | 1 | 15            |
Table for the binary nibble
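The adding-up rule from the table can be sketched in a few lines of Python. This is only an illustration: the function name binary_to_decimal is made up for this example and is not taken from any library.

```python
def binary_to_decimal(bits: str) -> int:
    """Add up the place value above every 1 in a string of binary digits."""
    total = 0
    # The rightmost bit is worth 1, and each step to the left doubles the value.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(binary_to_decimal("1111"))  # 8 + 4 + 2 + 1
print(binary_to_decimal("0101"))  # 4 + 1
```

Python's built-in int("1111", 2) does the same conversion; the loop is spelled out here only to mirror the table.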

A series of 4 binary bits is referred to as a nibble (or sometimes nybble or nyble to match the name byte) and a nibble is half a byte.

So, 1 binary bit is one quarter of a nibble and can store 2 different values (0 or 1), and 2 binary bits equal half a nibble.

As you can see in the table above, a nibble can store a maximum of 16 different values (0 to 15 in decimal or 0000 to 1111 in binary).

For a complete byte of data, you need 2 nibbles together as shown below.

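| 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | Base 10 value |
|-----|----|----|----|---|---|---|---|---------------|
| 1   | 1  | 1  | 1  | 1 | 1 | 1 | 1 | 255           |
Table for a byte of binary data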

To reach the “kilo” value you need 2.5 nibbles (1 byte and 0.5 nibbles) together, which is 10 binary bits.

Ten bits can represent 1,024 different values (0 to 1,023 in decimal), and 1,024 is the power of 2 closest to 1,000. Hence 1 kilo in computer terms is equal to 1,024 in decimal and not 1,000.

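| 512 | 256 | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | Base 10 value |
|-----|-----|-----|----|----|----|---|---|---|---|---------------|
| 1   | 1   | 1   | 1  | 1  | 1  | 1 | 1 | 1 | 1 | 1023          |
Table for 10 binary bits of data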

Binary data bits and data capacity

| Binary bits of data (n) | Capacity = 2^n (different values) | Size                                      |
|-------------------------|-----------------------------------|-------------------------------------------|
| 1                       | 2                                 | 0.25 nibbles                              |
| 2                       | 4                                 | 0.5 nibbles                               |
| 3                       | 8                                 | 0.75 nibbles                              |
| 4                       | 16                                | 1 nibble or 0.5 bytes                     |
| 5                       | 32                                | 1.25 nibbles                              |
| 6                       | 64                                | 1.5 nibbles                               |
| 7                       | 128                               | 1.75 nibbles                              |
| 8                       | 256                               | 2 nibbles or 1 byte                       |
| 9                       | 512                               | 1 byte and 0.25 nibbles                   |
| 10                      | 1,024                             | 1 byte and 0.5 nibbles (the “kilo” value) |
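The doubling pattern in this table is easy to reproduce. The short Python sketch below is illustrative only (the helper name capacity is an assumption made for this example) and prints how many different values n binary bits can hold.

```python
def capacity(n_bits: int) -> int:
    """Number of different values that n_bits binary bits can represent."""
    return 2 ** n_bits


for n in range(1, 11):
    # A nibble is 4 bits, so n bits is n / 4 nibbles.
    print(f"{n:2d} bits: {capacity(n):,} values ({n / 4:.2f} nibbles)")
```

Each extra bit doubles the capacity, which is why 10 bits lands on 1,024 rather than a round 1,000.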

Hard drive storage capacities in computer terms and how it is sold

JEDEC, which sets the Global Standards for the Microelectronics Industry, uses the binary definitions of megabyte and gigabyte, which is exactly how computers generally see things within programming etc. JEDEC states that:

This agrees with what we have already discussed.

As you can see, each step up (byte to kilobyte, kilobyte to megabyte…) is in multiples of 1024.

Starting with a byte of data, the following table outlines the different file sizes as measured by a computer.

| File storage capacity | Equal to (binary)                                                          | In bits (b) or kilobits (Kb) (binary)           |
|-----------------------|----------------------------------------------------------------------------|-------------------------------------------------|
| 1 Byte (1B)           | 8 bits                                                                     | 8b or 0.0078Kb                                  |
| 1 Kilobyte (1KB)      | 1,024 bytes                                                                | 8,192b or 8Kb                                   |
| 1 Megabyte (1MB)      | 1,024 kilobytes                                                            | 8,388,608b or 8,192Kb                           |
| 1 Gigabyte (1GB)      | 1,024 megabytes or 1,048,576KB                                             | 8,589,934,592b or 8,388,608Kb                   |
| 1 Terabyte (1TB)      | 1,024 gigabytes or 1,048,576MB or 1,073,741,824KB                          | 8,796,093,022,208b or 8,589,934,592Kb           |
| 1 Petabyte (1PB)      | 1,024 terabytes or 1,048,576GB or 1,073,741,824MB or 1,099,511,627,776KB   | 9,007,199,254,740,992b or 8,796,093,022,208Kb   |
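If you want to check the byte and bit figures yourself, the sketch below recomputes them in Python. It assumes the binary convention used in this article, where each unit is 1,024 times the one before it; the names UNITS and binary_unit_bytes are invented for this example.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def binary_unit_bytes(unit: str) -> int:
    """Bytes in one unit, assuming each step up multiplies by 1,024."""
    return 1024 ** UNITS.index(unit)


for unit in UNITS:
    size_in_bytes = binary_unit_bytes(unit)
    size_in_bits = size_in_bytes * 8  # 8 bits per byte
    print(f"1{unit} = {size_in_bytes:,} bytes = {size_in_bits:,} bits")
```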

Now, when you buy a hard drive, you should expect each block of data size to be in multiples of 1,024, as shown by the file storage capacities table. But they are not.

The difference between advertised hard drive space and actual hard drive space is down to the fact that hard drives in shops are advertised with storage information in decimal (base 10), as opposed to how computers understand storage, which, as we have already determined, is in binary (base 2).

Why do they do this?

Since consumers do not generally think in binary mathematics, manufacturers decided to rate most drive capacities based on the decimal numbers with which we’re all familiar. Therefore, 1KB equals 1000 bytes, and:

  • 1MB = 1,000KB equalling 1,000,000 bytes
  • 1GB = 1,000MB leading to 1,000,000,000 bytes
  • 1TB = 1,000GB leading to 1,000,000,000,000 bytes… and so on.

JEDEC notes, in its definition of the megabyte, that the Institute of Electrical and Electronics Engineers (IEEE), in its document IEEE/ASTM SI 10‑1997, states:

“[The practice of K equalling 1,024] frequently leads to confusion and is deprecated.” Further confusion results from the popular use of a “megabyte” consisting of 1 024 000 bytes to define the capacity of the familiar “1.44‑MB” diskette. An alternative system is found in Amendment 2 to IEC 60027‑2: Letter symbols to be used in electrical technology – Part 2:

This approximation was not much of a problem when we were dealing with only kilobytes of data at a time. However, each level of increase in the prefix also increases the total discrepancy of the actual space compared to the advertised space, leading to massive shortages in real data storage figures.

How much of a difference does this make in reality?

Here is a quick reference to show the amount that the actual values differ compared to the advertised for each common referenced value:

  • Megabyte difference = 48,576 bytes (47KB). That’s 4.6% short.
  • Gigabyte difference = 73,741,824 bytes (72,013KB or 70MB). That’s 6.9% short.
  • Terabyte difference = 99,511,627,776 bytes (97,179,324KB or 94,901.7MB or 92.7GB). That’s 9.1% short.
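These differences can be recomputed with a few lines of Python. This is a rough sketch, not an official formula: the function name shortfall is made up for this example, and the percentage is measured against the binary size, which is how the figures above were worked out.

```python
PREFIXES = {"Megabyte": 2, "Gigabyte": 3, "Terabyte": 4, "Petabyte": 5}


def shortfall(power: int) -> tuple[int, float]:
    """Return (bytes short, percent short) for the given prefix power.

    The binary size is 1024 ** power and the advertised (decimal) size
    is 1000 ** power. The percentage is relative to the binary size.
    """
    binary_size = 1024 ** power
    decimal_size = 1000 ** power
    difference = binary_size - decimal_size
    return difference, 100 * difference / binary_size


for name, power in PREFIXES.items():
    difference, percent = shortfall(power)
    print(f"{name}: {difference:,} bytes short ({percent:.2f}%)")
```

The gap widens with every prefix because the ratio 1000/1024 is applied one more time at each step up.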

So as the unit of file storage gets larger (KB vs MB vs GB vs TB) you are short by more file storage.

A case in point is that:

  • a hard drive in my computer, advertised at 2TB in size, is actually 1863GB in size according to my computer instead of 2048GB. That is 9% or 185GB short (92.5GB per advertised Terabyte).
  • a 6TB hard drive I connected actually had 5589GB in file storage space. That is 555GB short.
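To estimate what a computer will report for a given advertised size, you can convert decimal Terabytes into binary Gigabytes. The Python sketch below is an approximation under the assumptions in this article (advertised sizes use 1,000s, the computer counts in 1,024s); the function name reported_gb is hypothetical.

```python
def reported_gb(advertised_tb: float) -> float:
    """Binary GB a computer would show for a drive advertised in decimal TB."""
    advertised_bytes = advertised_tb * 1000 ** 4  # decimal Terabytes to bytes
    return advertised_bytes / 1024 ** 3           # bytes to binary Gigabytes


for tb in (1, 2, 6):
    expected = tb * 1024  # what you would get if a Terabyte meant 1,024GB
    shown = reported_gb(tb)
    print(f"{tb}TB advertised: about {shown:,.1f}GB shown, "
          f"{expected - shown:,.1f}GB short")
```

Real drives rarely hold exactly the nominal number of bytes, so the figure your own computer shows may differ from this estimate by a few GB.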

If and when manufacturers start to produce hard drives measured in Petabytes, unless the discrepancy is dealt with, the Petabyte difference will equal 125,899,906,842,624 bytes (122,949,127,776KB or 120,067,507.6MB or 117,253.4GB or 114.5TB). That is more storage than you can buy in one hard drive at present, and it means you will be short changed by 11.2% of file storage space.