When the Internet was first invented, its designers decided that since they'd be lucky to ever have perhaps 1,000 computers connected, they could do something smart with the IP addresses themselves: the first few bits of an IP address said how large the network it belonged to was.
^ Bits  ^ Name    ^ Size                                ^ Bit length ^ Range                          ^
| 0     | Class A | 128 networks of 2^24 hosts each     | /8         | 0.0.0.0/1 - 127.255.255.255    |
| 10    | Class B | 2^14 networks of 65536 hosts each   | /16        | 128.0.0.0/2 - 191.255.255.255  |
| 110   | Class C | 2^21 networks of 256 hosts each     | /24        | 192.0.0.0/3 - 223.255.255.255  |
| 1110  | Class D | 2^28 multicast IDs                  |            | 224.0.0.0/4 - 239.255.255.255  |
| 11110 | Class E | 2^27 reserved addresses             |            | 240.0.0.0/5 - 247.255.255.255  |
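Since the class is encoded in the leading bits, you can tell it from the first octet alone. A minimal sketch (the function name and thresholds are ours, derived from the table above):

```python
def ip_class(address: str) -> str:
    """Return the classful network class of a dotted-quad IPv4 address."""
    first_octet = int(address.split(".")[0])
    if first_octet < 128:    # leading bit  0     -> 0.0.0.0 - 127.255.255.255
        return "A"
    elif first_octet < 192:  # leading bits 10    -> 128.0.0.0 - 191.255.255.255
        return "B"
    elif first_octet < 224:  # leading bits 110   -> 192.0.0.0 - 223.255.255.255
        return "C"
    elif first_octet < 240:  # leading bits 1110  -> 224.0.0.0 - 239.255.255.255
        return "D"
    else:                    # leading bits 11110 -> 240.0.0.0 - 247.255.255.255
        return "E"

print(ip_class("10.1.1.1"))    # A
print(ip_class("172.16.5.9"))  # B
print(ip_class("192.0.2.1"))   # C
```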
There was a lot of use of Class C networks, since many sites had around 250 machines. Sites that needed more than that, but fewer than about 65,000 machines, ended up using Class B addresses; hardly anyone needed more than that. So half the address space sat "wasted", reserved for Class A networks, while Class B and C space was running out fast. The idea, then, was to stop using the class bits and store the network-size information independently. This also made it possible to hand out ranges larger than a /24 (254 usable addresses) but smaller than a /16 (65534 usable addresses).
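Those usable-address counts come from a simple rule: a /n network contains 2^(32-n) addresses, minus two (the network and broadcast addresses). A quick sketch (the helper name is ours):

```python
def usable_hosts(prefix_len: int) -> int:
    """Usable host addresses in a /prefix_len IPv4 network:
    2^(32 - prefix_len) total, minus network and broadcast addresses."""
    return 2 ** (32 - prefix_len) - 2

print(usable_hosts(24))  # 254
print(usable_hosts(16))  # 65534
print(usable_hosts(20))  # 4094 - an in-between size the old classes couldn't offer
```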
Many large organisations have (or used to have) Class A networks assigned to them - a couple of quick examples are many universities (eg StanfordUniversity) and some car manufacturers (eg Ford, which reserved a Class A in the expectation that all their cars would eventually have a unique IP address).
The notation commonly called "CIDR notation" is to append a / followed by the number of bits that form the network part. For example, the Class A address 10.1.1.1 would be written as 10.1.1.1/8. Any prefix length from 1 to 32 can be used after the /. It's fairly unlikely you'll ever see a network larger than a /8 (ie a prefix shorter than 8 bits), however.
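Python's standard-library ipaddress module understands CIDR notation directly, which makes it easy to play with this. A small sketch expanding a /8 into its range:

```python
import ipaddress

# Parse a CIDR block; the /8 means the first 8 bits are the network part.
net = ipaddress.ip_network("10.0.0.0/8")

print(net.network_address)    # 10.0.0.0
print(net.broadcast_address)  # 10.255.255.255
print(net.num_addresses)      # 16777216
print(net.prefixlen)          # 8
```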