HARDWARE AND OPERATION | CONCEPTS OF COMPRESSION
A1.1.8 Describe the concept of compression.
- The differences between lossy compression methods and lossless compression methods
- Run-length encoding and transform coding
Many files, such as images, videos and even text documents, can take up large amounts of memory. Larger files need more storage space and transfer more slowly, which has led to the need for data compression. Data compression is the process of making files require less memory to store. When people talk about 'file size' they are usually referring to the memory required to store the file, not the physical size of the document or file.
There are two main methods of file compression: lossy and lossless. Each has its benefits and disadvantages. To compress a file, software such as WinZip or Archive Utility is used; files zipped by one program should be able to be uncompressed by another. Different file types use different methods of compression. For example, compressing an image will use a different algorithm from compressing a text document.
Some reasons for compression are:
✓ Compression makes the file size smaller so less space is needed to store the file
✓ Compression makes the file size smaller so files transfer faster over a network such as the internet
✓ Compression makes the file size smaller which helps with file streaming
SECTION 1 | LOSSY AND LOSSLESS
The key element of Lossy compression is that the file will lose quality when it is compressed. This loss of quality is not important for many files, and in many cases we do not even notice the reduction. Some key points on Lossy compression are:
✓ Lossy compression reduces the file size by removing some of the data. Because of this, an exact match of the original data cannot be recreated. Quality is lost.
✓ Lossy compression uses an algorithm that looks to remove detail that is barely noticeable. For example, if pixels next to each other in an image are almost the same colour, the Lossy algorithm will give them the same value to reduce the bytes needed to store the detail.
✓ Lossy compression is often used on image and sound files, such as JPGs and MP3s
✓ Lossy compression is often not a good option for files such as text documents
✓ Lossy compression can make file sizes smaller than is possible with Lossless compression
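The idea of giving nearly identical neighbouring pixels the same value can be sketched in a few lines of Python. This is only an illustration, not how any particular image format works: the function name `quantise`, the step size of 8 and the sample greyscale values are all assumptions made for the example.

```python
def quantise(pixels, step):
    """Snap each greyscale value (0-255) to the nearest multiple of `step`."""
    return [min(255, ((p + step // 2) // step) * step) for p in pixels]

# Hypothetical row of greyscale pixels: five "almost 200" values, three "almost 90" values
row = [200, 201, 199, 202, 200, 90, 91, 89]
lossy_row = quantise(row, step=8)

print(lossy_row)            # [200, 200, 200, 200, 200, 88, 88, 88]
print(len(set(row)))        # 7 distinct values before
print(len(set(lossy_row)))  # 2 distinct values after
```

After quantising, the row contains only two distinct values, so it can be stored in fewer bytes, but the original values 201, 199, 202 and so on cannot be recovered from it. That is what makes the method lossy.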
LOSSLESS COMPRESSION
The key element of lossless compression is that no quality is lost during the process of compression. Lossless compression is used when it is important to maintain the original quality. Some key points of Lossless compression are:
✓ Lossless compression will not remove any quality from the file. The compressed version will be the same as the original when uncompressed.
✓ Lossless compression uses an algorithm that looks for repeated data. Repeated items can be grouped and categorised, and each group replaced by a token that marks where it will be used in the reconstruction
✓ Lossless compression is often used on text documents and images, such as DOCXs, GIFs and PNGs
✓ Lossless compression is often less effective on audio files and high colour images, which contain little exact repetition
✓ Lossless compression is more limited than Lossy compression with how small the file size can be made
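The defining property of lossless compression, that decompressing gives back exactly the original data, can be checked with a short Python sketch. It uses Python's standard `zlib` module, chosen here only as a convenient example of a lossless compressor; the sample text is made up and deliberately repetitive.

```python
import zlib

# Hypothetical sample data: the same 24-character sentence repeated 50 times
original = ("the cat sat on the mat. " * 50).encode("utf-8")

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original))          # 1200 bytes before compression
print(len(compressed))        # far fewer bytes, because the data repeats
print(restored == original)   # True: nothing was lost

# Compression ratio = original size / compressed size
ratio = len(original) / len(compressed)
print(round(ratio, 1))
```

The exact compressed size depends on the compressor, but the restored data always matches the original byte for byte.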
SECTION 2 | RUN-LENGTH ENCODING AND TRANSFORM CODING
Run-Length Encoding (RLE)
Run-length encoding (RLE) is a simple form of compression that reduces repeated data values.
- It replaces consecutive repeating values (runs) with a single value and a count.
- For example, a sequence of repeated characters or pixels can be stored as one value and the number of times it occurs.
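The value-and-count idea can be sketched in Python as follows. The function names `rle_encode` and `rle_decode` and the sample string are assumptions made for this illustration; real file formats store the pairs in their own binary layouts.

```python
def rle_encode(data):
    """Return a list of (value, count) pairs, one pair per run."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            # Same value as the current run: increase its count
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # A different value starts a new run
            runs.append((value, 1))
    return runs


def rle_decode(runs):
    """Rebuild the original string from (value, count) pairs."""
    return "".join(value * count for value, count in runs)


# Hypothetical row of pixels: W = white, B = black
encoded = rle_encode("WWWWWWBBBWWWW")
print(encoded)              # [('W', 6), ('B', 3), ('W', 4)]
print(rle_decode(encoded))  # WWWWWWBBBWWWW
```

Thirteen values are stored as three pairs, and decoding restores the original exactly, so RLE is a lossless method.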
RLE is most effective when data contains long runs of the same value, such as:
- Simple images with large areas of the same colour
- Black-and-white graphics
- Certain types of formatted or structured data
If the data does not contain many repeated values, RLE may provide little or no compression and can sometimes increase file size.
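The difference between data with long runs and data with no runs can be seen in a small Python comparison. This sketch assumes a simple text output of count followed by value (for example `3A`), which would only be decodable when the data itself contains no digits; the helper name `rle_text` and both sample strings are hypothetical.

```python
from itertools import groupby


def rle_text(data):
    """Encode a string as count+value text, e.g. 'AAAB' -> '3A1B'."""
    return "".join(f"{len(list(group))}{value}" for value, group in groupby(data))


long_runs = "A" * 10 + "B" * 10   # 20 characters in two long runs
no_runs = "ABCDEFGHIJ"            # 10 characters, every run has length 1

print(rle_text(long_runs), len(rle_text(long_runs)))  # 10A10B 6
print(rle_text(no_runs), len(rle_text(no_runs)))      # 1A1B1C1D1E1F1G1H1I1J 20
```

The first string shrinks from 20 characters to 6, while the second doubles from 10 characters to 20 because every value now carries a count of 1.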
Transform Coding
Transform coding is a more advanced compression technique commonly used for multimedia data such as images, audio, and video.
- Instead of storing raw data values, the data is transformed into a different representation.
- This transformation identifies patterns and frequencies within the data.
- Less important information can then be reduced or removed.
A common example is transforming image data into frequency components so that details less noticeable to human perception can be compressed more heavily.
Transform coding is the foundation of many widely used compression standards, particularly in image and video compression.
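The transform-then-discard idea can be sketched in Python using the discrete cosine transform (DCT), one commonly used transform (JPEG, for example, is based on it). This is a simplified one-dimensional illustration under stated assumptions, not a real codec: the eight pixel values are made up, and keeping only the first three coefficients is an arbitrary choice.

```python
import math


def dct(signal):
    """Orthonormal DCT-II: turn sample values into frequency coefficients."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        total = sum(x * math.cos(math.pi / n * (i + 0.5) * k)
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


def idct(coeffs):
    """Inverse of dct(): rebuild sample values from the coefficients."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1 if k == 0 else 2) / n)
            total += scale * c * math.cos(math.pi / n * (i + 0.5) * k)
        signal.append(total)
    return signal


pixels = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical greyscale row
coeffs = dct(pixels)

# Keep the 3 lowest-frequency coefficients and discard (zero) the rest
kept = [c if k < 3 else 0.0 for k, c in enumerate(coeffs)]
approx = [round(v) for v in idct(kept)]

print([round(c, 1) for c in coeffs])  # most of the size is in the first few values
print(approx)                          # close to, but not exactly, the original row
```

Only three numbers need to be stored instead of eight. The rebuilt row follows the overall brightness trend of the original, but the fine detail carried by the discarded high-frequency coefficients is gone, which is why this step is lossy.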
Key Differences Between RLE and Transform Coding
- RLE works directly on the original data by reducing repetition.
- Transform coding changes how the data is represented before compression.
- RLE is simple and fast but limited in effectiveness.
- Transform coding is computationally more complex but achieves much higher compression ratios for multimedia data.
Usage Context
- Run-length encoding is often used as part of larger compression systems or for simple data types.
- Transform coding is used in applications where large data sizes must be reduced efficiently, such as image storage, streaming video, and audio compression.
Summary
- Run-length encoding compresses data by replacing repeated values with a value-count pair.
- Transform coding compresses data by converting it into a different representation that highlights patterns.
- Each method is suited to different types of data and compression requirements.
Run-length encoding and transform coding represent two contrasting approaches to compression: one focuses on repetition in raw data, while the other exploits patterns and perceptual significance after transforming the data.
Compressed File | A file that has been reduced in size using a compression technique to save storage space or reduce transmission time.
Decompression | The process of restoring compressed data so that it can be used or viewed.
Lossless Compression | A compression method in which the original data can be perfectly reconstructed after decompression.
Lossy Compression | A compression method in which some data is permanently removed during compression to achieve smaller file sizes.
Redundancy | Repeated or unnecessary data that can be removed during compression without affecting essential information.
Run-Length Encoding (RLE) | A lossless compression technique that replaces sequences of repeated data values with a single value and a count.
Run | A sequence of consecutive identical data values used in run-length encoding.
Transform Coding | A compression technique that converts data into a different representation to identify patterns and frequencies, often used in multimedia compression.
Frequency Component | A part of the data produced during transform coding that represents how much of the signal varies at a particular rate, from slow changes (low frequency) to fine detail (high frequency).
Perceptual Coding | A compression approach that reduces data based on human perception, often used alongside transform coding.
Compression Ratio | A measure of how much a file has been reduced in size as a result of compression.
Encoding | The process of converting data into a compressed format.
Decoding | The process of converting compressed data back into a usable form.
- Describe the purpose of compression in computer systems.
- Explain the difference between lossy compression and lossless compression.
- Describe how run-length encoding (RLE) compresses data.
- Explain why run-length encoding is most effective for certain types of data.
- Describe one situation where run-length encoding would be ineffective.
- Explain the basic idea behind transform coding.
- Describe why transform coding is commonly used for image, audio, and video compression.
- Explain how transform coding can reduce file size without significantly affecting perceived quality.
- Compare run-length encoding and transform coding in terms of complexity and effectiveness.
- Using an example, explain why lossy compression is acceptable in some applications but not in others.
Sample Answers – A1.1.8 Compression
1. Purpose of compression
Compression reduces the amount of data needed to store or transmit information, saving storage space and reducing transmission time.
2. Lossy vs lossless compression
Lossless compression allows the original data to be perfectly reconstructed, while lossy compression permanently removes some data to achieve smaller file sizes.
3. Run-length encoding
Run-length encoding replaces sequences of repeated values with a single value and a count, reducing redundancy in the data.
4. Effectiveness of RLE
RLE is most effective when data contains long runs of repeated values, such as simple images with large areas of the same colour.
5. Ineffective use of RLE
RLE is ineffective for data with little repetition, such as photographs or encrypted data, and may even increase file size.
6. Transform coding concept
Transform coding converts data into a different representation that highlights patterns and frequency components, allowing less important data to be compressed.
7. Multimedia use of transform coding
Transform coding is suitable for multimedia because it exploits human perception, allowing data that is less noticeable to be compressed more heavily.
8. Perceived quality
By removing or reducing less perceptually important information, transform coding can significantly reduce file size without noticeable loss of quality.
9. RLE vs transform coding
RLE is simple and fast but limited in effectiveness, while transform coding is more complex but achieves much higher compression for multimedia data.
10. Acceptable use of lossy compression
Lossy compression is acceptable for images, audio, and video where small losses are not noticeable, but not suitable for text or program files where accuracy is critical.