1.2.1 | TEXT, SOUND AND IMAGES
Topics from the Cambridge IGCSE (9-1) Computer Science 0984 syllabus 2023 - 2025.
OBJECTIVES
1.2.1 Understand how and why a computer represents text and the use of character sets, including American standard code for information interchange (ASCII) and Unicode
ASCII AND UNICODE
Computers represent text using character sets: defined collections of characters and symbols in which each character is assigned a unique binary code that the computer can interpret. The use of character sets is important because it enables computers to store and display text in a standardized way, regardless of the language or alphabet used.
One of the most widely used character sets is the American Standard Code for Information Interchange (ASCII), which was developed in the 1960s for use with early computers. ASCII uses 7 bits to represent a total of 128 characters, including letters, numbers, and symbols commonly used in the English language. ASCII is widely used in the US, and is also used as a base for other character sets in different parts of the world.
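As a concrete illustration, the short Python sketch below uses the built-in ord() and chr() functions to show a few characters alongside their ASCII codes and 7-bit binary patterns; the characters chosen are only examples.

```python
# A minimal sketch of the ASCII idea above, using Python's built-in ord() and
# chr() functions; the characters chosen here are just illustrative examples.

for ch in ["A", "a", "0", "?"]:
    code = ord(ch)                          # numeric character code, e.g. 'A' -> 65
    print(ch, code, format(code, "07b"))    # the same code as a 7-bit binary pattern

print(chr(65))   # the reverse mapping: code 65 back to the character 'A'
```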
As computers became more advanced and widespread, it became clear that the 128 characters of ASCII were not sufficient to represent all the characters and symbols used in different languages and scripts around the world. In response, Unicode was developed in the 1990s as a universal character set that could represent the characters of any language or script. Unicode uses 16 or more bits per character: its original 16-bit form allowed 65,536 characters, and the standard has since grown to cover well over 100,000 characters from scripts such as Latin, Arabic, Chinese and many more.
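The sketch below (again plain Python, with arbitrary example characters) shows how code points extend beyond ASCII's 128 values: ord() returns the code point of any character, and encoding in UTF-16 shows the 16-bit pattern each one occupies.

```python
# A short illustration of Unicode code points beyond the 128 ASCII characters.
# The characters below (Latin, accented Latin, Arabic, Chinese) are arbitrary
# examples; ord() returns the Unicode code point for any character.

for ch in ["A", "é", "ب", "中"]:
    print(ch, ord(ch), f"U+{ord(ch):04X}")   # e.g. '中' -> 20013 -> U+4E2D

# Encoding in UTF-16 shows the 16-bit (two-byte) pattern used for each character.
for ch in ["A", "中"]:
    print(ch, ch.encode("utf-16-be").hex())  # 'A' -> 0041, '中' -> 4e2d
```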
Unicode has become the standard for representing text in modern computing, and most modern operating systems, applications, and programming languages support Unicode. The use of Unicode has enabled the development of software and applications that can display text in different languages and scripts, which has been important for global communication and accessibility.
In summary, computers represent text using character sets, which are sets of characters and symbols assigned unique binary codes. The American Standard Code for Information Interchange (ASCII) is a widely used character set that uses 7 bits to represent 128 characters, while Unicode is a universal character set that uses 16 or more bits per character and can represent characters from scripts and languages around the world.
CHECK YOUR KNOWLEDGE
Which of the following statements is TRUE about ASCII and Unicode?
A) ASCII can represent all characters from all languages worldwide.
B) Unicode uses more bits per character than ASCII to represent characters from multiple languages.
C) ASCII and Unicode use the same number of bits to represent each character.
D) Unicode was developed before ASCII to support English characters.
EXPLANATION
The correct answer is B: "Unicode uses more bits per character than ASCII to represent characters from multiple languages." ASCII uses 7 or 8 bits to represent English characters, while Unicode can use 16 or more bits to represent a wide array of characters from various languages.
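The following minimal Python check makes the same point by counting the bytes each character needs in a given encoding; the specific characters are arbitrary.

```python
# A quick check of the idea behind answer B, assuming Python 3: an ASCII
# character fits in a single byte, while other Unicode characters need more.

print(len("A".encode("ascii")))       # 1 byte (7 significant bits)
print(len("中".encode("utf-16-be")))  # 2 bytes (16 bits) for a Chinese character
print(len("😀".encode("utf-16-be")))  # 4 bytes: some characters need even more
```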
KEY TERMS
Character Encoding | A system for representing characters, symbols, and text using numbers that a computer can interpret.
ASCII (American Standard Code for Information Interchange) | A character encoding standard that uses 7 or 8 bits to represent English letters, numbers, and basic symbols, covering 128 characters in its standard form.
Extended ASCII | An 8-bit version of ASCII that includes additional characters (up to 256 total) for symbols and accented letters used in other languages.
Unicode | A universal character encoding standard that can represent characters from almost all written languages, supporting over 143,000 characters and using multiple encoding formats like UTF-8, UTF-16, and UTF-32.
Emoji | Graphical symbols encoded within Unicode that allow visual representation of expressions, objects, and emotions. They are part of Unicode and often encoded in UTF-8 or UTF-16.
Backward Compatibility | The ability of a newer system (like UTF-8) to support older standards, allowing ASCII characters to be used without modification.
Multilingual Support | The capability of Unicode to represent characters from various languages, making it ideal for globalized applications and international text processing.
Hexadecimal Representation | A base-16 number system often used in Unicode to represent code points, with characters prefixed by 'U+', like U+0041 for 'A'.
Encoding Scheme | The set of rules that define how characters are converted to bytes in a particular encoding, such as UTF-8 or ASCII.
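To tie a few of these terms together, here is a small Python sketch of UTF-8's backward compatibility with ASCII and of the U+XXXX hexadecimal notation; the sample strings are purely illustrative.

```python
# A minimal sketch of the glossary ideas above: UTF-8's backward compatibility
# with ASCII, and the U+XXXX hexadecimal notation for code points.

text = "Hi"
print(text.encode("ascii"))       # b'Hi'
print(text.encode("utf-8"))       # b'Hi' -- identical bytes, so plain ASCII data
                                  # is already valid UTF-8

ch = "é"                          # a non-ASCII character
print(f"U+{ord(ch):04X}")         # U+00E9, the standard code-point notation
print(ch.encode("utf-8").hex())   # c3a9 -- UTF-8 uses two bytes here
```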
1: What is a character set?
A) A set of fonts used by a particular software program
B) A set of colours used in graphic design
C) A set of characters and symbols assigned a unique binary code that can be understood by a computer
D) A set of keyboard shortcuts for commonly used commands
2: What is the American Standard Code for Information Interchange (ASCII)?
A) A set of computer hardware standards
B) A type of computer virus
C) A character set used for representing text
D) A programming language used for web development
3: Why was Unicode developed?
A) To provide a standardized way of representing text across different languages and scripts
B) To provide a more efficient way of compressing digital images
C) To provide a faster way of processing mathematical equations
D) To provide a more secure way of encrypting data
4: What is the difference between ASCII and Unicode?
A) ASCII is a universal character set, while Unicode is only used for representing English text
B) ASCII uses 8 bits to represent characters, while Unicode uses 16 bits
C) ASCII is a newer character set than Unicode
D) ASCII can only represent characters from the English language, while Unicode can represent characters from any language or script
5: Why is the use of character sets important in computing?
A) It enables computers to store and display text in a standardized way, regardless of the language or alphabet used
B) It makes it easier for programmers to write code
C) It helps to prevent computer viruses and malware
D) It allows computers to perform mathematical calculations more efficiently