As its name suggests a database is a base for storing data. With many advantages over the old paper system of storing data, Databases save space(physical storage space, compared to paper records), allow multiple people to access the same data at the same time and queries (similar to filters) can be done to only show the data required.
A database is system that allows us to store data in a structured way using tables and fields, and gives us various means of access to to the data.
What is the difference between a data and information? Data is a collection of facts that are meaningless on their own whereas information puts data into clear understandable context.
What is the difference between a database and a spreadsheet? Yes, when you look at a database it might look very similar to a spreadsheet. Whilst spreadsheets may primarily be used to manipulate data using functions and formula to perform calculations and statistics, whereas databases are primarily used to store data and often have relationships between tables and should allow the user to easily generate queries to view specific data. Databases are often contain much more data than a spreadsheet.
What is the difference between a database and an information system? A database may form part of the backend of an information system. As described on wikipedia ' An Information system (IS) is a formal, sociotechnical, organizational system designed to collect, process, store, and distribute information. In a sociotechnical perspective, information systems are composed by four components: task, people, structure (or roles), and technology' (https://en.wikipedia.org/wiki/Information_system) A database will contain data that is used by information systems where-as the information system comprises of the complete system, furthermore may present the data from the database in a way in which it becomes information.
All databases contain at least one Table, Record and Field, these are 3 basic terms you should understand when getting started.
Table: A table is what holds the records, fields and defines the structure of the database. Record: A record is all of the data about one item. In the table above record one is all of the details about Mercedes Field: A field refers to each type of data held. In the table above an example of a field would be TeamName
Entities: An entities is a unique classification about data such as people, objects, places or events. Attributes: An attribute is a single item of data such as the 'Points' for 'Mercedes' in the table above . Attributes could be all the details of the team in position 1, for example [1,Mercedes,Bottas,Hamilton,221].
If you were asked the question: What is the team name from the record in the second position? The answer would be 'Red Bull Racing'
WHAT IS A FLAT-FILE DATABASE
A flat file database is a database where all the records are stored in a single table. The databases above showing the 'F1 Constructors' table is a single flat file database. It has no other tables that it is connected to or has a relationship with. For a very basic and simple that is used by one or a small amount of people database a flat-file database be suitable.
Databases are more powerful when using a relational database concept. Watch the video below for more on how databases differ from simple spreadsheets.
Data validation and data verification are two important processes used to ensure the accuracy, completeness, and consistency of data in a database system. Although the terms are sometimes used interchangeably, they refer to different processes.
DATA VALIDATION | Data validation is the process of checking whether the data entered into a system is accurate, complete, and consistent with predefined rules and constraints. The purpose of data validation is to ensure that the data entered into the system is correct and can be used reliably.
Data validation is typically performed when data is first entered into the system, and it involves checking for errors, such as missing or invalid data, incorrect data types, or data that does not conform to predefined rules and constraints. Data validation may be performed using automated validation tools, such as regular expressions, or it may involve manual review and correction of data.
DATA VERIFICATION | Data verification is the process of checking whether the data in the database is accurate, complete, and consistent with the original source. The purpose of data verification is to ensure that the data stored in the database is a true representation of the original data source.
Data verification is typically performed on a periodic basis, such as during data migrations or when integrating data from multiple sources. It involves comparing the data in the database with the original source to ensure that it is accurate and complete. Data verification may involve manual checks, automated tools, or a combination of both.
Data validation and data verification are both essential processes for ensuring the accuracy, completeness, and consistency of data in a database system. Data validation checks the accuracy and completeness of data when it is first entered into the system, while data verification checks the accuracy and completeness of data stored in the database relative to the original source. By performing both data validation and data verification, organizations can ensure that their data is reliable, accurate, and useful.
TYPES OF VALIDATION
LENGTH CHECK | This validation ensures that the data entered has a specific length. It's particularly useful for fields where the length of the data is known and fixed, such as a postal code or telephone number. EXAMPLE | A user is asked to enter a ZIP code, which should be exactly 5 digits long. If the user enters less or more than 5 digits, the system will flag this as an error.
RANGE CHECK | This type of validation checks if a value falls within a certain range. It's used for numerical data where there are known minimum and maximum values. EXAMPLE | If a user is inputting their age into a form, and the valid age range is 18 to 99, any number outside of this range (like 17 or 100) will be rejected.
DATA TYPE CHECK | This validation ensures that the data entered is of the correct data type. Common data types include integers, strings, dates, etc. EXAMPLE | If a user is asked to enter a date of birth, the system will check to ensure that the input is in a valid date format (like MM/DD/YYYY). If the user enters something like "abcde", the system will reject it as it doesn't match the date data type.
PRESENCE CHECK | This validation is used to ensure that essential data is not left blank. It's a basic form of validation to ensure that required fields are filled in. EXAMPLE | On a registration form, fields such as 'Email Address' or 'Password' might be mandatory. If the user tries to submit the form without filling these fields, the system will prompt them to complete these fields.
CHECK DIGIT | A check digit is a form of redundancy check used for error detection, commonly used in sequences of numbers like account numbers, credit card numbers, etc. A check digit is calculated from the other digits and then included as part of the data. EXAMPLE | In a credit card number, the last digit is often a check digit. When a user enters their credit card number, the system can perform a calculation (like the Luhn algorithm) to verify that the number is potentially valid by checking this digit. If the check digit doesn't match the calculated value, it indicates a possible error in the entered number.
Test data is crucial in software testing because it allows developers and testers to verify that a software application functions as intended. By using test data, they can simulate how the software will perform under various conditions and with different types of input. This process helps in identifying bugs, ensuring data integrity, enhancing security, and improving user experience. Test data must be comprehensive and diverse to effectively test all aspects of the software.
NORMAL DATA | Normal data, also known as valid data, refers to input that is expected under regular usage conditions. It falls within the range of acceptable or anticipated values that the system is designed to handle. EXAMPLE | For a field accepting a person's age, normal data would be any integer between 1 and 100, assuming these are the reasonable age limits for the application's context.
ABNORMAL DATA | Abnormal data, or invalid data, is data that falls outside of what the system is designed to handle. It is used to test how the system behaves with inputs that are unexpected or out of scope. EXAMPLE | If a text field is only meant to accept alphabetic characters, entering numerical values or special characters would be considered abnormal data.
EXTREME DATA / BOUNDARY DATA | Extreme data (also known as boundary data), a subset of normal data, consists of values that are at the edge of acceptable input but still within the range that the system is supposed to handle. It is used to test the limits of the system's capabilities. EXAMPLE | For a field accepting a percentage value where the normal range is 0 to 100, extreme data examples would be exactly 0 and 100.