One place for hosting & domains

      Relational

      Introduction to Relational Databases and RDBMSs


      Many programs need to save or store data for later use and read data that is recorded. Although there are many ways to do this, the most common approach is to use a Relational Database Management System (RDBMS).
      MySQL
      ,
      PostgreSQL
      , and
      SQLite
      are a few industry-standard open-source RDBMSs that have been widely adopted by software development projects. This guide provides an overview of relational databases and RDBMS concepts.

      What is a Relational Database?

      A database is an application for storing and retrieving data. Although the mechanisms differ, most databases provide an API allowing users to add, delete, access, search, and manage their data. As an alternative to using a database, data can be stored in text files or hash tables. However, this technique is not as fast or as convenient as using a database and is rarely used in modern systems.

      Early database applications evolved into the modern relational database, which allows users to store massive amounts of data. A relational database management system (RDBMS) is a software application that creates and maintains relational databases. An RDBMS no longer forces users to store data in one big table. It provides more structured ways of partitioning the data and is designed for more efficient access. RDBMS applications are optimized for fast reads and writes and bulk transfer of information.

      Database designers conceptualize and organize the data in terms of tables, columns, and rows. A row is also referred to as a record, or tuple. Contemporary relational databases structure the data using the following concepts:

      • Each database contains one or more tables.
      • When the user creates a table, they specify the columns within the table at the same time.
      • Each column represents a specific attribute, or field, within the record. A column is designed to hold data of a particular data type, for example, VARCHAR, which stands for a variable-length string.
      • A table contains a cluster of rows.
      • Each row within a table represents a unique database entry. Each column within the row contains an individual field in that entry.
      • A database table is like a two-by-two matrix. Each square inside the matrix contains a piece of data.

      An RDBMS is considered to be relational because it allows users to define relationships within and between the various tables using keys and indices. A relational database permits a user to provide or generate a primary key for each row. SQL can guarantee that this key is unique within the table. The fields in these tables might be related to one another based on their primary and foreign keys. These relationships help structure and organize the database and limit the amount of data duplication.

      An RDBMS application always provides the capabilities listed below. Individual applications might offer more options.

      • It allows for the creation, definition, modification, and removal of database tables, columns, rows, primary keys, and indices.
      • It accepts SQL queries and stores or retrieves the relevant data, combining information from different database tables as necessary.
      • It guarantees the integrity of the data and the references between the tables. For example, a foreign key always points to a valid row in another table.
      • It automatically updates indices, timestamps, and other internally-generated attributes as required.

      Relational databases use the Structured Query Language (SQL) to query and update the database. For example, an RDBMS client uses the SQL INSERT command to add a new row to one of the database tables. When a user adds a new row, they simultaneously specify a value for each column. Additional SQL commands are used to modify and delete rows, manage database items, and retrieve a list of records meeting specific criteria.

      For example, consider a database for a school. This database has several tables, for teachers, students, courses, classrooms, and so forth. The definition of the Students table might contain columns for the student’s first and last name, ID, grade, family, and more. Each row in this table symbolizes an individual student and serves to represent and collect all relevant information about that student. If the student’s name is “John”, the first_name column in this row contains John. The student ID can serve as the index and primary key and could be used to cross-reference the student in other tables.

      For instance, a simplified Students table can be defined using the structure displayed below. The top row represents the names of the columns in the table. The table below currently has two rows of data, one for each student.

      first_namelast_namegradefamily_idstudent_id
      JohnDoe411165005
      JaneStudent512245350

      What are Some Common RDBMS Terms?

      The following terms are frequently used in relation to databases:

      • Column: A set of values of the same data type, representing one attribute within a table. Columns are defined when a table is created.
      • Compound Key: A key consisting of multiple columns. A compound key is used when a single column cannot reliably identify a row.
      • Database: An organized group of data that is stored electronically. A database is usually organized into smaller clusters of information.
      • Foreign Key: An index used to cross-link a table entry to a row in another table.
      • Index: A method of more quickly accessing database entries. An index can be created using any combination of attributes, but implementation is application-specific. A database index is similar to an index in a book.
      • Primary Key: A column serving as an index to uniquely identify a row inside a table. A primary key can either be auto-generated or defined in the table definition. A primary key can be used to locate a specific row within a table.
      • Referential Integrity: An internal database property to ensure a foreign key always references a valid row in another table.
      • Relational DataBase Management System (RDBMS): A type of database system based on relationships between tables and entries.
      • Row: A structured entry within a table consisting of a set of related data. Each row in a table has the same structure, which corresponds to the column specifications in the table definition. A row is also referred to as a record or a tuple.
      • Structured Query Language (SQL): A simplified domain-specific programming language used to manage data in an RDBMS.
      • Table: A collection of database records, consisting of a series of rows and columns. A table can be thought of as a two-dimension matrix of information.

      SQL vs. MySQL

      The terms SQL and MySQL are often mixed up or used interchangeably, but they are not the same. SQL is the standard programming language for querying RDBMS applications. It is used to write database queries and can be used with any database system that supports it. MySQL is a specific instance of an RDBMS that uses SQL. Database users send SQL commands to an RDBMS such as MySQL to read and write data, and to administer the database. There is no application named SQL, so it does not make sense to make a “SQL vs MySQL” comparison. However, the term SQL database is often used informally as a shorthand term for any relational database.

      The SQL Language

      The SQL language is specified as a series of statements. It is not considered a general-purpose imperative programming language like Python, because it lacks a full range of data structures and control statements. It is instead a domain-specific language intended for a single purpose. SQL is designed for the querying, definition, and manipulation of data. It’s also designed to provide data access control. One advantage of SQL is that it can access multiple records using only one command. It does not specify how the database should access an entry.

      The SQL language consists of designated keywords, expressions, queries, statements, operators, and optional clauses. Object identifiers are used to refer to database entities, including tables and columns. SQL supports a large number of predefined data types, such as CHAR, for the character, and INTEGER. Some of the most important SQL operators include =, <>, >, <, IN, LIKE, TRUE, FALSE, and NOT. Recent releases of SQL now support a simple CASE statement. The MySQL documentation contains more information about the SQL
      language structure
      ,
      data types
      , and
      statements
      .

      Some of the most widely-used SQL statements and clauses include the following:

      • ALTER: Modifies the structure of a database object.
      • CREATE: Creates a database object, such as a table or database.
      • DELETE: Removes one or more existing rows from the database.
      • DROP: Permanently deletes an object from the database.
      • FROM: Indicates which table to use for the query.
      • GRANT: Authorizes a database user to perform a particular action.
      • GROUP BY: A clause to organize output from a SELECT statement.
      • INSERT: Adds rows to the database.
      • JOIN: A clause specifying how to combine and assemble data from multiple tables.
      • MERGE: Combines data from multiple tables.
      • ORDER BY: A clause for sorting the output from a query.
      • SELECT: Retrieves data from one or more tables. This command does not alter the database or change any data.
      • UPDATE: Modifies one or more existing rows.
      • WHERE: A clause to identify the rows a query should operate on. It is typically used with a comparison operator.

      The wildcard * operator is often used in conjunction with the SELECT command. This command instructs SQL to display all columns in the output.

      Below are a couple of examples of SQL queries. The following SQL command displays the name of each class in the Class database for each row where the value of the subject column is math.

      SELECT name
          FROM Class
          WHERE subject="math";
      

      The next SQL statement creates the Class table. The CREATE statement defines each column in the table, along with its data type, in sequential order. The VARCHAR data type is used to hold a variable-length string. The SMALLINT data type is used for small integer values from the signed range of -32768 to 32767.

      CREATE TABLE Class (
          classID smallint,
          name varchar(255),
          subject varchar(255),
          level smallint
      );
      

      SQL vs. NoSQL

      NoSQL systems are an alternative to traditional SQL-based RDBMS applications. As the name implies, they use a non-relational model to handle data. They are typically less structured and more flexible than an RDBMS. NoSQL systems are not standardized and can take a variety of formats. However, they are typically key-value, graph, or document-based, not table-based. Some NoSQL applications can use structured domain-specific languages or even accept SQL queries in parallel. A few examples of NoSQL applications include Redis and
      MongoDB
      . For more information on NoSQL systems, consult the Linode guide for a
      comparison between SQL and NoSQL databases
      .



      Source link

      What is the Relational Model?


      The relational model is an abstract model used to organize data within a database. In order to control access to a database, write data, run queries, or perform any other tasks related to database management, a database management system must have some kind of underlying model that defines how the data within it are organized.

      Databases that implement the relational model are often referred to as relational databases. The relational model was for a long time the most sophisticated model for organizing data, and its widespread use has only recently been curbed by the rise of nonrelational — or, NoSQL — data models.

      The most fundamental elements in the relational model are relations, which users and modern RDBMSs recognize as tables. A relation is a set of tuples, or rows in a table, with each tuple sharing a set of attributes, or columns:

      Diagram example of how relations, tuples, and attributes relate to one another

      A column is the smallest organizational structure of a relational database, and represents the various facets that define the records in the table. Hence their more formal name, attributes. It can be helpful to think of each tuple as a unique instance of whatever type of people, objects, events, or associations the table holds.

      For more educational resources related to the relational model, please visit:

      For more information on how to work with relational databases, we encourage you to check out our How To Use SQL tutorial series.



      Source link

      Understanding Relational Databases


      Introduction

      Database management systems (DBMS) are computer programs that allow users to interact with a database. A DBMS allows users to control access to a database, write data, run queries, and perform any other tasks related to database management.

      In order to perform any of these tasks, though, the DBMS must have some kind of underlying model that defines how the data are organized. The relational model is one approach for organizing data that has found wide use in database software since it was first devised in the late 1960s, so much so that, as of this writing, four of the top five most popular DBMSs are relational.

      This conceptual article outlines the history of the relational model, how relational databases organize data, and how they’re used today.

      History of the Relational Model

      Databases are logically modelled clusters of information, or data. Any collection of data is a database, regardless of how or where it is stored. Even a file cabinet containing payroll information is a database, as is a stack of hospital patient forms, or a company’s collection of customer information spread across multiple locations. Before storing and managing data with computers was common practice, physical databases like these were the only ones available to government and business organizations that needed to store information.

      Around the middle of the 20th century, developments in computer science led to machines with more processing power, as well as greater local and external storage capacity. These advancements led computer scientists to start recognizing the potential these machines had for storing and managing ever larger amounts of data.

      However, there weren’t any theories for how computers could organize data in meaningful, logical ways. It’s one thing to store unsorted data on a machine, but it’s much more complicated to design systems that allow you to add, retrieve, sort, and otherwise manage that data in consistent, practical ways. The need for a logical framework for storing and organizing data led to a number of proposals for how to harness computers for data management.

      One early database model was the hierarchical model, in which data are organized in a tree-like structure, similar to modern-day filesystems. The following example shows how the layout of part of a hierarchical database used to categorize animals might look:

      Example Hierarchical Database: Animal categorization

      The hierarchical model was widely implemented in early database management systems, but it also proved to be somewhat inflexible. In this model, even though individual records can have multiple “children,” each record can only have one “parent” in the hierarchy. Because of this, these earlier hierarchical databases were limited to representing only “one-to-one” and “one-to-many” relationships. This lack of “many-to-many” relationships could lead to problems when you’re working with data points that you’d like to associate with more than one parent.

      In the late 1960s, Edgar F. Codd, a computer scientist working at IBM, devised the relational model of database management. Codd’s relational model allowed individual records to be associated with more than one table, thereby enabling “many-to-many” relationships between data points in addition to “one-to-many” relationships. This provided more flexibility than other existing models when it came to designing database structures, and meant that relational database management systems (RDBMSs) could meet a much wider range of business needs.

      Codd proposed a language for managing relational data, known as Alpha, which influenced the development of later database languages. Two of Codd’s colleagues at IBM, Donald Chamberlin and Raymond Boyce, created one such language inspired by Alpha. They called their language SEQUEL, short for Structured English Query Language, but because of an existing trademark they shortened the name of their language to SQL (referred to more formally as Structured Query Language).

      Due to hardware constraints, early relational databases were still prohibitively slow, and it took some time before the technology became widespread. But by the mid-1980s, Codd’s relational model had been implemented in a number of commercial database management products from both IBM and its competitors. These vendors also followed IBM’s lead by developing and implementing their own dialects of SQL. By 1987, both the American National Standards Institute and the International Organization for Standardization had ratified and published standards for SQL, solidifying its status as the accepted language for managing RDBMSs.

      The relational model’s wide use across multiple industries led to it becoming recognized as the standard model for data management. Even with the rise of various NoSQL databases in more recent years, relational databases remain the dominant tools for storing and organizing data.

      How Relational Databases Organize Data

      Now that you have a general understanding of the relational model’s history, let’s take a closer look at how the model organizes data.

      The most fundamental elements in the relational model are relations, which users and modern RDBMSs recognize as tables. A relation is a set of tuples, or rows in a table, with each tuple sharing a set of attributes, or columns:

      Diagram example of how relations, tuples, and attributes relate to one another

      A column is the smallest organizational structure of a relational database, and represents the various facets that define the records in the table. Hence their more formal name, attributes. You can think of each tuple as a unique instance of whatever type of people, objects, events, or associations the table holds. These instances might be things like employees at a company, sales from an online business, or lab test results. For example, in a table that holds employee records of teachers at a school, the tuples might have attributes like name, subjects, start_date, and so on.

      When creating columns, you specify a data type that dictates what kind of entries are allowed in that column. RDBMSs often implement their own unique data types, which may not be directly interchangeable with similar data types in other systems. Some common data types include dates, strings, integers, and Booleans.

      In the relational model, each table contains at least one column that can be used to uniquely identify each row, called a primary key. This is important, because it means that users don’t need to know where their data is physically stored on a machine; instead, their DBMS can keep track of each record and return them on an ad hoc basis. In turn, this means that records have no defined logical order, and users have the ability to return their data in whatever order or through whatever filters they wish.

      If you have two tables that you’d like to associate with one another, one way you can do so is with a foreign key. A foreign key is essentially a copy of one table’s (the “parent” table) primary key inserted into a column in another table (the “child”). The following example highlights the relationship between two tables, one used to record information about employees at a company and another used to track the company’s sales. In this example, the primary key of the EMPLOYEES table is used as the foreign key of the SALES table:

      Diagram example of how the EMPLOYEE table's primary key acts as the SALES table's foreign key

      If you try to add a record to the child table and the value entered into the foreign key column doesn’t exist in the parent table’s primary key, the insertion statement will be invalid. This helps to maintain relationship-level integrity, as the rows in both tables will always be related correctly.

      The relational model’s structural elements help to keep data stored in an organized way, but storing data is only useful if you can retrieve it. To retrieve information from an RDBMS, you can issue a query, or a structured request for a set of information. As mentioned previously, most relational databases use SQL to manage and query data. SQL allows you to filter and manipulate query results with a variety of clauses, predicates, and expressions, giving you fine control over what data will appear in the result set.

      Advantages and Limitations of Relational Databases

      With the underlying organizational structure of relational databases in mind, let’s consider some of their advantages and disadvantages.

      Today, both SQL and the databases that implement it deviate from Codd’s relational model in several ways. For instance, Codd’s model dictates that each row in a table should be unique while, for reasons of practicality, most modern relational databases do allow for duplicate rows. There are some that don’t consider SQL databases to be true relational databases if they fail to adhere to each of Codd’s specifications for the relational model. In practical terms, though, any DBMS that uses SQL and at least somewhat adheres to the relational model is likely to be referred to as a relational database management system.

      Although relational databases quickly grew in popularity, a few of the relational model’s shortcomings started to become apparent as data became more valuable and businesses began storing more of it. For one thing, it can be difficult to scale a relational database horizontally. Horizontal scaling, or scaling out, is the practice of adding more machines to an existing stack in order to spread out the load and allow for more traffic and faster processing. This is often contrasted with vertical scaling which involves upgrading the hardware of an existing server, usually by adding more RAM or CPU.

      The reason it’s difficult to scale a relational database horizontally has to do with the fact that the relational model is designed to ensure consistency, meaning clients querying the same database will always retrieve the same data. If you were to scale a relational database horizontally across multiple machines, it becomes difficult to ensure consistency since clients may write data to one node but not the others. There would likely be a delay between the initial write and the time when the other nodes are updated to reflect the changes, resulting in inconsistencies between them.

      Another limitation presented by RDBMSs is that the relational model was designed to manage structured data, or data that aligns with a predefined data type or is at least organized in some predetermined way, making it easily sortable and searchable. With the spread of personal computing and the rise of the internet in the early 1990s, however, unstructured data — such as email messages, photos, videos, etc. — became more common.

      None of this is to say that relational databases aren’t useful. Quite the contrary, the relational model is still the dominant framework for data management after over 40 years. Their prevalence and longevity mean that relational databases are a mature technology, which is itself one of their major advantages. There are many applications designed to work with the relational model, as well as many career database administrators who are experts when it comes to relational databases. There’s also a wide array of resources available in print and online for those looking to get started with relational databases.

      Another advantage of relational databases is that almost every RDBMS supports transactions. A transaction consists of one or more individual SQL statements performed in sequence as a single unit of work. Transactions present an all-or-nothing approach, meaning that every SQL statement in the transaction must be valid; otherwise, the entire transaction will fail. This is very helpful for ensuring data integrity when making changes to multiple rows or tables.

      Lastly, relational databases are extremely flexible. They’ve been used to build a wide variety of different applications, and continue working efficiently even with very large amounts of data. SQL is also extremely powerful, allowing you to add and change data on the fly, as well as alter the structure of database schemas and tables without impacting existing data.

      Conclusion

      Thanks to their flexibility and design for data integrity, relational databases are still the primary way data are managed and stored more than fifty years after they were first conceived of. Even with the rise of various NoSQL databases in recent years, understanding the relational model and how to work with RDBMSs are key for anyone who wants to build applications that harness the power of data.

      To learn more about a few popular open-source RDBMSs, we encourage you to check out our comparison of various open-source relational SQL databases. If you’re interested in learning more about databases generally, we encourage you to check out our complete library of database-related content.



      Source link