COMPARING SQL and NoSQL

Linh Mai
6 min readNov 17, 2020

Structured Query Language or SQL is an essential language that a data scientist should be familiar with, right after Python. SQL has been around since the 1970s and in the late 2000s, another database was developed to deal with areas that a SQL database cannot cover. It is called Not-Only-SQL or NoSQL. So, in which case should we use SQL and when should we use NoSQL? This post will define and distinguish these two widely-used databases for querying data.

SQL and RELATIONAL DATABASES

1. What is SQL?

Relational databases accessed by Structured Query Language or SQL which is a domain-specific language for querying and manipulating data. Databases are a collection of two-dimensional tables that have fixed columns and rows. Each table is very similar to a CSV file or a spreadsheet in Microsoft Excel.

An outline of a database structure

After the development of the relational model from Edgar F. Codd, SQL was developed by IBM’s researchers Donald D. Chamberlin and Raymond F. Boyce in the early 1970s, starting with System R. These researchers developed SQL with the purpose of a query language that was as easy to read as English, and that would also encompass database administration and manipulation.

SQL databases are table-based which makes it a suitable option for applications that require multi-row transactions such as an accounting system

2. How does SQL work?

Relational data just simply means that pieces of data are related to each other. For instance, from the database structure example above, data on the ‘customers’ table has some relationships to data on the ‘orders’ table, indicating that a customer may be associated with a specific order. You may also notice that there is an asterisk (*) next to a column name in each table. This indicates that the column is the primary key for that particular table. A primary key is a unique identifier for a column and it can only be unique values for that column entry. On the other hand, the primary key of this table can also appear on another table. In this case, this key is called a foreign key which means the primary key from a different table. This is the main idea of how data on different tables are associated with each other in a relational database. For example, if you were told a specific ‘customerNumber’, and then given a list of order data that included the ‘customerNumber’, you could determine which orders were placed by that customer by matching up the primary and foreign keys.

An example of a simple query in SQL:

SELECT col1, col2, col3
FROM table
WHERE records_match_criteria
LIMIT 100;

3. What are the pros and cons of SQL?

a. Pros

  • Fast queries
    SQL can be used to retrieve a large number of records from a database quickly and effectively
  • Easy to understand
    This is one of the reasons why researchers invented SQL in the first place. There is no need for a large number of code lines for retrieving data. Moreover, the keywords are straightforward and easy to understand such as SELECT, GROUP BY, UPDATE, and so on
  • Data portability
    SQL supports a variety of devices such as PCs, laptops, tablets, servers. You can run SQL on a local system with intranet and internet and the SQL databases can be easily moved from a device to another

b. Cons

  • Vertical scaling
    Most SQL databases require to scale-up vertically when you exceed the required capacity of your current server which often migrating to a larger and more expensive server
  • Complex interfaces
    Even though the coding in SQL is straightforward and not complicated, its process of interfacing is complex which becomes difficult for certain users to access it
  • Partial control
    Programmers do not have full control over the database with SQL because of the hidden business rules

4. Examples of SQL Databases

Some examples of SQL databases include SQLite, MySQL, PostgreSQL, Oracle, and Microsoft SQL Server.

NoSQL and NON-RELATIONAL DATABASES

1. What is NoSQL?

Relational databases are an absolutely convenient tool for companies to track, store, and analyze data in the simplest ways. However, as the technology has developed to an era with faster internet and widely-used smart devices, there are numerous types of data that cannot fit into “tables” or a relational format. This is why NoSQL was introduced to the world in the 2000s to work with unstructured data such as text or images and also provide some benefits that SQL databases do not have.

NoSQL databases are databases that can store data in formats other than relational tables. NoSQL can also store relationship data, but in a different way, compared to SQL. For example, a retail company wants to store chat logs between their customer service department and customers. These chats can vary in length size and quantity. In addition, for each message, metadata also needs to be stored. As you can see, over time, it could waste a lot of memory storage with a relational database, due to redundant information. Moreover, if you store this data in separate tables or use the third normal format, it could result in long runtimes for SQL queries. There are four main variants of NoSQL databases and each has its own data model: Document Stores, Key-Value Stores, Wide-Column Stores, and Graph Databases.

2. How does NoSQL work?

As stated above, each type of NoSQL database was designed to use in a specific situation. Back to the example from above, we can store the chat logs and their associated metadata with Document Stores of NoSQL. We store each message and its accompanying metadata as a document and then embed each of those documents in order in a chat document. Each document contains key-value pairs, with the actual data being stored as the value. In this way, it makes Document Stores greatly flexible, because each document can be unique. In addition, another perk of Document Stores is that we can create any key at any time. This is absolutely wonderful for working with data where we are not sure about what shape it will take, or even about whether any data will be stored at all.

3. What are the pros and cons of NoSQL?

a. Pros

  • Less need for Extract, Transform, and Load (ETL)
    With NoSQL databases, you can store data “as is” which means that it can handle a range of flat or nested structures. NoSQL can take a binary object to be passed through a set of layers, an XML document, a JSON file. Therefore, you can save time by not having to convert the data format.
  • Flexible data models — Dynamic schema
    Most NoSQL systems do not require up-front commitments to data models. NoSQL allows you to easily make changes to your databases by iterating quickly and continuously integrating new application features
  • Horizontal scalable and highly available
    NoSQL allows scaling-out horizontally which means we are able to add cheaper commodity servers when needed

b. Cons

  • Less flexible
    NoSQL is less flexible in terms of various types in NoSQL systems. Depending on your selected NoSQL database type, you may not be able to achieve all of your use cases in a single database. For instance, graph databases are great for analyzing data relationships but may not provide what you need for everyday data retrieval such as range queries
  • NoSQL is not compatible with SQL
    Some NoSQL systems do not use SQL which means you need to manually query data. Eventually, this process will take more time and make it more complex
  • ACID (Atomicity, Consistency, Isolation, Durability) transactions
    This is the most frequently cited drawback as NoSQL databases do not support ACID transactions across multiple documents. However, MongoDB added support for multi-document ACID transaction in the 4.0 release and extended them in 4.2 to span sharded clusters

4. Examples of NoSQL Databases

Some examples of NoSQL databases include MongoDB, Cassandra, CouchDB, Redis, and Amazon Neptune.

SUMMARY COMPARISON BETWEEN SQL and NoSQL

--

--

Linh Mai
0 Followers

Data Analyst with a professional background in Chemical Research