COMPARING SQL and NoSQL

6 min readNov 17, 2020

Structured Query Language or SQL is an essential language that a data scientist should be familiar with, right after Python. SQL has been around since the 1970s and in the late 2000s, another database was developed to deal with areas that a SQL database cannot cover. It is called Not-Only-SQL or NoSQL. So, in which case should we use SQL and when should we use NoSQL? This post will define and distinguish these two widely-used databases for querying data.

SQL and RELATIONAL DATABASES

1. What is SQL?

Relational databases accessed by Structured Query Language or SQL which is a domain-specific language for querying and manipulating data. Databases are a collection of two-dimensional tables that have fixed columns and rows. Each table is very similar to a CSV file or a spreadsheet in Microsoft Excel.

After the development of the relational model from Edgar F. Codd, SQL was developed by IBM’s researchers Donald D. Chamberlin and Raymond F. Boyce in the early 1970s, starting with System R. These researchers developed SQL with the purpose of a query language that was as easy to read as English, and that would also encompass database administration and manipulation.

SQL databases are table-based which makes it a suitable option for applications that require multi-row transactions such as an accounting system

2. How does SQL work?

Relational data just simply means that pieces of data are related to each other. For instance, from the database structure example above, data on the ‘customers’ table has some relationships to data on the ‘orders’ table, indicating that a customer may be associated with a specific order. You may also notice that there is an asterisk (*) next to a column name in each table. This indicates that the column is the primary key for that particular table. A primary key is a unique identifier for a column and it can only be unique values for that column entry. On the other hand, the primary key of this table can also appear on another table. In this case, this key is called a foreign key which means the primary key from a different table. This is the main idea of how data on different tables are associated with each other in a relational database. For example, if you were told a specific ‘customerNumber’, and then given a list of order data that included the ‘customerNumber’, you could determine which orders were placed by that customer by matching up the primary and foreign keys.

An example of a simple query in SQL:

SELECT col1, col2, col3
FROM table
WHERE records_match_criteria
LIMIT 100;

3. What are the pros and cons of SQL?

a. Pros

Fast queries
SQL can be used to retrieve a large number of records from a database quickly and effectively
Easy to understand
This is one of the reasons why researchers invented SQL in the first place. There is no need for a large number of code lines for retrieving data. Moreover, the keywords are straightforward and easy to understand such as SELECT, GROUP BY, UPDATE, and so on
Data portability
SQL supports a variety of devices such as PCs, laptops, tablets, servers. You can run SQL on a local system with intranet and internet and the SQL databases can be easily moved from a device to another

b. Cons

Vertical scaling
Most SQL databases require to scale-up vertically when you exceed the required capacity of your current server which often migrating to a larger and more expensive server
Complex interfaces
Even though the coding in SQL is straightforward and not complicated, its process of interfacing is complex which becomes difficult for certain users to access it
Partial control
Programmers do not have full control over the database with SQL because of the hidden business rules

4. Examples of SQL Databases

Some examples of SQL databases include SQLite, MySQL, PostgreSQL, Oracle, and Microsoft SQL Server.

NoSQL and NON-RELATIONAL DATABASES

1. What is NoSQL?

Relational databases are an absolutely convenient tool for companies to track, store, and analyze data in the simplest ways. However, as the technology has developed to an era with faster internet and widely-used smart devices, there are numerous types of data that cannot fit into “tables” or a relational format. This is why NoSQL was introduced to the world in the 2000s to work with unstructured data such as text or images and also provide some benefits that SQL databases do not have.

NoSQL databases are databases that can store data in formats other than relational tables. NoSQL can also store relationship data, but in a different way, compared to SQL. For example, a retail company wants to store chat logs between their customer service department and customers. These chats can vary in length size and quantity. In addition, for each message, metadata also needs to be stored. As you can see, over time, it could waste a lot of memory storage with a relational database, due to redundant information. Moreover, if you store this data in separate tables or use the third normal format, it could result in long runtimes for SQL queries. There are four main variants of NoSQL databases and each has its own data model: Document Stores, Key-Value Stores, Wide-Column Stores, and Graph Databases.

2. How does NoSQL work?

As stated above, each type of NoSQL database was designed to use in a specific situation. Back to the example from above, we can store the chat logs and their associated metadata with Document Stores of NoSQL. We store each message and its accompanying metadata as a document and then embed each of those documents in order in a chat document. Each document contains key-value pairs, with the actual data being stored as the value. In this way, it makes Document Stores greatly flexible, because each document can be unique. In addition, another perk of Document Stores is that we can create any key at any time. This is absolutely wonderful for working with data where we are not sure about what shape it will take, or even about whether any data will be stored at all.

3. What are the pros and cons of NoSQL?