Database vs Spreadsheet: What's the difference?

Chris Keller & Joanne Lazzaro

Consultant Alliance

http://www.consultantalliance.com/aboutus

Database vs Spreadsheet: What’s the Difference?

There is a common misconception that data tables (i.e. databases) and Excel spreadsheets are completely interchangeable. That is simply not true! What sets the two apart and why are databases are inherently better for storing and interacting with many kinds of information?

In this article, we will focus on the key distinction between database and spreadsheet, provide definitions for each of these data structures, and discuss the most common features. This overview will be relevant not only for current Excel/Google spreadsheet users, but also for anyone trying to create relational databases in tools such as FileMaker.

Overview

  • What is a Spreadsheet?
  • What is a Database?
  • How are Databases and Spreadsheets Similar?
  • 3 Key Differences Between Database and Spreadsheet
  • Pros and Cons of Databases vs Spreadsheets
  • Database vs Spreadsheet: The Bottomline

What is a Spreadsheet?

It is an electronic ledger - i.e. an electronic version of paper accounting worksheets. It was created to facilitate people who needed to present (and more importantly) recalculate their accounting or project information in tabular form on a computer. Spreadsheets contain rows and columns (i.e. tables) so this is one reason that people think spreadsheets and databases are interchangeable. Spreadsheets were among the most valuable programs developed for computers, and allowing multiple scenarios and changes to be calculated and recalculated quickly and easily. However, when handling large amounts of repetitive data (as you might have found on 3 x 5 cards in the past) they are restrictive due to the limitations of the X by Y structure of the spreadsheet.

What is a Database?

A database is a collection of data organized and managed via an unlimited number of views, with a set of rules and data relationships. However, compared to a spreadsheet, a database is much more complex. Most commonly databases are visualized as "rows and columns" giving rise to the belief that they are identical to spreadsheets. When we compare the data visualization functionalities of database software such as FileMaker, to the spreadsheets you find in Excel, any similarity ends quickly. While spreadsheets are usually limited to manageable numbers of visible cells (and practical file size limits), databases can vary greatly in size - from a small club membership list, to all the Zip codes in the United States, to all the products available at Amazon.

How are Databases and Spreadsheets Similar?  

Before we outline the key differences between a database and a spreadsheet, let's compare their similarities:

1. Both databases and spreadsheets can contain a large amount of tabular data and can use existing data to make calculations. When using Excel these calculations are performed directly on the cells themselves using formulas. Database software uses programming language, which allows more complex logic, decision making and workflows - none of which exist in a spreadsheet.

2. In a business setting, both spreadsheets and databases can be used by multiple people, so the users will work with the same data in both cases. Beyond these two common functionalities, however, the data storage features (and usability) differ enormously in every aspect of their practical use.

3 Key Differences Between Database and Spreadsheet

1.     How Data is Formatted in a Database vs Spreadsheet

Consider a typical spreadsheet -every cell is treated as a unique entity. It can store any type of information – a date, a time, a number, a string. We have different types of values in various cells, and we can also apply a specific format to these cells. The cells however, contain only raw data. Each cell is a container of a single data value - the smallest piece of information being handled - and you must pre-set the type of data contained in every cell. Databases, on the other hand, hold the smallest group of indivisible data in a database record. A record is a group of related fields - and all of those fields have relations with other fields within their record. Furthermore, each field can have specific data requirements, which prevents inadvertent mistakes – for example, in a field containing date values, should the user try to add incorrect text into a date field, the database rules will show an error, and the user will be prompted to correct the data immediately at the point of entry. In Excel (in the absence of custom programing language) you can force text data into column requiring date values, and Excel will store the string value and happily calculate something incorrectly, further along the calculation path.

2.     How Data is Stored in a Database vs Spreadsheet

In a spreadsheet, data can be stored in a cell, while in a database, data is stored in a record of a table. Our main goal is to save the information, and the key value is the information being stored; visual formatting of those individual fields is not required. Once we’ve got our data properly stored in a database structure allows us to create stunning visualizations such as charts, maps, bars, graphs, and much more. Of course, spreadsheet compilers can also be used to visualize data, but the results are a lot less sophisticated.

3.     How Calculations are Performed in a Database vs Spreadsheet

Another substantial difference is that, in a spreadsheet, different cells can contain calculations, such as functions and formulas. This means, that if you want to see, for example, a sum total - the result will be stored in another cell. In a database, on the other hand, all calculations and operations are based on the existing data and are done after its retrieval. There is a specific feature, called “views”, similar to the tables, in which you can do a calculation. These objects also contain columns that can be normal columns like the ones in the tables or could contain a certain type of calculation. In a database, there is no way you can confuse a record of data, with a calculation result.

Pros and Cons of Databases vs Spreadsheets

1.     Data Integrity

The database features mentioned so far improve data integrity – you can’t store different types of data in the same field, and it is unlikely someone will mistake a data value for an outcome of a calculation, especially in large data sets. Data integrity is a strong advantage when working with databases, as opposed to spreadsheets.

2.     Faster Manipulation of Data

Naturally, you might think a spreadsheet can contain multiple worksheets, so one can create tables in the worksheets, and then use the worksheets to create relations between the tables. Why bother using relational databases? In a spreadsheet, such relations will be logically limited. In a database, instead of setting up spreadsheets or worksheets, one can set up relations between the tables, and this will boost the performance of operations, increasing the speed with which you could manipulate your dataset.

3.     Unlimited Volume of Data

Albeit powerful for many circumstances, spreadsheets have their limitations - Excel is incapable of handling over 1 million rows of data. Databases are the solution - where having 2, 5, or 10 million records is not a problem. What might be more challenging is managing such vast swaths of data and thankfully, powerful tools such as FileMaker (and beyond) can handle these needs.

4.     Better Multi-User Experience

Referring to the multi-user functionality, spreadsheets are very limited. Often, every person must update their own spreadsheet with new data (or deal with the fragility of a shared file on a drive, where one person's error can ruin the entire file). You would justifiably think Google Docs and the latest versions of Office solve this issue, but they do so only partially. In Google Docs, you might have trouble finding out who incorrectly changed or deleted a piece information, which creates an frustrating business environment where people have a hard time organizing their tasks.  

5.     Data Consistency

A database is a stable structure, with workflow rules controlling access permissions and user restrictions. One person can make a change that is visible to everybody instantly. This feature increases efficiency and data consistency when using databases.

6.     Eliminating Duplicate Information

Considering data integrity and data consistency, using databases eliminates duplicate information, which is another way to save space and increase efficiency. Consider a “Customers” table. You know a certain first and last name corresponds to a unique email address. So, if you know Sean Northway has changed their email, and you are using a spreadsheet that use Customer Email data in multiple instances (contacts, shipping, customerservice, etc) you may change the email address once and accidentally miss updating the same address in another record. These types of mistakes can be avoided by using a relational database - an accredited user only needs to access the “Customers” table and change Sean Northway’s email address there - just once! Not only will this operation save time, but it will also prevent inconsistencies.

Database vs Spreadsheet: The Bottomline

Everything we’ve discussed so far highlights why databases are a better environment for storing and keeping track of information when working with multiple dimensions and large amounts of data. While this database vs spreadsheet comparison may be heavily weighted towards databases, spreadsheets do have their uses - they are an excellent tool that allows us to carry out extensive mathematical analysis. But for the easy retrieval and updating of data, input efficiency, data consistency, data integrity, speed, and security, relational databases are definitely the structure to choose for most business applications. Databases can store enormous quantities and varieties of raw data, from text to images, and are excellent at separating the data from the way it is displayed for analysis.

Q&A

1.     What is the Difference Between Databases and Spreadsheets?

Spreadsheets are electronic ledgers, i.e. electronic versions of accounting workbooks. Databases are collections of information organized and managed using structured query language (SQL). Aside from the difference in definitions, they are also used in completely different ways. Formatting: Spreadsheets format cells to fit certain values. You can apply formatting after you’ve put in the data and display it as a date, integer, percentage etc. With databases, you have to preset the values before input and if you attempt to plug information from a different kind into a field that’s been preset you will get an error.

Data storage: In terms of the way information is stored, the two also differ significantly – spreadsheets store everything in cells, while databases rely on records of tables.

Calculations: in a spreadsheet, calculations involving values from two different cells are performed & displayed in a third cell. In a database, all calculations and operations are done within a table.

2.     Is a Database Better Than a Spreadsheet?

Database and spreadsheets each have strengths and weaknesses. Spreadsheets are effective analytic tools; databases are best at facilitating retrieval of information, maintaining consistency, data integrity, speed, and security. Firstly, the chances of making a mistake when working with databases are lower because you can’t store different types of information within the same field, and rules can be added to check consistency within and between data fields. Databases are also much faster with calculations and updates, as they are working on an individual record or possibly a group of records at one time. A spreadsheet is recalculating every cell each time information is changed. Another great feature of databases is that they aren’t limited in the amount of data they store. Moreover, they provide a superior multi-user experience - changes made by one user in a database are instantly visible to all other users. Finally, because of this greater level of data consistency, eliminating duplicate information (e.g. two customer records with different mailing addresses, etc.) is  a lot easier in databases compared to spreadsheets.

3.     Is Excel a Database or a Spreadsheet?

Excel is not a database — it is spreadsheet software. Even though many users try to force it to function like a database, its limitations will limit your ability to use your data effectively. Excel is limited to 1 million rows of data, and using a spreadsheet for handling a large dataset can be cumbersome and slow. Current databases are limited only by your available storage space. When using Excel, you can apply formatting after you’ve input the data, and you can also input data that doesn’t comply with the preset values of a field. This creates potentials for errors, which the stricter organization of a database helps prevent. Maintaining data integrity and consistency is also problematic in Excel when multiple users need to access the file. A database also allows multiple users to make changes on the fly, with all the results instantly available to all users. Moreover, with relational databases you can easily create relationships between additional tables creating a long term fixed relationship, while working across worksheets in Excel can be cumbersome and prone to error and damage.

Whether you have an existing set of data in an Excel spreadsheet, have started a FileMaker database on your own, or just have a pile of scraps of paper with information on them, we can develop a solution that will help you manage information better.

If you would like to explore ways that a database can help your organization, contact Joanne. We will set a time for us to discuss your potential database need, show you some examples and discuss how a database can streamline your workflow and benefit those whose work depends on getting accurate information about the business.


Additional reading...

https://americanlibrariesmagazine.org/2016/01/04/cataloging-evolves/