Technology Blog

I say No to SQL

"I Love programming, I hate databases & stop trying to teach me geography"

I love programming but I hate databases. Actually, to be more accurate, I love doing the things that make sense and hate trying to do the things I just don't understand - taxes, gardening, databases, etc.

Why hate? Well, maybe it was just because databases didn't come naturally and I guess that I'm lazy at heart. At school I wasn't the sort to study, especially academic stuff; English, English literature, Geography, Mathematics, etc. I was the one that rushed his homework on Sunday night and just crammed the hour before the exams in the hope of a passing grade. Little of my time at school was more than a job where a passing grade was the product. I needed the time to spend on the thing that I really needed to understand, the thing that made total sense; electronics. Electronics and computers were not on the curriculum.

The end-of year disco was however the one part of school life where passion and school intersected in electronic light shows and heavy bass... yes, I was the end-of-term DJ!

It was only after school that I found that qualifications could be had and, at little expense! I was able to win the prize of "Computer Science Degree" by simply attending the first class where they tell you what you'll learn and then the last class where you hand in your finished project – in my case, it was an enhanced help system for the BBC Micro that I'd created to help speed up my coding, I just turned it in! I was also able to take a half dozen exams in the 90's to gain a slew of Microsoft certifications in networking, communications, server management and more using a similar technique to become a Microsoft Certified Professional but, after failing the grueling three hour Microsoft Certified Database Developer exams twice I just consigned databases to the unknown.

For those that aren't into databases, they speak their own textual language called SQL or Structured Query Language, one that's at odds with modern programming practices and the way we store data in the programs. It's a bit like converting a recipe from US cups to metric weights - they're just not the same thing. Add to that the way that we get and update data, computer programs store information in lists and objects with properties, to transfer that to the database you have to convert it into something akin to a spoken sentence "Please would you update all of the orders where the data is this and make it that." substituting this and that with your numbers. It might sound polite but it's convoluted and very error prone.

I thought I was on my own but through 2008-2010 I discovered that I was part of a movement that felt the same, except that the members of the movement contained a bunch of people that really understood how databases work but, at the same time, hated them enough to find a better way, which they did... Enter NoSQL.

 

Tech deep dive:

I'm proud to say that my early coding consisted of a separate abstraction for the database, a set of classes responsible for translating from objects and lists of objects to the SQL statements needed to access the database - there was a lot of string concatenation though!

I started on Microsoft SQL as it was the engine that I was most familiar with however, as the DB grew, my tenuous software license was starting to bother me – oh come on, we've all been there – and so I looked to MySQL as a more affordable solution. The reason I don't say 'free' is that there's more work than it seems on the surface to make the change. Sure, there's some syntax changes between Microsoft and 'My' SQL like 'limit' in select clauses but, also there are also choices to be made around the storage engines MYISAM or InnoDB for example and, given that the data is so precious, data restoration rehearsals take time as well. However, at this time, I made a mistake that ultimately enabled the biggest and best ever change to FoodDays codebase, I chose MyISAM table structure in error.

MyISAM tables are really fast, something that appealed to me as FoodDays was growing and the DB engine was starting to struggle but, unlike the Microsoft SQL engine, MySQL does not do secret background magic which, in this case means indexes. Microsoft SQL will dynamically create indexes to speed up frequent queries and in their absence, MySQL was unusually slow in real terms. This forced the first important mistake driven improvement - I learnt about indexes.

As I'd chosen the wrong table format I'd forsaken transactional integrity - SQL with InnoDB supports transactions and foreign key constraints, MyISAM doesn't and, to work around this I'd bumbled through building transactions and foreign key constraints into my code - truthfully, leaning on the DB to enforce these things is a little sloppy anyway, right? No? - Actually, I'd de-normalized the database in such a way that both foreign key constraints didn't matter and were, in some ways restrictive which, had opened up a more flexible approach to storing orders - yes, the order contained the Childs information when the order was placed, not with the current information so, no weird orders in the wrong homeroom.

And then... 2012 and the arrival of NoSQL - MongoDB to be precise. Finally, a database engine that stored objects instead of behaving like a giant spreadsheet! It was a breath of fresh air, but wait, NoSQL is only really suitable for cases where the database doesn't rely on foreign key constraints and transactions and who has that? Answer: Any idiot that picked the wrong table engine in MySQL - huzzah, I win!

The transition to using a NoSQL database engine was fairly lengthy but oh so satisfying, lines and lines of loops and string concatenation replaced by a single call "Here's an order Mr. DB, save that."

MongoDB was in place for at least five years and served us well however, the constant march of progress resulted in one more change - DynamoDB.

DynamoDB is Amazons answer to NoSQL data storage but, it's more than that, it's a server-less peta-byte scale solution with zero maintenance forged in the heat of Prime Day. Throughout the late 2010's I've been looking to move all aspects of infrastructure off of systems that I maintain towards systems that  are maintained for us and, when that goal, scalability and reduced cost intersect, well, it's worth the effort to move.

In DynamoDB, I've worked with tables of over 9 billion 'rows' yet still been able to find and retrieve groups of records in single mS requests. All it takes is some careful consideration of indexes and and understanding of eventual consistency. It's the most raw DB I've used but is my fav.

To date, and this is probably a mistake to print, the uptime for DynamoDB has been 100.000000% It has never even been slow and that's the kind of infrastructure that frees you to work on the stuff that you can make better.