Monday, March 25, 2013

Why NoSQL matters

Web applications evolve on a daily basis, mainly because the web is so visible: you will be shamed if your website is not at the same level as your competitors'.

For years, the languages and frameworks used to build websites have changed to meet the demands of new agile development methodologies. Servers like Node.js appeared to deal with these changes, jQuery appeared to allow the creation of richer content, and so on. You'll notice that the evolution started with front-end technologies and is moving towards the backend. Now it's time for the final frontier, the database, to adjust to this change.

Evolution of the database model

One thing that has remained constant over the years is the need to map your information to storage applications like MySQL, Oracle Database, Postgres or SQL Server. What they all have in common is a rigid model of tables, rows and columns, and this leads to several problems that you need to address in this ever-changing web environment:

  • Managing the code-level instructions needed to work with a new set of data or new columns at the database level.
  • Addressing deployment problems: if you create a new field in a table, you'll often need to take down your server and go through a painful upgrade process (run scripts to change your tables, execute a script to migrate, or at least update, your data to the new model, and finally deploy the code that changed).

What if these changes mean that your model changed in a more radical way? What if you created a new type of product that requires a whole new model? That's something to think about.

These kinds of problems have been a constant over the years. Some software developers, myself included, started to wonder whether there was a way to store information that is better aligned with current challenges. That's how the NoSQL movement started: as a way to solve this data model problem.

The new data model

To better explain how NoSQL addresses the problem of these ever-changing data models, let's work through an example. Let's assume we're going to store the customers of our newly created website. To do this, we create a table in an RDBMS with columns like name, last name and gender. After some days we put an application together and it's finally running on the web. Then we tweet about giving some benefits to the people who register during the weekend, and after a good weekend we end up with 3M records. We're surprised, because we didn't expect that kind of traffic. Then we check the database and realize that we didn't provide enough space for some long last names, and some of them are truncated. Darn, we'll need to expand the column... But resizing a column on a table that large is not always a trivial operation: depending on the database and on the change, it can lock or rebuild the table, and in the worst case you end up creating a new table and moving the records, which could mean your server is out for an hour. Does this sound familiar? Even worse, you may need to include some new fields in the process, which means changing the HTML and your server-side code, doing some migration, etc.
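
The rigidity described above can be sketched in a few lines. This is only an illustration under some assumptions: it uses SQLite because it ships with Python (it is not the database from the story), the table and column names are made up for the example, and it shows the "new field" part of the problem rather than the truncation, since SQLite does not enforce column lengths.

```python
import sqlite3

# In-memory SQLite database, chosen only because it ships with Python.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, last_name TEXT, gender TEXT)")
conn.execute("INSERT INTO customers VALUES (?, ?, ?)", ("John", "Smith", "Male"))


def insert_with_email(connection):
    """Store a customer using the new version of the code, which knows about email."""
    connection.execute(
        "INSERT INTO customers (name, last_name, gender, email) VALUES (?, ?, ?, ?)",
        ("Peter", "Korn", "Male", "test@mycompany.com"),
    )


# Before the migration the table has no email column, so the insert is rejected.
try:
    insert_with_email(conn)
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# The schema change has to be deployed before the new code can run.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
insert_with_email(conn)

rows = conn.execute("SELECT name, email FROM customers ORDER BY name").fetchall()
```

The new code and the schema change have to go out together, and the rows that existed before the migration end up with an empty email.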

Although JSON does not solve all the problems that come with this kind of change, it's much easier to update the underlying model by modifying a JSON structure than by changing tables, columns, rows and SQL statements. That's how NoSQL helps in the development of web applications: it lets you make this kind of change easily by storing JSON instead of a traditional data model.

A customer could be modeled in JSON like this:


    {
         "name": "John",
         "lastName": "Smith",
         "gender": "Male"
    }

Back to the problem stated before: what if you need to store a longer last name? Because the model is not rigid, your users would never have had the problem in the first place, and their last names would not have been truncated. What if you need to store new information?

    {
         "name": "John",
         "lastName": "Smith",
         "gender": "Male"
    }

    {
         "name": "Peter",
         "lastName": "Korn",
         "gender": "Male",
         "email": "test@mycompany.com"
    }

Done! You don't need to change anything at the database level. If your REST services are generic services that receive JSON and post it to the DB, you won't need to change your backend either. The only change you'll need is at the HTML and JavaScript level, which allows a cleaner deployment of the new version: just copy the new HTML and JavaScript files to the web server folder and that's it, you're ready to go.
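
To make the idea concrete, here is a minimal sketch of a generic backend that stores whatever JSON it receives. The DocumentStore class and its insert and find methods are hypothetical names invented for this example; it is a toy in-memory stand-in, not the API of djondb or of any other NoSQL database.

```python
import json


class DocumentStore:
    """Toy in-memory stand-in for a document database (hypothetical API)."""

    def __init__(self):
        self._collections = {}

    def insert(self, collection, raw_json):
        # Parse the body without checking it against any schema.
        document = json.loads(raw_json)
        self._collections.setdefault(collection, []).append(document)
        return document

    def find(self, collection, **criteria):
        # Return the documents whose fields match every criterion.
        return [
            doc
            for doc in self._collections.get(collection, [])
            if all(doc.get(key) == value for key, value in criteria.items())
        ]


store = DocumentStore()

# The backend just forwards the JSON it receives; it never names a field.
store.insert("customers", '{"name": "John", "lastName": "Smith", "gender": "Male"}')
store.insert(
    "customers",
    '{"name": "Peter", "lastName": "Korn", "gender": "Male", '
    '"email": "test@mycompany.com"}',
)

with_email = store.find("customers", email="test@mycompany.com")
everyone = store.find("customers", gender="Male")
```

Both documents live in the same collection even though only one of them has an email, and nothing in the store had to change to accept the new field.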

Conclusions

Although good requirements, design and testing will always be better than getting into a production environment only to realize that you forgot a piece of data that is really important to your business, NoSQL databases make it easier to upgrade your applications and relieve the pain of creating an application from the ground up. They really empower developers to explore the data structure while creating their masterpiece.

Later I will keep exploring samples of how to use NoSQL to create applications in a faster and easier way. In the meantime, please go to http://djondb.com/blog and check out some of the full examples you will find there.

Friday, March 1, 2013

Balance after 5 months of releasing djondb as Open Source

Five months after releasing a big product as an open source project, I want to write about the progress, the good, the bad, and the future releases of djondb.

Briefing

I've been working on djondb for a year. It started as a proof of concept, a learning project and a challenge, and it rapidly grew into a running version of a server. I started to wonder if there was room for a new NoSQL database, and here's the video of the talk I gave at BogoDev in 2012:

Although it was a living thing, it was far from being a production server. After this talk I realized that there were a lot of people with the same requirements: a NoSQL database that can store JSON, ensures everything goes to disk and not into "lose-able" memory, and is able to handle transactions over multiple documents. Last but not least, it had to run on Linux, OSX and Windows.

Once the MVP was a real application, several tasks still had to be completed:

  • Create a stable release with a minimal full feature set (inserts, updates, deletes, finds with meaningful queries, etc.)
  • Implement drivers for the most common languages
  • Document all the features
  • Create samples
  • Build installers for each platform
  • Drink a couple of beers! (this was started even before the first line of code)

The first version with all these features was released a couple of months later, and djondb has been open source on GitHub since September 2012.

I can still remember my wife's face when I told her I would release the code as open source: "Is he crazy? He is building this great product and he is going to give it away?" But I was completely sure it was the best choice. Here are the reasons for taking this path:

  • Give something to the world. I'm using open source tools, so it seems fair to give something back.
  • An open source project is easier to promote, mainly because open source promotes itself: its advocates push this kind of project to keep it going.
  • Some developers might join the development team, and we would share the glory and the money that could come with this solution.
  • Being open source, the community will help with feedback, samples, document translation, etc.

In November I gave the first "official" talk about djondb at Colombia 3.0, one of the biggest content summits in Latin America. Here's the video:

After this I kept working on drivers, documentation, new features, bug fixes, etc., and released version 0.2, which had several performance improvements.

During the last few months I've been working on new features that are going to be released at the end of March.

Balance

djondb brought a lot of learning, not only about coding databases or high-performance systems, but also about how the open source ecosystem works, what it means to be an entrepreneur (I'm still learning, and I think I will never stop), what traction is, what an investor looks for, etc.

djondb has been a very successful project, with more than 350 downloads during the last three months and a lot of great reviews. Some good projects are already using it, and people really share the same view of NoSQL as a professional tool for developing business applications.

Not everything is nice

Although I've been working every night to improve and release a better product, it's very disappointing to realize that open source is not what I thought it would be. It's not a great community trying to help out, sharing ideas or helping to spread the word; it's more like a bunch of guys (like me) working really hard to share a dream with bare hands and passion. Most of the stories I read every day about lone riders trying to create open source are just like mine: working hard and getting little or no feedback, just a lot of downloads.

These are the lessons I learned from open source. Maybe I'm wrong and it's only my perception, but this is what I think:

  • Most people think open source means free; they don't even consider that giving feedback or testing is a way to share and help.
  • Investors make a weird face when you say "open source", unless you can already show a lot of traction. And I mean huge traction, not the kind you would need if you "owned" the code.
  • Other developers look at open source as a way to get their projects done without paying anything. (There are several ways to give back in open source, like giving feedback, helping with documentation, maybe fixing a bug or two, etc.)

Summary

Don't go with the open source option if you don't have a company that supports you. Open source means that you put all your ideas on the web, and almost anyone can copy you and create their own product without any responsibility. I know, you could argue that you can show you posted the code first, etc. (If you argue that you could patent your code, let me remind you that the procedure costs US$35,000 and takes 5 years to be approved.) The truth is that you cannot protect your idea as open source: you gave away your knowledge and you won't get anything back.