Wednesday, June 5, 2013

Parsing command line arguments in a batch program (Windows)

Every time I had to create a batch file to run compilations, or any other kind of task, I bumped into the same problem: parsing command line arguments. Usually I ended up with something like:


if "%1"=="-x32" (
   Do something for x32
)
if "%2"=="-x32" (
   Do something for x32
)
if "%1"=="-x64" (
   Do something for x64
)
if "%2"=="-x64" (
   Do something for x64
)

This is a really bad way to do it, so I decided to spend some time today and solve the problem once and for all. As a result I finally have a decent way to parse arguments, and I'll document it here in case I need it again, or in case someone else is struggling with the same thing. (I love getopts on Linux! But I guess Windows will keep forcing us into this kind of solution.)

@echo off
setlocal enabledelayedexpansion
if [%1]==[] goto usage
call :parseArguments %*
if "%x32%" == "true" (
  echo Well done you set x32 to true
)
if "%x64%" == "true" (
  echo Well done you set x64 to true
)
if NOT "%d%" == "" (
  echo you set the output dir to: %d%
)
GOTO Exit


@rem ================================================================================
@rem Functions
@rem ================================================================================
:usage
echo Usage: %0 [-x32] [-x64] [-d output-dir]
goto Exit

:getArg
set valname=%~1
echo arg: !%valname%!
goto :eof

:parseArguments
rem ----------------------------------------------------------------------------------
:loop
IF "%~1"=="" GOTO cont
set argname=%~1
rem strip the leading "-" from the argument name
set argname=%argname:~1%
set value=%~2
@rem a missing next value means this is a flag like -x32
if "%value%" == "" (
  set !argname!=true
  SHIFT & GOTO loop
)
@rem if the next value starts with - then it's a new parameter
if "%value:~0,1%" == "-" (
  set !argname!=true
  SHIFT & GOTO loop
)
@rem otherwise it's a name/value pair like -d c:\temp
set !argname!=%~2
@rem skips the first and second parameter
SHIFT & SHIFT & GOTO loop

:cont
goto :eof

rem ----------------------------------------------------------------------------------
:Exit

The magic occurs in the "parseArguments" function. (Oh yes, batch files have functions; I didn't know that until now. This page was really helpful: http://www.dostips.com/DtTutoFunctions.php.)

This function contains two things that I learned today. The first one is SHIFT: this command shifts the arguments, putting the second argument first, the third second, and so on. This is really useful when you don't know the number of arguments; for example, the following loop prints all of them:


:loop
IF "%~1"=="" GOTO cont
echo %~1
SHIFT & GOTO loop
:cont

The second thing I learned is how to create dynamic variable names, thanks to this guy: http://batcheero.blogspot.com/2007/07/dynamic-variable-name.html. This is useful to create the variables that you're going to read later in your batch program. It's possible because I used setlocal enabledelayedexpansion at the very beginning of the batch program; otherwise the !thing! syntax won't work.
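
For example, here's a stripped-down sketch of the trick (the name varname and its value are just an illustration, not part of the script above):

 @echo off
 setlocal enabledelayedexpansion
 set varname=myvar
 rem !varname! expands to "myvar", so this creates a variable named myvar
 set !varname!=hello
 rem %varname% expands first to "myvar", then !myvar! yields "hello"
 echo !%varname%!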

So the parseArguments function just iterates over the arguments, detecting whether they were written as -x32 or as -d c:\temp (in the first case %x32% will be set to true, and in the latter %d% will contain c:\temp).

Here are the results:


> test.bat -d "c:\test dir\blah"
you set the output dir to: c:\test dir\blah
> test.bat -d "c:\test dir\blah" -x32 -x64
Well done you set x32 to true
Well done you set x64 to true
you set the output dir to: c:\test dir\blah

That's it, enjoy!

Tuesday, April 16, 2013

Writing tests that work

Disclaimer

Although some of the ideas exposed here are obvious to some developers, I wanted to write down some of my thoughts about how "unit" tests should be implemented in some projects. This is a discussion I had with a fellow co-worker after reading this article: Unit Testing Myths and Practices, and I think it's worth the time to post it here. Maybe some of you will agree with me, and some could call me dumb; either way, I would love to read your thoughts and discuss them in the comments section.

My opinion about "Unit" Tests

Several definitions of unit tests are floating around the Internet and in books, but here's a simple demo of what I understand by unit tests, and of the way I've seen unit tests implemented in several projects.

int myMethod(int arg1, int arg2) {
   return arg1 + arg2;
}
Unit test:
void testMyMethod() {

    // Testing positive numbers
    int a = 1;
    int b = 2;
    int expected = 3;
    int result = myMethod(a, b);
    ASSERT(expected, result);

    // Testing negative numbers
    a = -1;
    b = 2;
    expected = 1;
    result = myMethod(a, b);
    ASSERT(expected, result);

    // testing zero arguments
    ...

    // testing big integer values
    ...
}

As you may have noticed, I needed 10+ lines of code to test one line of the business method. Now, what happens to our test cases when we want to test the following method?

int myMethod2(Customer customer, int arg2) {
   int age = customer.age();
   return myMethod(age, arg2);
}

Now I would need to write tons of lines to check how my code reacts to every single combination of customer age and value to be added. For example: test cases for a NULL customer, customers without an age, summing a negative age, etc., etc... Is this what I wanted in the first place? Will this ensure that, if a reckless developer changes myMethod to sum Integers instead of ints, null errors are avoided in myMethod2? Or do we need to rewrite all the tests that depend on the first method? Do I even care about nulls being sent to myMethod?

Dependency injection will save the day!

Maybe some of you thought, "hey, this guy doesn't know s**t about coding and lacks expertise; this is easily managed by dependency injection and method contracts". Maybe you are right, but let's check how our methods would change to use dependency injection to separate both concerns (bear in mind that the tests will now inject a dummy class that returns the proper values for each test).


[di]
IMySumClass sumClass;

int myMethod2(Customer customer, int arg2) {
   int age = customer.age();
   return sumClass.myMethod(age, arg2);
}
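
To make the "dummy class" idea concrete, a test for myMethod2 could now stub the collaborator. This is only a sketch; IMySumClass comes from the snippet above, and the stub class and the manual injection are my assumptions:

// Test stub that fulfills the IMySumClass contract without the real sum code
class StubSumClass implements IMySumClass {
    public int myMethod(int arg1, int arg2) {
        return arg1 + arg2;
    }
}

void testMyMethod2() {
    sumClass = new StubSumClass(); // replaces the [di] injected instance
    Customer c = new Customer("x", "y", 32);
    int result = myMethod2(c, 1);
    ASSERT(33, result);
}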

Done! Now we have a new problem: both tests will pass the test phase, and the error will pop up in our production system. Yes, I know... the method's contract... someone would say "the method declares that it receives X and Y, and if X and Y change then it's a new method, because it's a new contract". OK, here we go:

int myMethod(int arg1, int arg2) {
   return arg1 + arg2;
}

Integer myMethodWithIntegers(Integer arg1, Integer arg2) {
   return arg1 + arg2;
}

Solved! Now we have two methods that execute the same logic, so we will need to maintain both of them. (And don't forget that our test cases need to be maintained as well, so we now have 30+ lines to keep updated.) Too much work to ensure that a simple sum works, don't you think?

Integrity tests

Ok, let's go back to the simple code we have in the first place:

int myMethod(int arg1, int arg2) {
   return arg1 + arg2;
}

int myMethod2(Customer customer, int arg2) {
   int age = customer.age();
   return myMethod(age, arg2);
}
Unit test:
void testMyMethod() {

    // Testing positive numbers
    int a = 1;
    int b = 2;
    int expected = 3;
    int result = myMethod(a, b);
    ASSERT(expected, result);

    // Testing negative numbers
    a = -1;
    b = 2;
    expected = 1;
    result = myMethod(a, b);
    ASSERT(expected, result);

    // testing zero arguments
    ...

    // testing big integer values
    ...
}

What I found useful is to keep the tests as simple as possible and targeted at the class we want to check. So instead of double-checking all the different options of an "a + b" operation in both testMyMethod2 and testMyMethod, I would add only the tests that are useful for each case: I will remove the negative checks, zero arguments, etc. that clutter my code, and I won't add tests to testMyMethod2 that are not related to myMethod2 itself; in other words, I will not repeat tests that are already covered somewhere else. Let's see the sample:

void testMyMethod() {

    // Testing positive numbers
    int a = 1;
    int b = 2;
    int expected = 3;
    int result = myMethod(a, b);
    ASSERT(expected, result);
}

void testMyMethod2() {
    // Customer with age
    Customer c = new Customer("x", "y", 32);
    int expected = 33;
    int result = myMethod2(c, 1);
    ASSERT(expected, result);

    // Test a customer too old (should be rejected because of his age)
    c = new Customer("x", "y", 32);
    try {
       myMethod2(c, 150);
       FAIL("Customer age was not properly checked");
    } catch (AnyCheckedException) {
        // Accepted case
    }
}

Notice how I removed the "negative", zero and big-number tests; I really don't care about those cases at this level, and they will be exercised by the "customer" cases anyway. These kinds of tests are more useful: they protect my code from unexpected "real" failures, reduce the amount of work needed to write down all the test cases, and cover the business cases instead of argument misuse.

Of course, this is a very simple case, and it won't cover all the "what if" questions, but it will cover what is supported by the system. The last sample test not only checks cases related to myMethod2, but myMethod as well; this means that myMethod will be checked by every single call made from any of the classes that use it, and it's going to be tested in a "business" way, not with a "what if" applied to every single argument.

What usually happens in production environments is that errors that were not originally tested pop up, and we need to add them to the proper test code to ensure they don't rise again. If the error was at the "customer" level, we add the proper lines to testMyMethod2 (or create a new test); if the problem was our sum method not covering big integers, then we cover it in the customer test as well. Any change to testMyMethod2 will also test myMethod in a business-like scenario.

Conclusion

Every project is different and each one needs a solution that fits it, but I've been using this approach heavily, and this method of "accumulative" testing has worked really well to check and correct bugs in production systems.

I would love to read your thoughts, please post a comment. The "you're a dumba**" comments are welcome, but please try to support your ideas.

Wednesday, April 10, 2013

Switching from autotools to something else

Although I'm really happy with the results of autotools on Linux and OSX, every time I want to generate the installers on Windows I need to do something extra. At the beginning I just created the sln and vcproj files for each of the modules in djondb; later I had to write a script to make the compilation process simpler; then I had some problems with libraries, third party libraries, x64, etc. and had to change some settings in the project files. So far everything worked as expected (not smoothly, but it worked). Lately, though, I've been hitting problems that only crash on Windows environments, and here's where the nightmare begins: I have all the tests written with cpptest (which is a really good option, and I love writing tests in it), but now that I need to execute the same tests on Windows, my approach has started to fall apart.

I've been reading about some other build tools that are really cross-platform, and I will give CMake a try. I tried SCons, but it looks really slow; I don't know why, but compiling with it takes twice as long as the same build with the Linux/make counterpart.

I'll do some tests with CMake, and if everything works I will switch djondb to this build system. I'll post my results here later.
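
For reference, this is roughly the kind of minimal CMakeLists.txt I have in mind (a sketch with placeholder target names and source files, not djondb's actual build file):

cmake_minimum_required(VERSION 2.8)
project(djondb)

# one library per module; the sources here are placeholders
add_library(djondb-core src/core.cpp)

# the test runner builds and links the same way on every platform
add_executable(unittests tests/main.cpp)
target_link_libraries(unittests djondb-core)

enable_testing()
add_test(NAME unittests COMMAND unittests)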

Any thoughts? Please let me know what you think about CMake, SCons and make tools.

Monday, March 25, 2013

Why NoSQL matters

Web applications have been evolving on a daily basis, mainly because the web is so visible that it's a shame if your website is not at the same level as your competitors'.

For years the languages and frameworks used to build websites have changed to meet the demands of new agile development methodologies: servers like nodejs appeared to deal with change, jquery appeared to allow the creation of richer content, etc. You'll notice the evolution started in front-end technologies and is moving toward the backend; now it's time for the final frontier to adjust to this change.

Evolution of the database model

One of the things that has remained constant over the years is the need to map your information to storage engines like MySQL, Oracle Database, Postgres or SQL Server. One thing they have in common is that all of them are based on a rigid model, the model of tables, rows and columns, and this leads to several problems that you need to address in this always-changing web environment.

  • Managing the instructions at the code level to work with a new set of data or new columns at the database level.
  • Addressing deployment problems: if you create a new field in a table you'll need to take down your server and go through a painful upgrade process (run scripts to change your tables, execute a script to migrate, or at least update, your new model, and finally update the code that changed).

What if these changes mean that your model changed in a more radical way? What if you created a new type of product that requires a whole new model? That's something to think about.

These kinds of problems have been a constant over the years. Eventually some software developers, myself included, started to wonder whether there might be a better way to store information, one more aligned with the current challenges; that's how the NoSQL movement started, as a way to solve this new data model paradigm.

The new data model

To give a better explanation of how NoSQL addresses the problem of these always-changing data models, let's work from an example. Let's assume we're going to store the customers of our newly created website. To do this job we decided to create a table in an RDBMS, and we included columns like name, last name and genre. After some days we put an application together and finally it's running on the web; then we tweet about giving some benefits to the people who register during the weekend, and after a good weekend we end up with 3M records. You're surprised because you didn't expect that kind of traffic; then you check the db and realize that you didn't provide enough space for some long last names and some of them are truncated. Darn, you'll need to expand the column... but depending on your tooling, changing the size of a column in SQL Server means creating a new table and moving the records, and that means your server will be down for an hour. Does this sound familiar? Even worse, you may need to include some new fields in the process, meaning changing the HTML, your server side, doing some migration, etc.
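
The usual workaround looks something like this (sketched in T-SQL; the table and column names are made up for the example):

-- create a wider copy of the table
CREATE TABLE customer_new (
    name      VARCHAR(50),
    last_name VARCHAR(200), -- the new, wider column
    genre     VARCHAR(10)
);

-- move the 3M records, drop the original, and swap the names
INSERT INTO customer_new (name, last_name, genre)
    SELECT name, last_name, genre FROM customer;
DROP TABLE customer;
EXEC sp_rename 'customer_new', 'customer';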

Although JSON doesn't solve every problem that comes with this kind of change, it's far easier to update the underlying model by modifying a JSON structure than by changing tables, columns, rows and SQL statements. That's how NoSQL helps in the development of web applications: it allows you to make this kind of change easily by storing JSON instead of the traditional data model.

A Customer could be modeled in JSON like this:


    {
         name: "John",
         lastName: "Smith",
         genre: "Male"
    }
Same problem as stated before: what if you need to store a longer last name? Because the model is not rigid, your users never had the problem in the first place, and their last names were not truncated. What if you need to store new information?

    {
         name: "John",
         lastName: "Smith",
         genre: "Male"
    }

    {
         name: "Peter",
         lastName: "Korn",
         genre: "Male", 
         email: "test@mycompany.com"
    }

Done! You don't need to change anything at the database level. If your REST services are written as generic services that receive JSON and post it to the DB, you won't need to change your backend either; the only change you'll need is at the HTML and javascript level, which allows you to do a cleaner deployment of the new version. Just drop the new HTML and javascript in the web server folder and that's it, you're ready to go.
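
A pass-through service like that can be tiny. Here's a sketch in nodejs: the http handling is the standard module, but db.insert is a stand-in for whatever your driver's insert call actually looks like:

var http = require('http');

http.createServer(function (req, res) {
    if (req.method === 'POST' && req.url === '/customers') {
        var body = '';
        req.on('data', function (chunk) { body += chunk; });
        req.on('end', function () {
            // the document is stored exactly as posted, so new fields
            // never require a backend change
            db.insert('customers', JSON.parse(body)); // hypothetical driver call
            res.end('ok');
        });
    }
}).listen(8080);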

Conclusions

Although good requirements/design/testing will always beat getting into a production environment only to realize that you forgot a piece of data that is really important to your business, NoSQL databases allow you to upgrade your applications more easily and relieve the pain of creating an application from the ground up; they really empower developers to explore the data structure while they create their masterpiece.

Later I will keep exploring samples of how to use NoSQL to create applications in a faster/easier way; in the meantime please go to http://djondb.com/blog and check out some of the full examples you will find there.

Friday, March 1, 2013

Balance after 5 months of releasing djondb as Open Source

After 5 months of releasing a big product as an open source project, I want to write about the progress, the good, the bad, and share the future releases of djondb.

Briefing

I've been working on djondb for a year. It started as a proof of concept, a learning project and a challenge; rapidly it grew into a running version of a server, and I started to wonder if there was room for a new NoSQL database. Here's the video of the talk I gave at BogoDev in 2012:

Although it was a living thing, it was far from being a production server. After this talk I realized that a lot of people had the same requirements: a NoSQL database that can store JSON, ensures everything goes to disk and not into "lose-able" memory, and is able to handle transactions over multiple documents. Last, but not least, it had to run on Linux, OSX and Windows.

Several tasks needed to be completed before the MVP was a real application:

  • Create a stable release with the full minimal feature set (insert, update, delete, find with meaningful queries, etc.)
  • Implement drivers for most common languages.
  • Document all the features
  • Create samples
  • Installers for each platform
  • Drink a couple of beers! (this one started even before the first line of code)

The first version with all the features was released a couple of months later, and djondb has been open source on github since September 2012.

I still remember the face of my wife when I told her that I would publish the code as open source: "Is he crazy? He's building this great product and he's going to give it away?" But I was completely sure it was the best choice. Here's the list of reasons for going down this path:

  • Give something to the world. I use open source tools, so it seems fair to give something back.
  • An open source project is easier to promote, mainly because open source promotes itself; the advocates of open source push this kind of development to keep it going.
  • Some developers might join the development team, and we would share the glory, and the money that could come with this solution.
  • Being open source, the community will help with feedback, samples, documentation translation, etc.

In November I gave the first "official" talk about djondb at Colombia 3.0, one of the biggest content summits in Latin America. Here's the video:

After this I've been working on drivers, documentation, new features, bug fixing, etc., and released version 0.2, which had several performance improvements.

During the latest months I've been working on new features that are going to be released at the end of March.

Balance

djondb brought a lot of learning, not only about coding databases or high performance systems, but about how the open source ecosystem works, what it means to be an entrepreneur (I'm still learning, and I think I will never stop), what traction is, what an investor looks for, etc.

djondb has been a very successful project: more than 350 downloads during the last three months and a lot of great reviews. Some good projects are already using it, and people really share the same view of NoSQL as a professional tool for developing business applications.

Not everything is nice

Although I've been working every night to improve and release a better product, it's very disappointing to realize that open source is not what I thought it would be. It's not a great community trying to help out, sharing ideas or helping spread the word; it's more like a bunch of guys (like me) working really hard to share a dream with bare hands and passion. Most of the stories I read every day about lone riders who try to create open source are just like mine: working hard and getting little or no feedback, just a lot of downloads.

Lessons I learned from open source; maybe I'm wrong and it's only my perception, but this is what I think:

  • Most people think open source means free; they don't even consider that feedback or testing is a way to share and help.
  • Investors make a weird face when you say "open source", unless you can already show that there's a lot of traction. And I mean huge traction, not the kind you would need if you "owned" the code.
  • Other developers look at open source as a way to get their projects done without paying anything. (There are several ways to give back to open source, like feedback, helping with documentation, maybe fixing a bug or two, etc.)

Summary

Don't go the open source route if you don't have a company that supports you. Open source means you put all your ideas on the web, and almost anyone can copy you and create their own product without any responsibility. I know... you could argue that you can show you posted the code first, etc., etc. (If you argue that you could patent your code, let me remind you that the procedure costs US$35,000 and takes 5 years to be approved.) The truth is that you cannot protect your idea as open source; you give away your knowledge and you won't get anything back.

Friday, January 25, 2013

Preparing for the NoSQL workshop with djondb


This post explains the installation and preparation steps required for the workshop that will be given at Bogodev on January 31, 2013.

Required tools


The example we'll build uses the following components:
  • Ubuntu 11 or higher.
  • djondb: NoSQL database engine (http://djondb.com)
  • KnockoutJS: framework for the Model-View-View Model (MVVM) pattern (http://knockoutjs.com/)
  • nodejs: multipurpose server capable of running javascript server-side (http://nodejs.org)
  • Any text editor that makes editing HTML and javascript easier (vim, Netbeans, emacs, etc.)

Installation

The first step is to install the database: go to the djondb site, download the Linux version that matches your architecture, and install it:

sudo dpkg -i djondb_Linux_i386.deb

You will probably get errors indicating that some dependencies are missing (antlr, etc.). To install the missing dependencies run:

sudo apt-get -f install

Next, create the folder where the database files will be stored, and give the current user ownership of it:

sudo mkdir /var/djondb
sudo chown `id -u` /var/djondb/

Now we can verify that the server was installed correctly by running:

djondbd -n

The -n flag prevents the server from running in the background, which makes it easy to bring down with Ctrl+C.

Once that's installed, install nodejs as follows:

Go to http://nodejs.org and click install; this downloads a file, node-v0.8.18.tar.gz. Then follow these steps:

tar xvfz node-v0.8.18.tar.gz
cd node-v0.8.18
./configure
make
sudo make install

Then install some tools required to install nodejs plugins:
sudo apt-get install g++ make
sudo npm install -g node-gyp

The first command installs the tools needed to run the server and install plugins. The second installs node-gyp, which compiles extensions that use this installation procedure.

With that done, we can create the folder where we'll work during the workshop and install the djondb driver for nodejs:

mkdir proyecto_nosql
cd proyecto_nosql
npm install djondb

Those are all the steps required to install the tools needed for the workshop. If you run into any problem during the installation, leave a comment and I'll gladly help you fix whatever step is causing trouble.

See you at Bogodev on January 31.

Update, January 27:
Since at one point in the workshop we'll call the server via REST, it's worth having Chrome installed with the "REST Console" extension, or firefox with a similar extension.