Saturday 8 November 2014

The Necessity of Unit Testing

Enough is enough. I'm tired of arguing with people about the necessity of unit testing. Next time someone wants to argue with me, I'll point them here. If you are one of those people, brace yourself.

Let's get some things straight first. People are not even qualified to state their argument if they
  • Have not maintained over 40% useful and valid test coverage on software that has gone live and been maintained for months, if not years. These people do not really know what unit testing is. They have never even done it themselves.
  • Have not overseen a team to ensure proper test coverage across the whole code base. It's an uphill battle that some people have never faced. Slipping is easy while rising is hard.
  • Do not know how to calculate test coverage for the language they use. Yes to point 1. Yes to point 2. 40% coverage by eyeballing the code? Hilarious.
  • Do not know the basics of a "good" unit test: AAA, A-TRIP. They have never experienced the benefit of true, proper unit tests, and they can't even identify the pile of spaghetti tests that is making their life hell on earth.
I can quite safely say that the above alone rules out 95% of the people I argue with. Yes, I'm being overly generous. I'm a generous person. They argue with me until they're blue in the face when they have never done it themselves and can't actually do it themselves. Yet they talk and act like they know all about it.
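For the record, this is the AAA (Arrange-Act-Assert) shape I mean. A minimal Python sketch; the ShoppingCart class is invented for illustration:

```python
import unittest


class ShoppingCart:
    """Hypothetical class under test."""

    def __init__(self):
        self.items = []

    def add(self, name, price):
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)


class ShoppingCartTest(unittest.TestCase):
    def test_total_sums_item_prices(self):
        # Arrange: build the object under test and its inputs
        cart = ShoppingCart()
        cart.add("book", 10.0)
        cart.add("pen", 2.5)
        # Act: exercise the single behaviour being verified
        total = cart.total()
        # Assert: check the outcome
        self.assertEqual(total, 12.5)
```

One behaviour, one test, three clearly separated steps. A-TRIP (Automatic, Thorough, Repeatable, Independent, Professional) is about the properties of the suite as a whole; tests shaped like this tend to get there.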

With that out of the way, we can now have a proper, intelligent discussion.
The code is simple and straightforward and doesn't need tests
And we have a "code simplicity o' meter" that gives the code a score between 0 and 100, with 25 being the "complex code threshold" above which tests are required? Simple code will have simple tests. If it's so straightforward that it doesn't need a test, writing one won't take more than a few minutes. If it does take more than a few minutes, then either:
  a) the code is not as straightforward and simple as it was perceived, or
  b) the programmer doesn't know how to write unit tests, in which case it's good practice anyway.
(Yes, I'm aware of CCM — cyclomatic complexity metrics — and I use it myself. The people who argue this "simple code" thing with me usually don't know about it though, so shhhhh.......)
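To make the point concrete, here is the whole exercise for a trivially "simple" function (the function is hypothetical; the test genuinely takes a minute to write):

```python
def full_name(first, last):
    """Trivial, 'obviously correct' code -- the kind people skip tests for."""
    return f"{first} {last}".strip()


def test_full_name():
    assert full_name("Ada", "Lovelace") == "Ada Lovelace"
    # Even trivial code has an edge case worth pinning down.
    assert full_name("Ada", "") == "Ada"
```

If writing that took longer than a few minutes, refer back to points a) and b) above.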
The code is too complex and writing tests takes too much time
Really? If you don't need tests for complicated code, I don't know what kind of code you would classify as needing tests. Simple code? Not-so-simple-but-not-so-complicated code? Shall we bring in the "code simplicity o' meter" to sort out your sea of grey?
Catching subtle mistakes in complex code is one of the main points of unit testing.



Unit tests cost extra time / money
I'd like to say I have never had a problem burning through the same number of points in a sprint as people who don't write tests, but I can accept that tests carry a cost. At this point it becomes a more nuanced debate.
  • Why can't the business afford the time / money to write these tests? Writing the actual code is only a small part of the development cost. If the margin is so thin that an up-front increase of roughly 20% (an arbitrary number) on half the project cost can't be paid for something with long and far-reaching benefits, maybe there isn't a business model to begin with.
  • Or does it really cost more? It is a fact that the earlier an issue is caught, the less it costs to fix. Unit tests are the second-quickest feedback mechanism, behind only the compiler. If a regression is stopped at the unit tests, it costs far less than one caught in a staging/UAT environment, where resetting the environment can cost a lot of time and sometimes money.
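To illustrate the feedback-speed point with a sketch (the parse_price helper and its bug are invented for illustration): a regression pinned at the unit-test layer is caught in seconds on every build, instead of surfacing in UAT.

```python
def parse_price(text):
    """Parse a price string like '$1,234.50' into a float.

    An earlier version forgot to strip the thousands separator --
    the kind of bug that is expensive to find in staging.
    """
    return float(text.lstrip("$").replace(",", ""))


def test_parse_price_handles_thousands_separator():
    # Pins the old regression so it can never silently come back.
    assert parse_price("$1,234.50") == 1234.50
    assert parse_price("$2.00") == 2.00
```

Every future edit to parse_price now gets feedback in milliseconds, not at the end of a UAT cycle.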
Then, from the hot-shot developers:
I don't make mistakes and I haven't made one in X years.
But others do, and they will come and change your code. Deal.
And lastly,
The code base is still changing. Time spent writing tests could be a waste.
Do or do not; there is no try. If the code you are writing is never going to be run or used, why are you writing it at all? If it is, then even a simple crash in a client proof-of-concept application can potentially lose you the deal. Any production code needs tests. Period. As for that so-called prototype / throw-away code: how cute. When was the last time that actually happened, throwing away code?


ElasticSearch on AWS with AutoScaling Groups and Spot Instances

One of the most powerful features of ElasticSearch is its ability to scale horizontally in many different ways: routing, sharding, and time / pattern based index creation and querying. It is a robust storage solution that can start out small and cheap and then grow as load and volume rise.

Having single-handedly implemented the ElasticSearch cluster where I work, I'll go over a few points of interest. (The design is pretty straightforward for people who have worked with ElasticSearch. If you want to know how to set it up, RTFM.)


As marked on the diagram:
  1. First of all, use a single security group across all your data nodes and master nodes for discovery purposes. This "shields" your nodes from other nodes and from malicious attempts to join the cluster, and is part of your security. Open port 9300 to and from this security group itself.
  2. A split of "core" data nodes and "spot" data nodes. Basically, you have a small set of "core" data nodes that guarantee the safety of the data; a set of spot instances is then added to the cluster to boost performance.
    • Set rack_id and cluster.routing.allocation.awareness.attributes!!!
      I don't like stating the obvious, but THIS IS CRITICAL. Set the "core" nodes to use one rack_id and the spot instances to use another. This forces replication to keep at least one complete set of data on the "core" nodes. Also, install the kopf plugin and MAKE SURE THAT IS THE CASE. Spot instances are just that: spot instances. They can all disappear in the blink of an eye.
    • Your shard and replica counts directly limit the maximum number of machines you can "Auto Scale" to: a shard and its replicas never share a node, so an index with 5 shards and 1 replica can spread across at most 10 data nodes.
    • Should you hit the horizontal scaling limit imposed by the shard and replica counts, you can update the instance size in the launch configuration and terminate the instances one by one to scale the servers "vertically".
    • This is an incredibly economical setup. Taking r3.2xlarge instances as an example: even a 3-year heavy-reserved instance costs $191 / month, while a spot instance costs around $60 / month. It is the ability to leverage spot instances that makes all managed ElasticSearch hosting look like daylight robbery.
      For $4k / month, you can easily scale all the way up to 2TB+ of memory, 266+ cores and 10TB of SSD at 30k IOPS (assuming 50% of the $4k monthly fee is spent on spot instances and 25% on SSD). For comparison, you get 60GB of RAM and 780GB of storage from Bonsai.
  3. Set up your master nodes in a separate, dedicated Auto Scaling Group with at least 2 servers in it. This prevents the cluster from falling apart should any of the master nodes in the Auto Scaling Group get recycled.
    Another reason these dedicated master nodes are necessary is the volatility of the core data nodes and spot instances (they sit in Auto Scaling Groups) and the frequency with which shard and cluster state is reshuffled as a result.
    They don't need to be particularly beefy.
  4. Front the master nodes with an Elastic Load Balancer
    • Open port 9200 for HTTP. User queries will then be evenly distributed across the master nodes. If there is a high volume of query traffic, you can use a separate ASG for a set of non-data nodes dedicated to queries. Optionally set up SSL here.
    • Open port 9300 for TCP. Logstash instances can then join the cluster using this endpoint as the "discovery point". Otherwise you will not have a stable address to set in the Logstash ElasticSearch output when using the node protocol.
  5. Configure an S3 bucket to store snapshots in and restore from, using the master nodes as the gateway.
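The settings from points 1 to 3 boil down to a few lines of elasticsearch.yml. A sketch for a "core" data node (the security group name is a placeholder, and the EC2 discovery settings come from the cloud-aws plugin; verify the exact keys against the ElasticSearch version you run):

```yaml
# "Core" data node -- all core nodes share the same rack_id,
# spot nodes use a different one (e.g. node.rack_id: spot)
node.master: false
node.data: true
node.rack_id: core
cluster.routing.allocation.awareness.attributes: rack_id

# EC2 unicast discovery, restricted to the shared security group
# (requires the cloud-aws plugin; "elasticsearch-cluster" is hypothetical)
discovery.type: ec2
discovery.ec2.groups: elasticsearch-cluster
```

Master nodes get the inverse flags (node.master: true, node.data: false) and no rack_id concerns, since they hold no shards.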
The cluster I'm currently running has 2x t2.small masters, 2x r3.large "core" nodes and 4x r3.large spot instances. It holds 200GB of index and 260 million records, and growing, without breaking a sweat. It has a long, long way to go before it hits its scaling limit of 12x r3.8xlarge, which should be good for upwards of many terabytes of index and billions of rows. Fingers crossed.
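For point 5, registering the S3 snapshot repository is a single call against the load balancer. A sketch (the endpoint, bucket and region are placeholders, and it assumes the cloud-aws plugin is installed on the masters):

```shell
# Register an S3-backed snapshot repository (names are hypothetical)
curl -XPUT "http://my-es-elb:9200/_snapshot/s3_backup" -d '{
  "type": "s3",
  "settings": {
    "bucket": "my-es-snapshots",
    "region": "us-east-1"
  }
}'

# Then snapshots are one call away
curl -XPUT "http://my-es-elb:9200/_snapshot/s3_backup/snapshot_1"
```

Since the ELB fronts the master nodes, the snapshot request lands on a master, which coordinates the upload to S3.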