As usually happens with everybody, you do not learn until you make mistakes and learn from them. That has happened to me too. After all the effort I put into resolving these issues, I thought I should write the mistakes down and share them in this blog post so that others can avoid these silly-looking but frequently made mistakes.
Commit Log and SSTables on same disk
Commit logs are written sequentially to disk, while reading SSTables is a random-read operation. If your Commit Log and SSTables are on the same spinning disk, write performance will suffer: the random reads on SSTables cause disk seeks, and those seeks prevent Cassandra from appending to the Commit Log sequentially. This does not apply to SSDs, which have no seek penalty. So you have two options:
- Create separate mounts for Commit Log and SSTable to store them on different disks.
- Use SSDs
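If you go with the first option, it is easy to believe the directories are separated when they still share a mount. Here is a minimal sketch of a check, assuming the two paths are whatever you configured as `commitlog_directory` and `data_file_directories` in cassandra.yaml. The default paths in `main` are only a common package-install layout and may not match your setup, and the class name `DiskLayoutCheck` is made up for this example. Note that two different mounts can still be partitions of the same physical disk, which this check cannot detect.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DiskLayoutCheck {

    // Returns true when both paths resolve to the same file store (mount / volume).
    // Both paths must exist, otherwise Files.getFileStore fails.
    static boolean sameFileStore(Path a, Path b) {
        try {
            FileStore storeA = Files.getFileStore(a);
            FileStore storeB = Files.getFileStore(b);
            return storeA.equals(storeB);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed default locations; pass your own directories as arguments.
        Path commitLog = Paths.get(args.length > 0 ? args[0] : "/var/lib/cassandra/commitlog");
        Path data = Paths.get(args.length > 1 ? args[1] : "/var/lib/cassandra/data");

        if (sameFileStore(commitLog, data)) {
            System.out.println("Commit Log and SSTables share a file store: " + commitLog + ", " + data);
        } else {
            System.out.println("Commit Log and SSTables are on different file stores.");
        }
    }
}
```

Run it on each node with the two directories as arguments, for example `java DiskLayoutCheck /path/to/commitlog /path/to/data`.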
Forgot to increase File Handle / Descriptor limit
Cassandra needs to work with lots of SSTables, sockets and other file handles, but the default file descriptor limit on Linux is typically 1024 per process. Once your code goes into production, you will find that lots of SSTables are being created, and when this limit is reached your nodes start going down. So before going into production, it is better to increase the file descriptor limit.
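To see how close a JVM is to its limit, you can read the descriptor counts from the platform `OperatingSystem` MXBean. The sketch below is only an illustration: it reports the numbers for the JVM that runs it, not for Cassandra. A Cassandra node exposes the same attributes (`OpenFileDescriptorCount`, `MaxFileDescriptorCount`) over JMX. The class name `FdLimitCheck` is made up for this example, and raising the limit itself is done at the OS level (for example with `ulimit -n` or `/etc/security/limits.conf`), not from Java.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

public class FdLimitCheck {

    // Returns {openDescriptors, maxDescriptors} for the current JVM,
    // or null when the platform does not expose Unix descriptor counts.
    static long[] fdUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] usage = fdUsage();
        if (usage == null) {
            System.out.println("File descriptor counts are not available on this platform.");
            return;
        }
        System.out.println("Open file descriptors: " + usage[0]);
        System.out.println("Max file descriptors:  " + usage[1]);
        if (usage[1] <= 1024) {
            System.out.println("The limit looks like the Linux default; consider raising it.");
        }
    }
}
```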
Used RoundRobin load balancing policy instead of TokenAware
TokenAwarePolicy is a wrapper policy that makes a best effort to route each query to a replica for the given key in the local data center; when it cannot do so, it falls back to its child policy to choose hosts. Always use TokenAwarePolicy wrapped over DCAwareRoundRobinPolicy.
Cluster cluster = Cluster.builder()
    .addContactPoints(node1, node2, node3)
    .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
    .build();
Why should you always use TokenAwarePolicy wrapped over DCAwareRoundRobinPolicy and not RoundRobinPolicy?
By using TokenAwarePolicy, you avoid the extra network hop that occurs when the client is not aware of which token ranges each node in the cluster owns. When the client connects to a node that does not hold the token range for a write, that node has to act as coordinator and forward the write to a replica node. It’s much more efficient for the client to connect to a replica node from the get-go.
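The extra hop can be illustrated with a toy model. The sketch below is not driver code and makes several simplifying assumptions: a three-node ring with hand-picked tokens (33, 66, 99), a replication factor of 1, and a stand-in hash based on `String.hashCode()` instead of Cassandra's real partitioner. All names in it (`TokenRoutingSketch`, `owner`, `forwardedRoundRobin`, and so on) are made up for this example. It counts how many writes the coordinator has to forward when it is chosen round-robin versus when the client picks the owning node directly.

```java
import java.util.Map;
import java.util.TreeMap;

public class TokenRoutingSketch {
    static final String[] NODES = {"node1", "node2", "node3"};

    // Toy ring: each entry maps the END of a token range to the node owning it.
    static final TreeMap<Integer, String> RING = new TreeMap<>();
    static {
        RING.put(33, "node1"); // owns tokens 0..33
        RING.put(66, "node2"); // owns tokens 34..66
        RING.put(99, "node3"); // owns tokens 67..99
    }

    // Stand-in for the partitioner: maps a key to a token in 0..99.
    static int tokenFor(String key) {
        return Math.floorMod(key.hashCode(), 100);
    }

    // The owner is the first node whose token is >= the key's token (wrapping around).
    static String owner(int token) {
        Map.Entry<Integer, String> entry = RING.ceilingEntry(token);
        return (entry != null ? entry : RING.firstEntry()).getValue();
    }

    // Round-robin: the coordinator is picked without looking at the token,
    // so a write is forwarded whenever the coordinator is not the owner.
    static int forwardedRoundRobin(int requests) {
        int forwarded = 0;
        for (int i = 0; i < requests; i++) {
            int token = i % 100;
            String coordinator = NODES[i % NODES.length];
            if (!coordinator.equals(owner(token))) {
                forwarded++;
            }
        }
        return forwarded;
    }

    // Token-aware: the client picks the owner as coordinator, so nothing is forwarded.
    static int forwardedTokenAware(int requests) {
        int forwarded = 0;
        for (int i = 0; i < requests; i++) {
            int token = i % 100;
            String coordinator = owner(token);
            if (!coordinator.equals(owner(token))) {
                forwarded++;
            }
        }
        return forwarded;
    }

    public static void main(String[] args) {
        System.out.println("Token for \"user:42\": " + tokenFor("user:42"));
        System.out.println("Forwarded writes, round-robin: " + forwardedRoundRobin(100));
        System.out.println("Forwarded writes, token-aware: " + forwardedTokenAware(100));
    }
}
```

In a real cluster the replication factor is usually greater than 1, so a round-robin coordinator is a replica more often than in this model, but the token-aware client still saves the hop whenever round-robin would have picked a non-replica.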