Monday, December 31, 2018

Thursday, December 6, 2018

How MongoDB's Journaling Works



To provide durability in the event of a failure, MongoDB uses write-ahead logging to on-disk journal files.
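The general idea of write-ahead logging can be sketched in a few lines. This is only a toy illustration and not how MongoDB's storage engine is actually implemented: the TinyStore class, the JSON-lines journal format, and the file name are all made up for the example, and a half-written final journal line is not handled.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that journals every write before applying it."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.data = {}
        self._replay()  # recover writes that were journaled before a crash
        self.journal = open(journal_path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild the in-memory state by re-applying every journaled write.
        if not os.path.exists(self.journal_path):
            return
        with open(self.journal_path, encoding="utf-8") as fh:
            for line in fh:
                entry = json.loads(line)
                self.data[entry["key"]] = entry["value"]

    def put(self, key, value):
        # 1. Append the intended change to the journal and force it to disk.
        self.journal.write(json.dumps({"key": key, "value": value}) + "\n")
        self.journal.flush()
        os.fsync(self.journal.fileno())
        # 2. Only then apply the change to the "real" data.
        self.data[key] = value

    def close(self):
        self.journal.close()


journal_file = os.path.join(tempfile.mkdtemp(), "journal.log")
store = TinyStore(journal_file)
store.put("status", "ok")
store.close()  # pretend the process died right after this write

recovered = TinyStore(journal_file)  # replaying the journal restores the write
print(recovered.data)
recovered.close()
```

Because the change reaches the journal before it is applied, a restart can always rebuild the state from the journal.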

This is explained nicely in the article below. Credit should go to the original author.

Friday, August 31, 2018

Cassandra data retrieving issue - Due to huge tombstones

Recently, one of the development teams complained that their Cassandra cluster wasn't returning data from one of its tables, and soon after the issue was reported the cluster went down.

This incident happened over the weekend, and the on-call Cassandra DBA was informed and asked to check the cluster. Initially, the Dev team had complained about a performance issue with the Cassandra cluster; in other words, the cluster couldn't handle the current load. (Later they identified that a sudden, huge data load had come directly from one of the subsystems.)
As a first step, the DBA decided to bring the cluster back up by starting the Cassandra service, but he observed an "unable to handle the load" exception in the Cassandra log. (This is not the actual error text.)


As usual, the heap size wasn't enough on the existing servers, so new nodes were added with a 12 GB heap out of a total of 16 GB of physical RAM.

After adding the new nodes, the DBA ran the repair command and the problem was solved temporarily.

Later in the day, the Dev team reported that the issue still existed and asked for DBA support again. This time we observed many WARN messages about tombstones, and it was a huge list. (These warnings had been appearing from the beginning, but nobody noticed them because the team was focused on getting the cluster back into a running state somehow.)

Given the high number of tombstones, we suspected that there might be some unusual behavior at the application level, and we checked with the Dev team how the application used the table shown in the warnings.

Finally, the Dev team confirmed that they use this table as a temporary location for particular data and delete that data every 15 minutes via a cron job.

Since the default grace period for tombstones is 10 days, deleted data is not completely removed from the Cassandra cluster until the grace period is over. Because the delete operation ran repeatedly, the tombstones kept growing, and no repair jobs were running to bring the replicas back in sync.

As a result, select queries on this table could not return data: each read had to scan a huge list of tombstones to find the live data, and the heap wasn't large enough to hold all of those tombstones.

As a solution, we changed "gc_grace_seconds" to 1200 (20 minutes), upgraded the SSTables, and then compacted the table manually on each node.

Connect to the instance and change the gc_grace_seconds value on the gradebeforeitem table:
use xxxxxx_prd;
alter table gradebeforeitem with gc_grace_seconds = 1200;


Each command was executed on each node individually (one node after another) to reduce the impact.
nohup nodetool upgradesstables -a xxxxxx_prd   gradebeforeitem &
nohup nodetool compact xxxxxx_prd gradebeforeitem &

Once we received confirmation from the Dev team that the issue was resolved, we reverted "gc_grace_seconds" to the default (864000) and set up a repair job on each node to run weekly. We scheduled the jobs 24 hours apart to avoid clashes between the repair jobs.
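A rough back-of-the-envelope check shows why the tombstones piled up. The sketch below is only an illustration: the function name is made up, and it assumes that every 15-minute cron run leaves behind one batch of tombstones that stays on disk for the full gc_grace_seconds.

```python
def delete_batches_retained(gc_grace_seconds, delete_interval_seconds):
    """Complete delete runs whose tombstones are still inside the grace period."""
    return gc_grace_seconds // delete_interval_seconds


CRON_INTERVAL = 15 * 60  # the delete job runs every 15 minutes

print(delete_batches_retained(864000, CRON_INTERVAL))  # default grace period: 960
print(delete_batches_retained(1200, CRON_INTERVAL))    # temporary 20-minute setting: 1
```

With the default setting, a read may have to skip the tombstones of roughly 960 delete runs; with the temporary 20-minute setting, only the most recent run is still protected from compaction.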

Sunday, May 20, 2018

Upcoming Features in MongoDB 4.0

As per the MongoDB documentation, MongoDB 4.0 will introduce "Multi-Document Transactions" for replica sets.

You can find an example here:
https://docs.mongodb.com/manual/upcoming/

Wednesday, May 9, 2018

MongoDB Cloud Manager vs. Ops Manager

  • Enterprise Advanced comes with both Ops Manager and Cloud Manager so it is up to you which one you would like to use 
  • Cloud Manager is managed software-as-a-service where you only have to install relevant agents (monitoring, automation, or backup) in your infrastructure. Ops Manager is on-premise software that is part of an Enterprise subscription: you have to install and manage all services on your own infrastructure.
  • They are similar in terms of main feature areas (monitoring, automation, backup) and user interface but not identical. Since Cloud Manager is a managed service it is updated more regularly than Ops Manager releases and currently has some additional features such as integration with cloud provisioning instances via providers like AWS or Azure.
  • Please note that the agent versions also differ between Cloud and Ops Manager, so you should ensure you are running the correct agents and referring to the relevant documentation.

MongoDB Enterprise Licence Pricing

Recently, I had a call with one of the sales executives and found out the pricing information below for MongoDB Enterprise nodes.

Please note that this cost is only for the MongoDB Enterprise binaries; once you purchase them, you get many features along with the binaries.

The costs apply only to data nodes. As an example, you don't need to pay for config servers or mongos servers in a MongoDB sharded cluster.

As of April 2018:


  • Production node is $12,990 per annum
  • UAT/QA node is $6,495 per annum (50% of the production cost)
  • Development node is free


MongoDB Enterprise comes with a trial that lasts only 30 days. No features are cut down or disabled after the trial period, but you are not allowed to use Enterprise in a Production environment for more than 30 days without purchasing it.

Cost of a 5-node production replica set for one year:

5 x $12,990 = $64,950
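The same estimate can be repeated for other cluster shapes with a small helper. This is just a sketch built on the prices quoted above; the function and constant names are made up, and only data nodes should be counted.

```python
PRODUCTION_NODE_USD = 12_990             # per annum, as quoted above
UAT_NODE_USD = PRODUCTION_NODE_USD // 2  # 50% of the production cost
DEVELOPMENT_NODE_USD = 0                 # development nodes are free


def annual_licence_cost(production=0, uat=0, development=0):
    """Yearly licence estimate; count data nodes only (no config servers or mongos)."""
    return (production * PRODUCTION_NODE_USD
            + uat * UAT_NODE_USD
            + development * DEVELOPMENT_NODE_USD)


print(annual_licence_cost(production=5))          # 64950
print(annual_licence_cost(production=5, uat=3))   # 84435
```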

Since MongoDB doesn't share this pricing information on its website, the pricing may change from time to time. But you can at least make an estimate before contacting them.




Tuesday, May 8, 2018

About MongoDB lock file (mongod.lock)

The mongod.lock file is a simple file that holds the process ID (PID) of the running mongod, and it performs some of the most important tasks for MongoDB.

Importance of the mongod.lock file

1. To detect an unexpected/unclean shutdown

2. To ensure the dbpath is accessed by only ONE mongod process


If you have seen an unexpected shutdown or a hung MongoDB server, you have to repair your MongoDB data before the server starts again.

Here are the basic steps:

1. Look at the MongoDB log file and check the last lines of the log. You will see some clue about why mongod hung.
    tail -n 20 /var/log/mongodb/mongodb.log
    You may see the error message below:
    "Detected unclean shutdown - mongod.lock is not empty."

2. Create a backup copy of the data files in the --dbpath.
    mongodump --dbpath /data/mongodb/ -o /var
    (The --dbpath option of mongodump was removed in MongoDB 3.0; on newer versions, copy the data directory at the file-system level instead.)

3. Start mongod with the --repair option
    mongod --dbpath /data/db --repair

After the repair, the dbpath should contain the repaired data files and an empty mongod.lock file. 

How life works, referring to Cells, Cell division, DNA, Genes, Chromosomes, and proteins.

This has nothing to do with databases, but it is something worth knowing about our life:

This article explores how life works in terms of basic biological factors. I’m going to discuss cells, cell division, DNA, genes, chromosomes, and proteins. We start with the smallest structural and functional unit of an organism, the cell. The human body, for example, contains trillions of cells. Cells provide structure for the body, take in nutrients from food, convert those nutrients into energy, and carry out specialized functions. Physically, a cell always has a boundary membrane that encloses biomolecules such as nucleic acids, proteins, and polysaccharides. Based on the structure of their cells, biologists divide organisms into two groups:
        Prokaryotes: prokaryotic cells are surrounded by a membrane and a cell wall, and their genes are carried on a circular strand of DNA; they do not have a nucleus.
        Eukaryotes: in eukaryotic cells the DNA is contained within a nuclear envelope and separated from the cytoplasm. These cells also boast their own personal “power plants”, called mitochondria. These tiny organelles not only produce chemical energy but also hold the key to understanding the evolution of the eukaryotic cell.

Cell division is the process by which a parent cell divides into two or more daughter cells; it occurs as part of the larger cell cycle.

There are two distinct types of cell division in eukaryotes:

        Vegetative division: each daughter cell is generally identical to the parent cell (mitosis)

        Reductive cell division: the number of chromosomes in the daughter cells is reduced by half, to produce haploid gametes (meiosis)

By definition, meiosis is “A type of cellular reproduction in which the number of chromosomes is reduced by half through the separation of homologous chromosomes, producing two haploid cells.” and mitosis is “A process of asexual reproduction in which the cell divides in two producing a replica, with an equal number of chromosomes in each resulting diploid cell.” Prokaryotes are much simpler in their organization than eukaryotes: eukaryotes have many more organelles and also more chromosomes. The usual method of prokaryote cell division is termed binary fission. The prokaryotic chromosome is a single DNA molecule that first replicates, then attaches each copy to a different part of the cell membrane. When the cell begins to pull apart, the replicate and original chromosomes are separated. Following cell splitting (cytokinesis), there are then two cells of identical genetic composition (except for the rare chance of a spontaneous mutation).

DNA (Deoxyribonucleic acid)

DNA contains the biological instructions that make each species unique. DNA, along with the instructions it contains, is passed from adult organisms to their offspring during reproduction. In other words, it acts like a blueprint for building the different parts of the cell. Most DNA is found inside a special area of the cell called the nucleus. Apart from the DNA located in the nucleus, humans and other complex organisms also have a small amount of DNA in a different structure called the mitochondria. Nucleotides are the chemical building blocks of DNA; each contains three parts: a phosphate group, a sugar group, and one of four types of nitrogen bases. The four types of nitrogen bases found in nucleotides are adenine (A), thymine (T), guanine (G) and cytosine (C). The order, or sequence, of these bases determines what biological instructions are contained in a strand of DNA. For example, the sequence ATCGTT might instruct for blue eyes, while ATCGCT might instruct for brown. Nucleotides are arranged in two long strands that form a spiral called a double helix. The structure of the double helix resembles a twisted ladder. An important property of DNA is that it can replicate itself. Each strand of DNA in the double helix can serve as a pattern for duplicating the sequence of bases. This is critical when cells divide because each new cell needs to have an exact copy of the DNA present in the old cell.


Genes

In terms of biology, “A gene is the basic physical and functional unit of heredity. Genes, which are made up of DNA, act as instructions to make molecules called proteins.” In humans, genes vary in size from a few hundred DNA bases to more than 2 million bases. In general, genes carry the information that determines traits, which are features that are passed on from parents. Genes are found on tiny spaghetti-like structures called chromosomes. DNA also contains large sequences that do not code for any protein and whose function is not well understood. The coding region of a gene encodes instructions that allow a cell to produce a specific protein or enzyme. Humans are estimated to have roughly 20,000 to 25,000 genes, each made up of hundreds, thousands, or more chemical bases.

Chromosomes

Chromosomes are where DNA is located. Because the cell is very small and organisms have many DNA molecules per cell, each DNA molecule must be tightly packaged, and this packaged form of DNA is called a chromosome. Chromosomes come in matching sets of two (or pairs), and there are hundreds or thousands of genes in just one chromosome. Chromosomes in humans can be divided into two types: autosomes and sex chromosomes. Certain genetic traits are linked to a person's sex and are passed on through the sex chromosomes. The autosomes contain the rest of the genetic hereditary information. All act in the same way during cell division. Human cells have 23 pairs of chromosomes (22 pairs of autosomes and one pair of sex chromosomes), giving a total of 46 per cell. Half of these chromosomes come from one parent and half come from the other parent. In addition to these, human cells have many hundreds of copies of the mitochondrial genome. Sequencing of the human genome has provided a great deal of information about each of the chromosomes.


Proteins

Proteins are large, complex molecules that play critical roles in the body. They are made of hundreds or thousands of smaller units called amino acids, which are attached to one another in long chains. In order to make a protein, the gene's DNA sequence is copied, base by base, into messenger RNA (ribonucleic acid), or mRNA. The mRNA moves out of the nucleus, and cell organelles in the cytoplasm called ribosomes use it to assemble amino acids into a polypeptide chain that finally folds and configures to form the protein.



All of the above play complex roles in making life work, along with many other things.

Thursday, May 3, 2018

Best Practices for Running Cassandra on AWS



This post explains how to run Cassandra on AWS effectively, in other words the best approaches to running Cassandra on AWS. As you know, AWS is one of the largest cloud environments available in the world. I recently gave a presentation on this topic, and I would like to outline that presentation here for your reference.


What is Cassandra?
Apache Cassandra is a massively scalable open source NoSQL database.
It delivers:
  • Continuous availability,
  • Linear scalability,
  • Operational simplicity across many commodity servers with no single point of failure,
  • A masterless peer-to-peer distributed system where data is distributed among all nodes in the cluster


What is Cassandra Node?

  • A physical server, EC2 instance
  • Each machine has one installation of Cassandra.
  • A node in a cluster is just a fully functional machine that is connected to the other nodes in the cluster through a high-speed internal network.
  • All nodes work together to make sure that even if one of them fails due to an unexpected error, the cluster as a whole can still provide service.
  • All nodes in a Cassandra cluster are the same.
  • AWS provides an expert level of the platform for the Cassandra cluster.

How does Cassandra support High Availability?
  • Cassandra is designed to be fault-tolerant and highly available during multiple node failures.
  • Amazon Regions and availability zones can be used for deployment 
  • Resiliency is ensured through infrastructure automation.
  • Quick replacement of failing nodes
  • In case of regionwide failure, if we deploy with the multi_region option, traffic can be directed to the other active Region

Deploying Cassandra on AWS
  • Deploying Cassandra on Amazon EC2 can be automated.
  • AWS CloudFormation allows you to describe and provision all your infrastructure resources in AWS. There is no additional charge for it; you pay only for the AWS resources it creates.
  • Cassandra common design patterns on AWS
    • Single AWS Region, 3 Availability Zones
    • Active-Active, Multi-Region
    • Active-Standby, Multi-Region
Single Region, 3 Availability Zones
  • Deploy the Cassandra cluster in one AWS Region and three Availability Zones.
  • There is only one ring in the cluster. 
  • By using EC2 instances in three zones, you ensure that the replicas are distributed uniformly in all zones.
Active-Active, Multi-Region

  • Two rings in two Regions
  • The VPCs in the two Regions are peered
  • The two Regions are identical in nature, having the same number of nodes, instance types, and storage configuration.
  • This pattern is most suitable when the applications using the Cassandra cluster are deployed in more than one Region.
  • Read/write traffic can be localized to the closest Region for the user for lower latency and higher performance.
Active-Standby, Multi-Region
  • Two rings in two Regions
  • The VPCs in the two Regions are peered
  • The two Regions are identical in nature, having the same number of nodes, instance types, and storage configuration.
  • The second Region does not receive traffic from the applications.
  • It only functions as a secondary location for disaster recovery. If the primary Region is not available, the second Region receives traffic.

Planning High-Performance Storage Options

  • Cassandra's disk access is sequential for write-heavy workloads, but read-heavy workloads require random access.
  • If your working set (data + index) does not fit into memory, more I/O requests have to go to disk
  • It is very important to select the correct storage option
  • We do not recommend magnetic volume types (HDD) because of their low performance
  • AWS provides two main options:
    • Amazon EC2 instance store
    • Amazon EBS
Amazon EC2 Instance Store

  • Storage located on disks that are physically attached to the host computer, called the “instance store”
  • If you are using more than a single volume, you can stripe the instance store volumes (RAID 0)
    • Enhanced I/O throughput
  • But if the instance is stopped, fails, or is terminated,
    • You will lose all your data
  • Therefore, we need to replicate data across multiple nodes in different Availability Zones, or even across Regions, depending on the requirements
Amazon EBS - Amazon Elastic Block Store

  • It provides persistent block storage 
  • Each Amazon EBS volume is automatically replicated within its Availability Zone to protect against component failure (high availability, durability)
  • By using Amazon CloudWatch with AWS Lambda you can automate volume changes 
  • General Purpose SSD  (gp2)
    • gp2 is designed to offer single-digit millisecond latencies
    • Delivers a consistent baseline performance of 3 IOPS/GB (minimum 100 IOPS) up to a maximum of 10,000 IOPS
    • Provides up to 160 MB/s of throughput per volume
    • Can burst up to 3,000 IOPS if the volume is smaller than 1 TB
  • Provisioned IOPS SSD (io1)
    • The highest-performance EBS storage option, designed for critical, I/O-intensive database workloads
    • Up to 50 IOPS/GB, to a maximum of 32,000 IOPS
    • Up to 500 MB/s of throughput per volume
    • Single-digit millisecond latencies; it is designed to deliver the provisioned performance 99.9% of the time
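The gp2 sizing rule above is easy to turn into a quick calculator. This is a minimal sketch that uses only the figures listed in this post (3 IOPS/GB, minimum 100, maximum 10,000, burst below 1 TB); the function names are made up, and AWS revises these limits over time, so check the current EBS documentation before relying on them.

```python
def gp2_baseline_iops(size_gb):
    """Baseline IOPS of a gp2 volume, using the limits quoted in this post."""
    return min(max(100, 3 * size_gb), 10_000)


def gp2_can_burst(size_gb):
    """Volumes smaller than 1 TB can burst up to 3,000 IOPS."""
    return size_gb < 1024


for size_gb in (20, 500, 4096):
    print(size_gb, gp2_baseline_iops(size_gb), gp2_can_burst(size_gb))
```

For the 4 TB data volumes discussed later in this post, the baseline is already capped at the gp2 maximum, so bursting no longer matters.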
Why EBS Optimized Instance?

  • Usually, network traffic and EBS traffic share the same bandwidth on an Amazon EC2 instance
  • This means consistent EBS performance depends on the amount of non-EBS network traffic, so throughput between the instance and its EBS volumes can't be guaranteed
  • The solution: EBS-optimized instances
  • They have additional, dedicated capacity for I/O between EC2 and EBS
  • This optimization minimizes the contention between EBS I/O and other traffic from EC2
  • The dedicated bandwidth to Amazon EBS is between 425 Mbps and 14,000 Mbps, depending on the instance type
Instance Types that support EBS Optimization
  • The current-generation instance types are EBS-optimized by default
  • C5, C4, M5, M4, P3, P2, G3, and D2 instances
  • For these there is no need to enable EBS optimization, and disabling it has no effect
  • We can enable EBS optimization on instance types that are not EBS-optimized by default
  • It can be enabled when launching the instance, or enabled while the instance is running
  • There is an additional charge if the instance type isn't EBS-optimized by default
Available Instance Types for Cassandra

  • Compute Optimized – C4, C5
    • High level of network performance
    • EBS-optimized by default for increased storage performance at no additional cost
  • Storage Optimized – I3
    • SSD-backed instance storage optimized for low latency
    • Very high random I/O performance
    • High sequential read throughput and high IOPS at a low cost
  • Memory Optimized – x1e, R4
    • Optimized for memory-intensive applications
    • SSD storage and EBS-optimized by default at no additional cost
    • The lowest price per GiB of RAM
Planning Instance Types Based on Storage Needs

  • Let's assume we need 600 TB of storage, with 10% overhead for disk formatting
  • 95% writes and 5% reads (write heavy)
  • The most common instance type for Cassandra on AWS is “i3”. Why?
    • Designed for I/O-intensive workloads
    • Comes with SSD storage (instance store)
    • Instances are available as On-Demand, Reserved, and Spot in 15 Regions
  • Let’s pick the i3.2xlarge instance
    • (600 x 1024)/(1900 x 0.9) ≈ 359.3, rounded up to 360 instances
    • 360 x $0.624 x 720 = $161,740.8 per month
      • Not including data transfer charges
      • Assuming the commit log is stored on the same drive
  • This is very costly
    • We could use instances with more local storage, but that might not help with the cost
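The sizing above can be reproduced with a short script, which makes it easy to try other instance sizes. This is a sketch under the same assumptions as the bullets: 1,900 GB of local SSD per i3.2xlarge, 10% formatting overhead, $0.624 per hour, and a 720-hour month. The function names are made up for the example.

```python
import math


def instances_needed(data_tb, usable_gb_per_instance):
    """Round up: a fraction of an instance still means one more node."""
    return math.ceil(data_tb * 1024 / usable_gb_per_instance)


def monthly_cost(instance_count, hourly_rate_usd, hours_per_month=720):
    return instance_count * hourly_rate_usd * hours_per_month


usable_gb = 1900 * 0.9  # local SSD minus 10% formatting overhead
nodes = instances_needed(600, usable_gb)

print(nodes)                                 # 360
print(round(monthly_cost(nodes, 0.624), 2))  # 161740.8
```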
Decoupling with EBS ( Instance and storage separately)

  • Cassandra recommends separate drives for data and the commit log (better performance)
  • On each node we will allocate:
    • 500 GB for the commit log
    • 4 TB for data
  • Compute Optimized instance types are popular for running Cassandra on EBS
    • The C4 instance type does not have any local storage
    • c4.4xlarge is a good fit for production workloads
    • 30 GB memory, 16 vCPU, 2,000 Mbps dedicated EBS throughput
  • Let's see the cost estimate based on the facts we discussed

Number of instances = 600 TB / 4 TB = 150 instances

Storage requirements = data storage + commit log storage
    = 600 TB + 75 TB
    = 675 TB

Calculating EC2 cost per month - US East (N. Virginia)
      150 c4.4xlarge = 150 x $0.796 x 720 = $85,968

Calculating EBS volume costs per month
  675 TB EBS gp2 = 675 x 1024 x $0.1 = $69,120

Total cost = $85,968 + $69,120
    = $155,088
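The decoupled estimate can be scripted in the same way. The sketch below is an illustration only: the function name is made up, the prices are the ones quoted above, and the 500 GB commit log volume is treated as 0.5 TB per node so that the result matches the 75 TB used in this post.

```python
import math

HOURS_PER_MONTH = 720


def ebs_cluster_monthly_cost(data_tb, data_tb_per_node, commit_log_tb_per_node,
                             instance_hourly_usd, ebs_usd_per_gb_month):
    """Return (nodes, storage_tb, ec2_usd, ebs_usd) for an EBS-backed cluster."""
    nodes = math.ceil(data_tb / data_tb_per_node)
    storage_tb = data_tb + nodes * commit_log_tb_per_node
    ec2_usd = nodes * instance_hourly_usd * HOURS_PER_MONTH
    ebs_usd = storage_tb * 1024 * ebs_usd_per_gb_month
    return nodes, storage_tb, ec2_usd, ebs_usd


nodes, storage_tb, ec2_usd, ebs_usd = ebs_cluster_monthly_cost(
    data_tb=600, data_tb_per_node=4, commit_log_tb_per_node=0.5,
    instance_hourly_usd=0.796, ebs_usd_per_gb_month=0.10)

print(nodes, storage_tb)               # 150 675.0
print(round(ec2_usd), round(ebs_usd))  # 85968 69120
print(round(ec2_usd + ebs_usd))        # 155088
```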

We can consider Reserved Instances for further optimization
  • Let’s say that, after a couple of months, we are happy with the cluster performance of the c4.4xlarge plus EBS configuration
  • We can make reservations and optimize cost
  • For a 3-year plan with partial upfront payment
    • Instance cost per month = 150 x $0.33 x 720 => $35,640
    • EBS cost per month = 675 x 1024 x $0.1 => $69,120
    • Total cost per month = $35,640 + $69,120 => $104,760
  • For i3.2xlarge instances on a 3-year plan with partial upfront payment
    • Cost per month = 360 x $0.28 x 720 => $72,576
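Putting the on-demand and reserved figures side by side makes the trade-off easier to see. This sketch only re-uses the hourly rates and node counts quoted in this post; the variable and function names are made up.

```python
HOURS_PER_MONTH = 720
EBS_MONTHLY_USD = 675 * 1024 * 0.10  # 675 TB of gp2; not affected by the reservation


def monthly_instance_cost(count, hourly_usd):
    return count * hourly_usd * HOURS_PER_MONTH


options = {
    "c4.4xlarge + EBS, on-demand": monthly_instance_cost(150, 0.796) + EBS_MONTHLY_USD,
    "c4.4xlarge + EBS, reserved": monthly_instance_cost(150, 0.33) + EBS_MONTHLY_USD,
    "i3.2xlarge, on-demand": monthly_instance_cost(360, 0.624),
    "i3.2xlarge, reserved": monthly_instance_cost(360, 0.28),
}

# Print the options from cheapest to most expensive.
for label, cost in sorted(options.items(), key=lambda item: item[1]):
    print(f"{label}: ${cost:,.0f} per month")
```

With these numbers, the reservation does not touch the EBS bill, so the reserved i3.2xlarge cluster ends up cheaper than the reserved c4.4xlarge plus EBS setup even though it was the more expensive option on demand.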
Planning Elastic Network Interfaces (ENI)


  • A virtual network interface that you can attach to, or detach from, an instance in a VPC within a single Availability Zone.
  • It can be attached to one instance, detached from that instance, and attached to another instance in the same Availability Zone
  • When you move an ENI from one instance to another, network traffic is redirected to the new instance
  • This really helps with seed node configuration (hard-coded in the config)
  • If a seed node fails, you can automate things so that the new seed node takes over the ENI IP address programmatically

Monitoring by using Amazon CloudWatch

  • Amazon CloudWatch can be used as a resource monitoring service 
  • It collects and tracks metrics,
    • Collect and monitor log files
    • Set alarms
  • We can write a custom metric and submit it to Amazon CloudWatch
  • We can configure alarms to notify us when metrics exceed certain defined thresholds
Maintenance
  • In terms of Cassandra cluster health,
    • Scaling 
      • Cassandra is horizontally scaled by adding more instances to the ring.
    • Upgrades
      • A rolling upgrade pattern is used for Cassandra upgrades, operating system patching, and instance type changes
    • Backup & Restore 
      • Cassandra supports snapshots and incremental backups
      • If the cluster uses instance store, a file-based backup tool works best
      • These backup files are copied to new instances to restore
      • We recommend using S3 to durably store backup files for long-term storage.
Security

  • The first step is to ensure that the data is encrypted at rest and in transit.
  • The second step is to restrict access to authorized users only.
  • Encryption at rest
    • Encryption at rest can be achieved by using EBS volumes with encryption enabled.
  • Encryption in transit
    • Cassandra uses Transport Layer Security (TLS) for client and internode communications.
References 


https://d1.awsstatic.com/whitepapers/Cassandra_on_AWS.pdf
https://aws.amazon.com/ec2/instance-types/
https://aws.amazon.com/ebs/details/
https://aws.amazon.com/ec2/pricing/reserved-instances/pricing/
https://aws.amazon.com/ec2/pricing/on-demand/
https://aws.amazon.com/ec2/instance-types/x1e/
https://aws.amazon.com/blogs/aws/now-available-new-c4-instances/
https://aws.amazon.com/blogs/aws/now-available-i3-instances-for-demanding-io-intensive-applications/