Cassandra Training

Cassandra Training Overview

Apache Cassandra is an open source, second-generation distributed database originally developed at Facebook and later released as an Apache project. Its write-optimized, shared-nothing architecture delivers excellent performance and scalability. The masterless ring design makes it elegant and easy to set up and maintain. Cassandra is used to provide simple solutions to complex problems such as metrics and logging, and to store and operate on very large amounts of data. It has a small footprint compared with a major primary database, so it is easy to learn. This Cassandra Training gives you extensive knowledge of Cassandra concepts, highly scalable data models, and the Cassandra architecture, enabling you to build applications for big data.

Objectives of the Course

  • Creating Sample Application in Cassandra
  • Configuring, Reading and Writing Data in Cassandra
  • Integrating Cassandra with Hadoop
  • Cassandra Data Model
  • Cassandra Environment
  • Understanding Cassandra Architecture

Who Should Take the Course

  • Professionals looking for a career in Cassandra
  • Project Managers
  • IT Developers
  • Testing professionals
  • Graduates looking to upgrade their skills to Cassandra Databases
  • Analysts and Researchers

Cassandra Course Content

What is Big Data

  • Technology Landscape
  • Big Data Relevance
  • Distributed Systems and Challenges

Why NoSQL Databases

  • Relational DB vs. NoSQL
  • Types of NoSQL Databases
  • NoSQL Landscape
  • CAP Theorem and Eventual Consistency
  • Key Characteristics of NoSQL Database systems
  • ACID vs BASE

Cassandra Fundamentals

  • Distributed and Decentralized
  • Elastic Scalability
  • High Availability and Fault Tolerance
  • Tunable Consistency
  • Row-Oriented
  • Schema-Free
  • High Performance

The Cassandra Data Model

  • The Relational Data Model
  • A Simple Introduction
  • Clusters
  • Keyspaces
  • Hands-on Session

Installation and Setup of Cassandra

  • Single Node Setup
  • Multi-Node Cluster Setup
  • Key Configurations for Cassandra
  • CLI and Hands-On with Cassandra

Cassandra Modeling

  • Cassandra (Column Family NoSQL DB)
  • Key Concepts: Keyspace, Column Family, Column Family Options, Wide Rows and Skinny Rows, Column Sorting, Super Columns, Counter Column Family, Composite Keys and Columns, Time To Live
  • Secondary Indexes in Cassandra
  • Difference between Custom Indexes and Secondary Indexes
  • Difference between Relational Modeling and Cassandra Modeling
  • Key Points to note while modeling a Cassandra Database
  • Patterns and Anti-Patterns in Cassandra Modeling

Cassandra Architecture & Intro to CQL

  • Anatomy of the Read operation in Cassandra
  • Anatomy of the Write operation in Cassandra
  • How Deletes are handled in Cassandra
  • System Keyspace
  • Peer-to-Peer Model
  • Logical Data Model: Keyspace, Column Family/Table, Rows, Columns
  • Traditional Ring design vs. VNodes
  • Partitioners: Murmur3, Random (MD5) and ByteOrdered
  • Gossip and Failure Detection
  • Anti-Entropy and Read Repair
  • Memtables, SSTables and Commit Log
  • Compaction fundamentals to reduce SSTable data files
  • Hinted Handoff
  • Compaction
  • Bloom Filters, Tombstones
  • Managers and Services
  • VNodes
  • Indexes and Caches
  • Coordinator node
  • Seed nodes
  • Write/Read consistency levels: Any, One, Two, Three, Quorum
  • Snitches: Dynamic snitching, Simple Snitch, Rack Inferring Snitch, Property File Snitch, Gossiping Property File Snitch
  • Routing Client requests
  • Nodetool commands: gossipinfo, cfstats, describing
  • YAML file fundamentals
  • Operations management web GUI
  • Stress testing Cassandra
  • CQL command fundamentals
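
To make the ring, partitioner and coordinator topics above more concrete, here is a minimal Python sketch of how a key could be mapped to replica nodes on a token ring. It is a simplified illustration only: the names `TokenRing`, `md5_token` and `replicas` are hypothetical, the token computation is not Cassandra's exact Random partitioner algorithm, and replica placement simply walks clockwise without considering racks, data centers or VNodes.

```python
import bisect
import hashlib


def md5_token(key: str) -> int:
    """Hash a key to an integer token (simplified; assumed, not Cassandra's exact formula)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class TokenRing:
    """Toy token ring: each node owns the range ending at its own token."""

    def __init__(self, node_tokens):
        # node_tokens maps a node name to the single token assigned to it.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    def replicas(self, key, replication_factor=1):
        # The first node whose token is >= the key's token owns the key;
        # wrap around to the first node if the key hashes past the last token.
        start = bisect.bisect_left(self._tokens, md5_token(key)) % len(self._ring)
        count = min(replication_factor, len(self._ring))
        # Additional replicas are the next distinct nodes clockwise on the ring.
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


if __name__ == "__main__":
    ring = TokenRing({"node1": 2**126, "node2": 2**127, "node3": 3 * 2**126})
    print(ring.replicas("user:42", replication_factor=2))
```

Any node can act as coordinator for a request in this picture, because every node can compute the same key-to-replica mapping from the shared ring state.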

Cassandra API

  • Key concepts for Reading and Writing in Cassandra
  • Tunable Consistency
  • Simple Get, Multi-get Slice
  • Range and Slice
  • Slice Predicate
  • Delete
  • Hands-on CLI commands
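
The arithmetic behind tunable consistency can be sketched in a few lines of Python. A quorum is a majority of the replicas, and a read is guaranteed to overlap the most recent successful write when the number of replicas read plus the number written exceeds the replication factor. The function names below are hypothetical and are not part of any Cassandra client API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that form a majority for the given replication factor."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def read_overlaps_write(read_replicas: int, write_replicas: int,
                        replication_factor: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # Quorum reads combined with quorum writes overlap on at least one replica.
    print(q, read_overlaps_write(q, q, rf))
    # Reading and writing at level One with three replicas gives no such guarantee.
    print(read_overlaps_write(1, 1, rf))
```

Lower levels such as One trade this overlap guarantee for lower latency and higher availability, which is the tuning decision the section title refers to.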

Cassandra CQLSH

  • SQL over Cassandra
  • Composite Keys
  • Hands-on examples on CQL 3.0

Cassandra Clients

  • How to establish Client Connections
  • Thrift Client
  • Connection Pooling
  • Auto-discovery and Failover in Hector
  • Client with CQL

Cassandra Monitoring and Administration

  • Tuning Cassandra
  • Backup and Recovery methods
  • Balancing
  • Bootstrapping
  • Nodetool Commands
  • Upgrades
  • Monitoring critical metrics
  • Bulk Loading Data to Cassandra
  • Bulk Export of Data from Cassandra
  • Hands-on Examples for each of them

Cassandra Analytics Cluster

  • Cassandra Hadoop Integration

Cassandra Search Cluster

  • Integration of Solr with Cassandra
  • Search Query on Cassandra