Archive for the ‘Big Data’ Category

Sparse Vectors in Apache Spark

What is a Sparse Vector

A vector is a one-dimensional array of elements, so in a programming language it is naturally implemented as a one-dimensional array. A vector is said to be sparse when many of its elements are zero. Storing every one of those zeros explicitly in the array wastes space.

So a better way to represent a sparse vector is to record only the locations of the non-zero elements together with their values.
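To make the idea concrete, here is a minimal sketch in plain Python. It does not use Spark's own vector classes; the `SparseVector` class and the `from_dense` helper below are hypothetical names introduced only for illustration. The sketch also stores the total length of the vector, which is an assumption on my part, so that the dense form can be rebuilt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SparseVector:
    """Illustrative sparse vector: total size plus (index, value) pairs."""
    size: int             # length of the equivalent dense vector
    indices: List[int]    # positions of the non-zero elements, ascending
    values: List[float]   # the non-zero values, aligned with `indices`

    def get(self, i: int) -> float:
        """Return the element at position i; anything not stored is zero."""
        if not 0 <= i < self.size:
            raise IndexError(i)
        for idx, value in zip(self.indices, self.values):
            if idx == i:
                return value
        return 0.0

    def to_dense(self) -> List[float]:
        """Expand back into an ordinary list with explicit zeros."""
        dense = [0.0] * self.size
        for idx, value in zip(self.indices, self.values):
            dense[idx] = value
        return dense


def from_dense(dense: List[float]) -> SparseVector:
    """Keep only the non-zero entries of a dense list."""
    indices = [i for i, v in enumerate(dense) if v != 0]
    values = [dense[i] for i in indices]
    return SparseVector(len(dense), indices, values)


dense = [0.0, 3.0, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0]
sv = from_dense(dense)
print(sv.indices, sv.values)  # only two of the eight elements are stored
```

Here eight dense slots shrink to two indices and two values. The saving grows with the proportion of zeros in the vector.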

Quick Zookeeper Tutorial from a Developer’s Perspective

All of us love ZooKeeper, and whenever we have to choose a coordination service, it has become the default choice for almost all of us.

ZooKeeper is a lot like a file system (but is not one), with a hierarchical structure. Every node in the ZooKeeper tree is called a znode (pronounced zee-node). A znode contains data and can also have children.
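The tree-of-znodes idea can be sketched with a small in-memory model. This is only a toy in plain Python, not a ZooKeeper client, and it talks to no server; the `ZNode` and `ZTree` classes and their method names are hypothetical and chosen just for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ZNode:
    """Toy znode: holds some data and may also have named children."""
    data: bytes = b""
    children: Dict[str, "ZNode"] = field(default_factory=dict)


class ZTree:
    """Toy in-memory tree addressed by slash-separated paths."""

    def __init__(self) -> None:
        self.root = ZNode()

    def _find(self, path: str) -> ZNode:
        # Walk from the root one path component at a time.
        node = self.root
        for part in path.strip("/").split("/"):
            if part:
                node = node.children[part]
        return node

    def create(self, path: str, data: bytes = b"") -> None:
        # The parent must already exist, as with directories in a file system.
        parent_path, _, name = path.rstrip("/").rpartition("/")
        parent = self._find(parent_path)
        if name in parent.children:
            raise KeyError(f"node already exists: {path}")
        parent.children[name] = ZNode(data)

    def get(self, path: str) -> bytes:
        return self._find(path).data

    def get_children(self, path: str) -> List[str]:
        return sorted(self._find(path).children)


tree = ZTree()
tree.create("/app", b"service root")
tree.create("/app/config", b"timeout=30")
tree.create("/app/workers")

print(tree.get("/app"))           # the node carries its own data
print(tree.get_children("/app"))  # and it has children as well
```

Unlike a directory in an ordinary file system, `/app` here both holds data and has children, which is the property the paragraph above describes.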

I’m listing some of the key characteristics of ZooKeeper that all of us should know before using it.
