Archive for the ‘Big Data’ Category

Sparse Vectors in Apache Spark

What is a Sparse Vector

A vector is a one-dimensional array of elements, so in a programming language it is naturally implemented as a one-dimensional array. A vector is said to be sparse when many of its elements are zero. From a storage perspective, it is wasteful to keep all of these zero values in the array.

A more compact way to represent a sparse vector is to specify just the location and value of each non-zero element.
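To make the idea concrete, here is a minimal sketch in plain Python (not Spark code) of the location-and-value representation. The helper names `to_sparse` and `to_dense` are made up for illustration, and storing the overall length next to the indices and values is an assumption of this sketch. It happens to mirror the argument order of Spark's `Vectors.sparse(size, indices, values)`.

```python
from typing import List, Tuple


def to_sparse(dense: List[float]) -> Tuple[int, List[int], List[float]]:
    """Return (size, indices, values), keeping only the non-zero entries."""
    indices = [i for i, v in enumerate(dense) if v != 0]
    values = [dense[i] for i in indices]
    return len(dense), indices, values


def to_dense(size: int, indices: List[int], values: List[float]) -> List[float]:
    """Rebuild the full array from the sparse triple."""
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


dense = [0.0, 3.0, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0]
size, indices, values = to_sparse(dense)
print(size, indices, values)  # 8 [1, 5] [3.0, 1.5]
```

Only two indices and two values are stored instead of eight floats, and the original array can still be rebuilt with `to_dense(size, indices, values)`.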
(more…)

Quick Zookeeper Tutorial from a Developer’s Perspective

Most of us love ZooKeeper, and whenever we have to choose a coordination service it has become the default choice for almost all of us.

ZooKeeper is a lot like a file system (though it is not one), with a hierarchical structure. Every node in the ZooKeeper tree is called a znode (pronounced zee-node). A znode contains data and can also have children.
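The tree-of-znodes idea can be sketched with a small in-memory model. This is a toy illustration only, not the ZooKeeper client API: the `ZNode` and `ZNodeTree` classes and their method names are hypothetical, and the sketch assumes, as ZooKeeper itself requires, that a parent must exist before a child is created.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ZNode:
    # Unlike a plain directory, a znode carries data AND may have children.
    data: bytes = b""
    children: Dict[str, "ZNode"] = field(default_factory=dict)


class ZNodeTree:
    """Toy in-memory model of a hierarchical znode namespace."""

    def __init__(self) -> None:
        self.root = ZNode()

    def _walk(self, path: str) -> ZNode:
        node = self.root
        for part in (p for p in path.split("/") if p):
            node = node.children[part]  # KeyError if the path does not exist
        return node

    def create(self, path: str, data: bytes = b"") -> None:
        parent_path, _, name = path.rpartition("/")
        parent = self._walk(parent_path)  # the parent must already exist
        if name in parent.children:
            raise KeyError(f"node already exists: {path}")
        parent.children[name] = ZNode(data)

    def get(self, path: str) -> bytes:
        return self._walk(path).data

    def get_children(self, path: str) -> List[str]:
        return sorted(self._walk(path).children)


tree = ZNodeTree()
tree.create("/app", b"app-level data")
tree.create("/app/config", b"v1")
tree.create("/app/workers")
print(tree.get("/app"))           # b'app-level data'
print(tree.get_children("/app"))  # ['config', 'workers']
```

Note that `/app` both holds its own data and has children, which is the main way a znode differs from a file or a directory.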

Below I list some of the key characteristics of ZooKeeper that all of us should know before using it.
(more…)



