February 2, 2012

Bigtable and The Skeleton of 3D for Android

      Bigtable is a distributed database system designed to scale to a very large size (petabytes of data) across thousands of servers. It was built by Google and is used by more than sixty of its applications, such as Google Maps, Google Earth and Gmail.

      It is closed source, although Google offers access to it as part of Google App Engine. Since its deployment (late 2003), Bigtable has achieved several goals: wide applicability, scalability, high performance and high availability.


      Each table in this system is a sparse, distributed, multi-dimensional map in which data is organized along three dimensions: rows, columns and timestamps.


(row:string, column:string, time:int64) → string
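
      To make the mapping above more concrete, here is a minimal Python sketch of such a map kept entirely in memory. It is only an illustration of the data model, not how Bigtable is implemented; the class name SparseTable and its put/get methods are my own assumptions.

```python
from typing import Dict, Optional, Tuple


class SparseTable:
    """Toy in-memory model of a (row, column, timestamp) -> value map.

    Hypothetical helper for illustration only; not a Bigtable API.
    """

    def __init__(self) -> None:
        # Only cells that were actually written take up space (sparse).
        self._cells: Dict[Tuple[str, str], Dict[int, str]] = {}

    def put(self, row: str, column: str, timestamp: int, value: str) -> None:
        """Store one version of a cell."""
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row: str, column: str,
            timestamp: Optional[int] = None) -> Optional[str]:
        """Return the newest version, or the newest one at or before `timestamp`."""
        versions = self._cells.get((row, column))
        if not versions:
            return None
        if timestamp is None:
            return versions[max(versions)]
        eligible = [t for t in versions if t <= timestamp]
        return versions[max(eligible)] if eligible else None


table = SparseTable()
table.put("com.example.www", "contents:", 1, "<html>old</html>")
table.put("com.example.www", "contents:", 5, "<html>new</html>")
latest = table.get("com.example.www", "contents:")
as_of_3 = table.get("com.example.www", "contents:", timestamp=3)
```

      Keeping several timestamped versions per cell is what turns the two-dimensional row/column grid into a three-dimensional map.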


      To manage such a huge amount of data efficiently, the tables are split at row boundaries and stored as tablets. Each tablet holds a contiguous range of rows and is roughly 100-200 MB in size, and the tablets are distributed across many machines.
Each machine serves about 100 tablets (stored in GFS). This setup allows good load balancing and fast recovery: if a machine goes down, other machines each take over one of its tablets, so the extra load on each is fairly small.
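
      The two ideas above (splitting at row boundaries and spreading a failed machine's tablets over the others) can be sketched in a few lines of Python. This is a simplified model under my own assumptions: the function names, the size limit and the round-robin policy are hypothetical and are not taken from Bigtable itself.

```python
from typing import Dict, List, Tuple


def split_into_tablets(rows: List[Tuple[str, int]],
                       max_bytes: int) -> List[List[str]]:
    """Group (row_key, size_in_bytes) pairs into tablets of contiguous row keys.

    A new tablet starts whenever adding the next row would exceed max_bytes.
    """
    tablets: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for key, size in sorted(rows):  # rows are kept in key order
        if current and current_size + size > max_bytes:
            tablets.append(current)
            current, current_size = [], 0
        current.append(key)
        current_size += size
    if current:
        tablets.append(current)
    return tablets


def reassign_after_failure(assignment: Dict[str, List[int]],
                           failed: str) -> Dict[str, List[int]]:
    """Spread the failed machine's tablets round-robin over the survivors."""
    survivors = sorted(m for m in assignment if m != failed)
    if not survivors:
        raise ValueError("no surviving machines to take over the tablets")
    result = {m: list(assignment[m]) for m in survivors}
    for i, tablet_id in enumerate(assignment[failed]):
        result[survivors[i % len(survivors)]].append(tablet_id)
    return result


tablets = split_into_tablets([("a", 60), ("b", 60), ("c", 60), ("d", 30)], 120)
after = reassign_after_failure({"m1": [0, 1, 2], "m2": [3], "m3": [4]}, "m1")
```

      With many machines and about 100 tablets per machine, the round-robin step gives each survivor at most a tablet or two, which is why recovery is fast.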

       When sizes threaten to grow beyond a specified limit, the tablets are subject to three different types of compaction:
  1. Minor compaction writes the in-memory data out as a new SSTable. It has two goals: to reduce memory usage and to reduce the amount of data that has to be read during recovery if the server dies.
  2. Merging compaction, executed periodically in the background, reads the contents of a few SSTables and writes out a single new SSTable.
  3. Major compaction rewrites all SSTables into exactly one.
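
      The three compactions listed above can be sketched with plain Python dictionaries standing in for the in-memory buffer (called the memtable in the Bigtable paper) and for the SSTables. This is a toy model under my own assumptions: the Tablet class, its method names and the entry-count limit are hypothetical, and deletions are left out entirely.

```python
from typing import Dict, List, Optional


class Tablet:
    """Toy tablet: one in-memory buffer plus a list of immutable SSTables."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: Dict[str, str] = {}
        self.sstables: List[Dict[str, str]] = []  # oldest first
        self.memtable_limit = memtable_limit  # assumed limit, in entries

    def write(self, key: str, value: str) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def read(self, key: str) -> Optional[str]:
        """Check the memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

    @staticmethod
    def _merge(tables: List[Dict[str, str]]) -> Dict[str, str]:
        merged: Dict[str, str] = {}
        for table in tables:  # oldest first, so newer values overwrite older
            merged.update(table)
        return dict(sorted(merged.items()))

    def minor_compaction(self) -> None:
        """Freeze the memtable into a new sorted SSTable and free the memory."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def merging_compaction(self, count: int = 2) -> None:
        """Merge the `count` oldest SSTables into a single new one."""
        head, tail = self.sstables[:count], self.sstables[count:]
        if len(head) > 1:
            self.sstables = [self._merge(head)] + tail

    def major_compaction(self) -> None:
        """Rewrite all SSTables into exactly one."""
        if self.sstables:
            self.sstables = [self._merge(self.sstables)]
```

      In this model every write lands in the memtable, minor compactions keep the memtable small, and the merging and major compactions keep the number of SSTables that a read has to check under control.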

    You can find more details about the implementation, the data model and the Google infrastructure on which Bigtable depends in this lecture from the University of Washington or in this paper.





       The small REST API that I was about to develop proved to be quite easy, since I already had some experience with Google App Engine, Jersey and Java. I created a small application on GAE, and through REST calls over HTTP I can create (POST/PUT), read (GET), update (POST) and delete (DELETE) data in my table (Bigtable) in the cloud. I also created a simple Android application that can perform the same operations.
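
       The mapping between HTTP methods and the four operations can be sketched as follows. This is not the Jersey/Java code of the application; it is a small Python illustration with an in-memory dictionary in place of the datastore, and the ItemStore class, the item ids and the chosen status codes are all my own assumptions.

```python
from typing import Dict, Optional, Tuple


class ItemStore:
    """Toy dispatcher mapping HTTP methods to create/read/update/delete."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}  # stands in for the cloud table

    def handle(self, method: str, item_id: str,
               body: Optional[str] = None) -> Tuple[int, Optional[str]]:
        """Return an (HTTP status code, response body) pair."""
        exists = item_id in self._items
        if method == "GET":  # read
            return (200, self._items[item_id]) if exists else (404, None)
        if method in ("POST", "PUT"):  # create, or update an existing item
            if body is None:
                return (400, None)
            self._items[item_id] = body
            return (200 if exists else 201, body)
        if method == "DELETE":  # delete
            if not exists:
                return (404, None)
            del self._items[item_id]
            return (204, None)
        return (405, None)  # method not allowed


store = ItemStore()
created = store.handle("PUT", "42", "hello")
updated = store.handle("POST", "42", "world")
fetched = store.handle("GET", "42")
deleted = store.handle("DELETE", "42")
```

       An Android client only needs to issue the same four kinds of HTTP requests and interpret the status codes.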

      Now I will focus on describing the requirements of my future software implementation, but first I have to do some research to find the solutions that fit best.

1 comment:

  1. Ion, very good. One comment: try to ask more focused questions. The task in the last paragraph is correct but it will take us quite some time to answer it. I'd like to see questions, which can be answered in one week. Respectively I'd like to see the answers in the next blog. Slicing the problem to smaller tasks keeps us to stay focused and we will gain the feeling of making progress. Thanks for good job, Jan
