Distributed, Decentralized Database For Mobile Devices

May 22, 2012

State of the art

It's been a while since my last post. For the past few weeks I have been busy with several school-related projects, exams and so on, but I kept working on my final project at the same time.

So, where am I now?

At the suggestion of my mentors I switched the Desktop Web Application from Java Servlets and JSP to Google Web Toolkit (GWT).

Why?

Because the two approaches divide the work between browser and server differently. A web application built with servlets and JSP behaves as a thin client: almost all of the processing, including rendering every page, is done at the server end, and the browser only displays the result.

GWT is quite different. The Java code of the user interface is compiled to JavaScript that runs in the browser, so a good share of the processing moves to the client (closer to a fat client model), and the server is left mainly to serve data.

There are many other pros and cons to each approach. Thin client systems, for instance, enable low-powered computers to be used at the front end and can support on-demand and other Internet-based applications with relatively little administrative or technical support.

Moving on, a few days ago I added a new feature to my API: delta updates. This means that I no longer exchange the whole database between the server and the mobile device, just the data that has been modified since the device's last synchronization.
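A minimal sketch of the idea behind a delta update, not the actual API code. It assumes every record carries a last-modified timestamp set by the server; the names TaskRecord, DeltaSync and changedSince are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical record type; field names are assumptions, not the project's schema.
class TaskRecord {
    final String id;
    final String title;
    final long lastModifiedMillis; // stamped by the server on every write

    TaskRecord(String id, String title, long lastModifiedMillis) {
        this.id = id;
        this.title = title;
        this.lastModifiedMillis = lastModifiedMillis;
    }
}

class DeltaSync {
    // Returns only the records modified strictly after the client's last sync.
    static List<TaskRecord> changedSince(List<TaskRecord> all, long lastSyncMillis) {
        List<TaskRecord> delta = new ArrayList<>();
        for (TaskRecord task : all) {
            if (task.lastModifiedMillis > lastSyncMillis) {
                delta.add(task);
            }
        }
        return delta;
    }
}
```

In such a scheme the client would remember the time of its last successful sync and send it with each request. Deleted records need separate handling (for example a deletion marker), which this sketch leaves out.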

The exchange of C2DM messages is also done: when a user of the desktop web app updates the data, a poke is sent to the Android device to tell it that it has to synchronize its database.
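To give an idea of what such a poke could look like on the wire, here is a rough sketch that only builds the form-encoded body of a C2DM send request. The parameter names registration_id, collapse_key and the data. prefix are written from memory of the C2DM documentation, so treat them as assumptions; the class C2dmPoke, the payload key "reason" and the example values are hypothetical and are not taken from my server code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class C2dmPoke {
    // Builds the body of the POST request that asks C2DM to poke one device.
    // The request itself would also need the server's auth token in a header.
    static String buildBody(String registrationId, String collapseKey, String reason) {
        return "registration_id=" + encode(registrationId)
             + "&collapse_key=" + encode(collapseKey)
             + "&data.reason=" + encode(reason); // "reason" is a made-up payload key
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

The message carries no data from the database, only a hint that something changed. The device reacts by running its normal synchronization.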

With these features I have removed two more of the restrictions I started with:

Database from the server side (cloud) will be accessible only from Android devices.
Changes on the client side will result in an update of the entire database on the cloud.

Now I have to continue testing the API and the Desktop Web Application, fix bugs where necessary and start writing the theoretical part.

April 10, 2012

Progress

The generalization of the database schema is done. An authenticated user can now create their own schema through REST calls, based on their preferences, and interact with it.

A basic implementation of the C2DM protocol is also available. As a reminder, C2DM is a protocol that pushes notifications to the client when something changes on the server side.

For those of you who are interested in how to implement this on the server and the client, you can check this tutorial; it is very useful and easy to follow. It provides a simple example of getting the auth token from the C2DM servers, registering an Android client with C2DM and exchanging a message between client and server.

Next, I have to change the desktop web client as well and rebuild it on top of the functions that I've created.

A few days ago the first demo on the Android platform was released. It is available here. I will appreciate any kind of feedback. I know that there are some bugs to fix, but the main functionalities are in place.

Let's point out again the main functionalities:

The user can create a Gmail account if there isn't one already associated with the phone.
Authenticate with it on the server side.
View (GET) the current Tasks on the server (if there are any).
Sync (GET) the client database with the server database.
Add (POST) a Task to the local database (Content Provider) and then to the server.
Update (PUT) a chosen Task from the current list of available Tasks.
Delete (DELETE) a Task.

The following pictures should clarify what I was describing:

Authorization of an account

Main Activity - RUD functions.

Create function

April 3, 2012

Low-Level Datastore API

I had to postpone the ACLs for a while and focus on a more important requirement. Until now my system has offered data through the REST architecture for a single database schema only, so it's time to make it more general and support any database structure, based on the user's preferences.

To achieve this I will use the Java low-level Datastore API to work with the datastore, because it doesn't require a predefined schema and I can expose the service capabilities directly.

The datastore stores data in objects known as entities, and each entity has a key that identifies it. Entities can belong to the same entity group, which makes it possible to perform a single transaction on multiple entities.

Let’s create an Entity representing the login information of a user.

import java.util.Date;

import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;

String user    = "test@example.com";
String message = "Hello world!";
Date   date    = new Date();

// Kind "Login"; the key name is the user's email address.
Entity LoginInfo = new Entity("Login", user);
       LoginInfo.setProperty("date", date);
       LoginInfo.setProperty("content", message);

Above we created an Entity with its basic constructor, passing two strings: the kind (roughly, the name of the schema) and the key name (the unique identifier within that kind). Kinds are not declared anywhere, so any string can be used as a kind. We can therefore have as many kinds as we need, without having to create a class for each one, as long as we don’t lose track of them.

The key name is what we’ll use to retrieve the user later on when we need it again. Think of it as the key of a Map or dictionary.

Once we have an Entity object, we need to define the properties. In this example I defined the current authentication date and a welcome message as properties. Note that, again, we can define as many properties as we want.

After we construct the entity, we instantiate the datastore service, and put the entity in the datastore:

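import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;

DatastoreService datastore =
             DatastoreServiceFactory.getDatastoreService();
datastore.put(LoginInfo);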

The low-level Java API provides a Query class for constructing queries that fetch the matching entities from the datastore. Here is a simple example based on our code:

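// Requires java.util.Iterator and com.google.appengine.api.datastore.Query.
Query query = new Query("Login");
Iterator<Entity> iterator = datastore.prepare(query)
                                     .asIterator();
while (iterator.hasNext()) {
    Entity person = iterator.next();
    // Read the stored properties here, e.g. person.getProperty("date").
}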

This code creates a new query on the Login kind and gets an iterator over the matching Entity objects.
In the following days I will rewrite the entire API based on this approach to meet the requirement.

March 26, 2012

ACLs

Over the past few weeks I have successfully dealt with the following:

Added a security constraint: a user is not allowed to access the API without being authenticated.
Full CRUD functions for the desktop web application: added the Update (PUT) feature (a user can edit their entries).
Multitenancy: creating a namespace for each user.
Changed the schema of the database and changed the API accordingly.
Bug fix: get the date as a long and convert it to Date (a public Gson issue).
The API now serves each requester the data from their own namespace.
Rearranged the entire project.

Now let's pick one point from the list and discuss its pros and cons:

Multitenancy: Creating namespaces for each user.

As I said in my previous post, a big advantage of multitenancy is that it simplifies administration, and data becomes easier to manipulate because all namespaces share the same database schema.

But creating a namespace for each user also isolates users from one another, so we were thinking of splitting the system into layers. For example, two people could share a single list, so one person could add items to the list and the other person would see them appear on their phone.

To achieve this we have to implement access control lists and offer the client the possibility to create a shared namespace (to which they will add people as a group). For that I have to create groups and give each member of a group some permissions (read, write, execute).

In the following lines I describe the requirements that I want to meet:

A user can create a namespace of their choice.
They can share it with others.
The accepted users can read, write and execute on the namespace.
The owner of the namespace can delete the namespace.

Nice to have requirements:

The owner can see the members of their namespaces.
The owner of the namespace can set permissions for each member.
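If I end up writing this myself instead of using a framework, the requirements above could be modelled roughly as follows. This is only an in-memory sketch to make the rules concrete; the names Permission and SharedNamespace and all method names are hypothetical, and nothing here touches the datastore or the App Engine namespace API.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

enum Permission { READ, WRITE, EXECUTE }

class SharedNamespace {
    private final String owner;
    private final Map<String, Set<Permission>> members = new HashMap<>();

    SharedNamespace(String owner) {
        this.owner = owner;
        // The owner implicitly holds every permission.
        members.put(owner, EnumSet.allOf(Permission.class));
    }

    // Only the owner may add a member or change a member's permissions.
    void share(String actingUser, String member, Set<Permission> permissions) {
        if (!owner.equals(actingUser)) {
            throw new SecurityException("only the owner can share the namespace");
        }
        Set<Permission> copy = EnumSet.noneOf(Permission.class);
        copy.addAll(permissions);
        members.put(member, copy);
    }

    // True if the user is a member and was granted the permission.
    boolean isAllowed(String user, Permission permission) {
        Set<Permission> granted = members.get(user);
        return granted != null && granted.contains(permission);
    }

    // Deleting the namespace is reserved for its owner.
    boolean canDelete(String user) {
        return owner.equals(user);
    }

    // Lets the owner see who has access (the "nice to have" requirement).
    Set<String> memberNames(String actingUser) {
        if (!owner.equals(actingUser)) {
            throw new SecurityException("only the owner can list members");
        }
        return members.keySet();
    }
}
```

Every API call would then check isAllowed for the authenticated user before reading from or writing to the shared namespace.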

Now I am wondering what I should use: a specialized framework or my own implementation. I have already searched the internet and found some frameworks, but they look too hard to follow. I will see.

March 7, 2012

/* TODO */

Google authentication for my system is done, BUT during the implementation some problems and questions came up, so let's take a look and try to answer them:

1. Last night, while working on a completely different project, I wondered whether I could read the data from the database with a plain GET request from the browser without being authenticated. I tried it and found that I could :) because I had forgotten to set a security constraint on my system. To fix this I searched around and learned that Java web applications on Google App Engine use a deployment descriptor file to determine how URLs map to servlets, which URLs require authentication, and other information. This file is named web.xml and resides in the app's WAR under the WEB-INF/ directory; it is part of the servlet standard for web applications. In this file I added a <security-constraint> element, which defines a security constraint for URLs that match a pattern. If a user accesses a URL whose path has a security constraint and the user is not signed in, App Engine redirects the user to the Google Accounts sign-in page. Google Accounts redirects the user back to the application URL after successfully signing in or registering a new account. The app does not need to do anything else to ensure that only signed-in users can access the URL.[Source] Below is the code that I had to add to fix this bug; the role name * stands for any signed-in user.

<security-constraint>
      <web-resource-collection>
            <url-pattern>/api/*</url-pattern>
      </web-resource-collection>
      <auth-constraint>
            <role-name>*</role-name>
      </auth-constraint>
</security-constraint>

2. Currently, to tell the clients' data apart, I store two extra fields next to each record: the email address and the user ID. This information (which, by the way, is unique) lets me easily fetch a user's data after they log in. Below you can see a snippet of how I did it.

/* Get the instance of the Database */
PersistenceManager db = PMF.get().getPersistenceManager();
/* Create the JDOQL query */
Query q = db.newQuery("select from " + Note.class.getName()
          + " where userId=='" + user.getUserId()
          + "' && emailAddress=='" + user.getEmail()
          + "' " + " order by date");
/* Execute the query */
List<Note> list = (List<Note>) q.execute();

My mentors, on the other hand, said that this approach works, BUT that a proper solution should use the power of multitenancy, which is supported by the Google App Engine API. Basically, multitenancy is the name given to a software architecture in which one instance of an application, running on a remote server, serves many client organizations (also known as tenants). Using a multitenant architecture simplifies administration and provisioning of tenants. You can provide a more streamlined, customized user experience, and also aggregate different silos of data under a single database schema. As a result, the application becomes more scalable. Data becomes easier to segregate and analyze across tenants because all tenants share the same database schema.[Source] Below I've added an example of setting a namespace for an authenticated user.

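import com.google.appengine.api.NamespaceManager;
import com.google.appengine.api.users.UserServiceFactory;

if (NamespaceManager.get() == null) {
  // Assuming there is a logged in user.
  String namespace = UserServiceFactory.getUserService().
                     getCurrentUser().getUserId();
  NamespaceManager.set(namespace);
}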

I will try to add this feature to my system, so it will be another task on my TODO list.

3. A small modification that I should make to the server-side application is to allow a user to edit their entries (Tasks), so that all CRUD functions are covered.

OK, so let's list the main tasks that I should solve in the near future:

Allow a user to edit their entries.
Add multitenancy.
! Add the C2DM protocol, which will push notifications to the client when something changes on the server side.

P.S. The demo is available at this link. It's just a small application that demonstrates the functionality of the system.

March 2, 2012

App Engine connected to Android Device

While trying to add Google Accounts login to my application and searching the internet for information about Android and Google App Engine, I found an interesting Google Talk event.

In that session two engineers from Google presented a new feature, App Engine Tooling for Android. It's a complete set of Eclipse-based Java development tools for building Android applications that are backed by App Engine.

Just create a new "App Engine connected to Android device" application and Eclipse generates two projects, an Android application and an App Engine application, which together provide a simple example of communication between the GAE server and the client application.

The interesting part is that this kind of application takes care of authentication with a Google account and of the C2DM implementation, so basically you just log in with your Gmail account on both sides (server: GAE, client: Android app) and a communication channel is established between the two. Then you can send messages from the server to the client.

The basic architecture of the "framework" looks very similar to a wide range of applications:

Unfortunately this is not suitable for my work because, as my mentor says, "it's not a good idea to combine RPC with REST Architecture".

Anyway it was nice to "play" with this project and try to understand how it works, even though it took more than 5 hours to build and deploy it.

February 29, 2012

Where we are

For the past few weeks we have been trying to improve the functionality of our system, so let's take a look at what we have done:

We switched the data exchange format used for serialization from XML to JSON (Gson).
We improved the CRUD functions: a user can add more than one entry to the database in a single POST call.
We solved the problem of generating a unique identifier for an entry by using a Universally Unique Identifier (UUID).
To ensure that the database stays consistent we added transactions, so every transaction is guaranteed to be atomic, which means that transactions are never partially applied. Either all of the operations in the transaction are applied, or none of them.
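The last three points can be illustrated together with a small in-memory sketch. It is not the datastore transaction API; it only shows the idea of giving every entry a UUID and applying a multi-entry POST either completely or not at all. The class EntryStore and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

class EntryStore {
    private Map<String, String> entries = new HashMap<>();

    // Adds all entries or none: the work is done on a copy that replaces
    // the current state only if every entry was accepted.
    List<String> addAll(List<String> contents) {
        Map<String, String> working = new HashMap<>(entries);
        List<String> ids = new ArrayList<>();
        for (String content : contents) {
            if (content == null || content.isEmpty()) {
                throw new IllegalArgumentException("invalid entry, nothing was stored");
            }
            String id = UUID.randomUUID().toString(); // unique identifier for the entry
            working.put(id, content);
            ids.add(id);
        }
        entries = working; // the "commit"
        return ids;
    }

    int size() {
        return entries.size();
    }
}
```

If one entry in the batch is rejected, the exception is thrown before the commit line, so the store still holds exactly what it held before the call.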

Now I will describe the next tasks:

Authentication with Google accounts - This kind of authentication is needed in order to use the C2DM protocol (I will describe this feature in a later post) and to distinguish data between clients.
Delta updates - In order to reduce network traffic and save time I will focus on implementing delta updates. A delta update only requires the user to download the data that has changed, not the whole database. Any application ready for updating can be updated almost immediately thanks to this system. If, for example, a local database of 100 megabytes is updated with 2 megabytes of new data, the system will download only the 2 megabytes instead of 102 megabytes.

In order to keep track of changes during the development and testing phases we decided to create a repository. We have created an SVN-based repository on Google Code.
