GAE Datastore
What do you use it for? Any time you have structured data that you want to scale, for example:
- Real-time inventory and product details for a retailer.
- User profiles that deliver a customized experience based on the user’s past activities and preferences.
- Transactions based on ACID properties, for example, transferring funds from one bank account to another.
- And much more.
Let's learn...
datastore = GAE's "database"
- not a traditional relational database --> yields greater scalability
- more like an object database
datastore entity = has one or more (name, value) pairs
Concepts of the object-based GAE datastore
- Entity = Object (loosely think of this as a row in a relational database, a "data entry")
- Properties = these store the data of an Entity (loosely think of these as the field values of a data entry in a relational database). Each property has:
  - name
  - value(s) - a property may have more than one value (think of Entity=Dog, Property=color, values=white and brown)
    - each value is one of many data types, such as a string, an integer, a date-time, or a null value.
    - NOTE: the values do not have to be of the same type. This is a departure from the concept of a field in a relational database, whose values are single and of one type.
- Key = each entity has a key that uniquely identifies it across the entire system. A key has these parts:
  - application ID = this makes sure nothing else about the key can collide with the entities of any other application.
    - It also ensures that no other app can access your app's data, and that your app cannot access data for other apps.
    - You won't see the app ID mentioned in the datastore API.
  - kind = an entity's kind categorizes the entity for the purposes of queries, and for ensuring the uniqueness of the rest of the key.
    - Example: a shopping cart application represents each customer order with an entity of the kind "Order".
    - You specify the kind when you create the entity.
    - This is somewhat different from the relational database concept of a table, but that is the closest match.
  - entity ID = this can be an arbitrary string specified by the app, or it can be generated automatically by the datastore. It is CREATED in only one of the following ways:
    - an entity ID given by the app, called a key name, which is a string
    - an entity ID generated by the datastore, called an ID, which is an integer
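To make the parts of a key concrete, here is a minimal plain-Java sketch. The class KeySketch and its methods are hypothetical names made up for illustration; this is not the App Engine Key class, only a model of "application ID + kind + (key name OR numeric ID)".

```java
import java.util.Objects;

// Hypothetical model of a datastore key; this is NOT the App Engine Key class.
class KeySketch {
    final String appId; // application ID (hidden by the real API)
    final String kind;  // for example "Order"
    final String name;  // key name chosen by the app, or null
    final Long id;      // numeric ID generated by the datastore, or null

    private KeySketch(String appId, String kind, String name, Long id) {
        this.appId = appId;
        this.kind = kind;
        this.name = name;
        this.id = id;
    }

    // Entity ID supplied by the app: a string key name.
    static KeySketch withName(String appId, String kind, String name) {
        return new KeySketch(appId, kind, Objects.requireNonNull(name), null);
    }

    // Entity ID generated by the datastore: an integer ID.
    static KeySketch withId(String appId, String kind, long id) {
        return new KeySketch(appId, kind, null, id);
    }

    // Two keys are the same only if every part matches.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof KeySketch)) return false;
        KeySketch k = (KeySketch) o;
        return appId.equals(k.appId) && kind.equals(k.kind)
                && Objects.equals(name, k.name) && Objects.equals(id, k.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(appId, kind, name, id);
    }
}
```

In this model, the same key name under a different application ID is a different key, which is the collision protection described above.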
It's tempting to compare these concepts with similar concepts in relational
databases: kinds are tables; entities are rows; properties are fields
or columns. That's a useful comparison, but watch out for differences.
Unlike a table in a relational database, there is no relationship between
an entity's kind and its properties. Two entities of the same kind can
have different properties set or not set, and can each have a property of
the same name but with values of different types. You can (and often
will) enforce a data schema in your own code, and App Engine includes
libraries to make this easy, but this is not required by the datastore.
Also unlike relational databases, keys are not properties. You can perform
queries on key names just like properties, but you cannot change
a key name after the entity has been created.
And of course, a relational database cannot store multiple values in a single cell, while an App Engine property can have multiple values.
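The differences above (no fixed schema per kind, multi-valued properties, values of different types) can be sketched in plain Java. EntitySketch is a hypothetical in-memory stand-in written for these notes, not the App Engine Entity class, and it does not talk to any datastore.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for an entity; NOT the App Engine Entity class.
class EntitySketch {
    final String kind;
    // Each property name maps to a list of values, so a property can be multi-valued.
    private final Map<String, List<Object>> properties = new LinkedHashMap<>();

    EntitySketch(String kind) {
        this.kind = kind;
    }

    // Sets a property to one or more (non-null) values; the values may differ in type.
    void setProperty(String name, Object... values) {
        properties.put(name, new ArrayList<>(Arrays.asList(values)));
    }

    // Returns the values of a property, or an empty list if it is not set.
    List<Object> getValues(String name) {
        return properties.getOrDefault(name, new ArrayList<>());
    }

    boolean hasProperty(String name) {
        return properties.containsKey(name);
    }

    public static void main(String[] args) {
        EntitySketch rex = new EntitySketch("Dog");
        rex.setProperty("color", "white", "brown"); // one property, two values

        EntitySketch fido = new EntitySketch("Dog"); // same kind...
        fido.setProperty("color", 7);                // ...same name, different type
        fido.setProperty("weightKg", 12.5);          // ...and a property rex lacks

        System.out.println(rex.getValues("color"));
        System.out.println(fido.hasProperty("weightKg") + " " + rex.hasProperty("weightKg"));
    }
}
```

Nothing in the sketch ties the kind "Dog" to a fixed set of properties; any schema would have to be enforced by the application code itself.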
Some things about keys
- Keys cannot be changed once set
- A key can be stored as a property value of another entity, creating something like the concept of a foreign key in a relational database; in this way entity A holds a reference to entity B.
GAE: low-level package to access datastore
com.google.appengine.api.datastore.* package
STEP 1: Create instance of DataStore (ds) for this application
DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
STEP 2: Create instance of Entity called book
Entity book = new Entity("Book");
STEP 3: Set various properties of book Entity with name, value pairs
book.setProperty(name, value);
STEP 4: Add Entity instance book to the DataStore ds
ds.put(book);
Showing code in Java here (can also do in Python)
NOTICE: the application code sets up the entity; no structure is set up beforehand in the datastore.
import java.io.IOException;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
// ...
//STEP 1: Create instance of DataStore for this application
DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
//STEP 2: Create instance of Entity called book
Entity book = new Entity("Book");
//STEP 3: Set various properties of book Entity with name, value pairs
book.setProperty("title", "The Grapes of Wrath");
book.setProperty("author", "John Steinbeck");
book.setProperty("copyrightYear", 1939);
//STEP 3 (continued): Create instance of java.util.Date and set it as a property associated with name "authorBirthdate"
Date authorBirthdate = new GregorianCalendar(1902, Calendar.FEBRUARY, 27).getTime();
book.setProperty("authorBirthdate", authorBirthdate);
//STEP 4: Add Entity instance book to the DataStore ds
ds.put(book); // ...
GAE: higher-level language-specific packages to access datastore
- Python Datastore API
- Java - Java Persistence API (JPA) and Java Data Objects (JDO). Using these over the low-level GAE API may be better, as it makes your Java-based web app more portable to other servers/platforms beyond GAE.
Monitor your Datastore usage and Datastore Entity statistics on your GAE app
Watch out: $$$$$
You can see your datastore statistics by clicking in admin module the
For this example application (see article) you see that the datastore entity "SSItemMark" takes up 97% of the datastore size
and there are 306,569 of them (this can get $$$costly)
Why Object Database --- Google's BigTable
- BigTable, the "data objects in the cloud" technology which undergirds Google's massive applications, has the magic property of being essentially infinitely scalable with respect to the amount of data, and the amount of transaction activity. It is essentially the horizontal "partitioning" or "sharding" of data taken to the extreme.
- databases, and not leveraging most of the relational features. They are paying a big cost for relational functionality without real need. Google App Engine acknowledges this fact, and provides the true object interface that most application developers are using anyway.
- Google App Engine forces you to be explicit about data indexes, but this is something you have to do anyway when horizontally scaling traditional databases. In traditional web application architectures, scalability almost always involves partitioning data among several database instances. The moment you partition data among multiple SQL stores, you have to think about indexes, because searches across those stores require you to perform some sort of scatter/gather (in functional programming and Google parlance, "MapReduce"), and if you aren't careful about how you build indexes, you'll end up with incredible inefficiencies (like having to merge large data sets from multiple stores in memory).
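To see why an explicit index matters, here is a small plain-Java sketch of a single-property index. PropertyIndexSketch and the "book-N" keys are hypothetical and made up for illustration; this is not the datastore's actual index format. The idea is only that a range query can be answered from a sorted structure instead of scanning every entity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of a single-property index; NOT the datastore's real index format.
class PropertyIndexSketch {
    // Sorted map from a property value to the keys of the entities holding that value.
    private final TreeMap<Integer, List<String>> index = new TreeMap<>();

    // Records that the entity with this key has this property value.
    void add(String entityKey, int value) {
        index.computeIfAbsent(value, v -> new ArrayList<>()).add(entityKey);
    }

    // Answers an inclusive range query from the index alone, in ascending value order.
    // Assumes low <= high.
    List<String> keysInRange(int low, int high) {
        List<String> result = new ArrayList<>();
        for (List<String> keys : index.subMap(low, true, high, true).values()) {
            result.addAll(keys);
        }
        return result;
    }

    public static void main(String[] args) {
        PropertyIndexSketch byCopyrightYear = new PropertyIndexSketch();
        byCopyrightYear.add("book-1", 1939);
        byCopyrightYear.add("book-2", 1952);
        byCopyrightYear.add("book-3", 1937);
        System.out.println(byCopyrightYear.keysInRange(1930, 1940));
    }
}
```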
What is horizontal partitioning? -> the DB entries (rows) are distributed horizontally across several stores
Example: horizontal partitioning by region closeness, here into 2 partitions (shards)
How does Google do this? It uses techniques from MapReduce to do the splitting. When you need to get data from the database, it must go out to the different partitions (shards), run the query on each one, and then combine the results. THIS is not simple, and it is a great feature that Google gives you its infrastructure to do this!!!
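The scatter/gather idea can be sketched in a few lines of plain Java. ShardedStoreSketch, the region names, and the customer names are all hypothetical; this is a toy model of partitioning by region and says nothing about how BigTable is actually implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Hypothetical toy model of horizontal partitioning; NOT how BigTable works internally.
class ShardedStoreSketch {
    // One shard (a list of customer names) per region.
    private final Map<String, List<String>> shards = new TreeMap<>();

    // Routes the entry to the shard for its region, creating the shard if needed.
    void put(String region, String customer) {
        shards.computeIfAbsent(region, r -> new ArrayList<>()).add(customer);
    }

    int shardCount() {
        return shards.size();
    }

    // Scatter: run the filter on every shard. Gather: merge and sort the partial results.
    List<String> query(Predicate<String> filter) {
        List<String> merged = new ArrayList<>();
        for (List<String> shard : shards.values()) {
            for (String customer : shard) {
                if (filter.test(customer)) {
                    merged.add(customer);
                }
            }
        }
        Collections.sort(merged);
        return merged;
    }

    public static void main(String[] args) {
        ShardedStoreSketch store = new ShardedStoreSketch();
        store.put("west", "Alice");
        store.put("east", "Bob");
        store.put("west", "Avery");
        store.put("east", "Amir");
        System.out.println(store.query(name -> name.startsWith("A")));
    }
}
```

Even in this toy version, a query has to visit every shard and merge the partial results, which is the work the infrastructure does for you.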