CS6320:  SW Engineering of Web Based Systems

 

Google App Engine

https://appengine.google.com/

Function of GAE

  • Using Google's infrastructure to host and build your web applications
  • Free account for limited bandwidth; see http://code.google.com/appengine/ for details
  • NOT a virtual machine environment like Amazon or other "IaaS cloud computing environments".
    • It is a platform as a service (PaaS) cloud computing platform for developing and hosting web applications in Google-managed data centers.
    • Applications are sandboxed and run across multiple servers
    • App Engine offers automatic scaling for web applications—as the number of requests increases for an application, App Engine automatically allocates more resources for the web application to handle the additional demand
  • GAE serves billions of page views a day across all applications.
  • You can scale up to more than 7 billion pages per day and scale way down to match your needs.
  • NOTE: Amazon is now getting into PaaS (http://aws.amazon.com/elasticbeanstalk/), but GAE is still considered the leader in PaaS

 

Why Google App Engine? Provides solution for Data, Cache, Authentication, etc.

  • How many developers or architects have the experience and the mindset to build applications that support hundreds of thousands of concurrent users and stay up all the time?

  • Scaling Big is Really Hard

  • "Commoditization of Software Architecture and Scaling Skills"

  • Horizontal scaling model

    • this is not a model that most web application developers have experience with.

    • instead of using more capable hardware (vertical scaling), you use more instances of less-capable hardware, each handling a slice of the work, often doing the same function (e.g. sliced between groups of users).

    • intent is to reduce centralization of resources

    • ultimate goal is to simply be able to add more instances of the hardware without limit to meet increased scale requirements.

  • Google AppEngine Makes It Easy by Imposing a New Architecture
  • You won't have to worry about things like replicating state, scaling datastores, building caches, etc. You won't have to hire as many really smart systems architects to bend your applications around the constraints imposed by a requirement of unbounded scaling.
  • BUT THERE ARE SOME RESTRICTIONS
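The horizontal scaling model described above (many identical instances, each handling a slice of the users) can be illustrated with a minimal sketch. The class name `UserSharding` and the method `instanceFor` are hypothetical names invented for this illustration; App Engine does this kind of routing for you, and nothing here is App Engine API.

```java
/** Hypothetical illustration: route each user to one of N identical instances. */
final class UserSharding {

    /** Picks the instance index for a user id; floorMod keeps the result non-negative. */
    static int instanceFor(String userId, int instanceCount) {
        if (instanceCount <= 0) {
            throw new IllegalArgumentException("instanceCount must be positive");
        }
        return Math.floorMod(userId.hashCode(), instanceCount);
    }

    public static void main(String[] args) {
        String[] users = {"alice", "bob", "carol", "dave"};
        int instances = 3; // scaling out means raising this number, not buying a bigger machine
        for (String user : users) {
            System.out.println(user + " -> instance " + instanceFor(user, instances));
        }
    }
}
```

Each instance does the same job for its own slice of users, so capacity grows by adding instances. This simple modulo scheme reassigns many users whenever the instance count changes, which is one reason real platforms use more elaborate routing.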

"Here's why Google App Engine is important, ---If you build your app on the Google App Engine architecture, it will scale to unlimited levels without any extra effort. Full stop."

What is Free and What is NOT

  • FREE: All applications have a default quota configuration, the "free quotas", which should allow for roughly 5 million pageviews a month for an efficient application. You can read more about system quotas in the quota documentation.
  • PAY FOR MORE: As your application grows, it may need a higher resource allocation than the default quota configuration provides. You can purchase additional computing resources by enabling billing for your application. Billing enables developers to raise the limits on all system resources and pay for even higher limits on CPU, bandwidth, storage, and email usage

 

What it can handle / some features

  • data storage (distributed---Google's solution to data)

  • application hosting

  • application version control

  • services: authentication (OAuth), Google file system, and more

  • scalable infrastructure (e.g. Memcache)

  • Support for Python, Java, PHP and Go, with an SDK so you can develop locally on your machine and then deploy to Google App Engine

  • web-based administration interface

 

GAE Architecture

 

2 Datastore options on Google AppEngine

http://code.google.com/appengine/docs/java/datastore/hr/

  • discuss later

Java on Google AppEngine

  • Java 7 JVM

  • Java Servlets and JSP

  • support for standard interfaces to the App Engine scalable datastore and services

    • such as JDO, JPA, JavaMail, and JCache

 

Java on Google App Engine with the Eclipse Plugin

 

Other Java IDEs for Google AppEngine

 

Java Servlets, JSPs on Google AppEngine

  • You provide your app's servlet classes, JavaServer Pages (JSPs), static files and data files, along with the deployment descriptor (the web.xml file) and other configuration files, in a standard WAR directory structure.

  • App Engine serves requests by invoking servlets according to the deployment descriptor.

 

Restrictions on Java Servlets in Google AppEngine--- Limitations & New Way of Thinking

To allow App Engine to distribute requests for applications across multiple web servers, and to prevent one application from interfering with another, the application runs in a restricted "sandbox" environment.

  • The JVM runs in a secured "sandbox" environment to isolate your application for service and security.

  • The sandbox ensures that apps can only perform actions that do not interfere with the performance and scalability of other apps.

  • For instance, an app cannot spawn threads, write data to the local file system or make arbitrary network connections. (Writing to a local file system cannot be allowed when the app is distributed across many servers, because a file written on one server would not be visible on the others; you need to use the datastore instead.)

    • HOWEVER: apps can use the URL Fetch service to access resources over the web, and to communicate with other hosts using the HTTP and HTTPS protocols. Java apps can simply use java.net.URLConnection and related classes from the Java standard library to access this service.

  • An app also cannot use JNI or other native code.

  • See Servlet Environment for more information.

  • You need to unlearn some deeply held principles about "efficiency" and scalability.

    • Example: your instincts as a developer are to keep as much state (e.g. web sessions) in memory between requests as possible. With App Engine, however, you'll learn to accept a certain fixed (i.e. invariant with respect to scale) latency of accessing BigTable (Google's 'data cloud') in exchange for never having to worry about any added latency in handling 100,000 (or a million) concurrent users.

    • WHY USE App Engine? As a software architect, you take a fixed hit for the benefit of infinite scaling.

 

WHAT YOU CAN'T DO, AGAIN

  • write to the filesystem. SOLUTION --- Applications must use the App Engine datastore for storing persistent data. Reading from the filesystem is allowed, and all application files uploaded with the application are available.
  • open a socket or access another host directly. SOLUTION --- An application can use the App Engine URL fetch service to make HTTP and HTTPS requests to other hosts on ports 80 and 443, respectively.
  • spawn a sub-process or thread. CAVEAT --- A web request to an application must be handled in a single process within a few seconds. Processes that take a very long time to respond are terminated to avoid overloading the web server.
  • make other kinds of system calls.   
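As a rough sketch of the URL fetch route mentioned above: the notes say Java apps can use java.net.URLConnection and related standard-library classes, so the example below uses only those. The class name `FetchSketch`, the helper names, and the 5-second timeouts are assumptions made for illustration, not values taken from the App Engine documentation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: outbound HTTP(S) through standard-library classes only. */
final class FetchSketch {

    /** Reads an entire stream as UTF-8 text. */
    static String readAll(InputStream in) throws IOException {
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                body.append(buf, 0, n);
            }
        }
        return body.toString();
    }

    /** Issues an HTTP GET; no raw sockets are opened by the application code. */
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000); // assumed value; keep well under the request deadline
        conn.setReadTimeout(5000);    // assumed value
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        } finally {
            conn.disconnect();
        }
    }
}
```

A servlet would call something like `FetchSketch.fetch("https://example.com/")` from inside its request handler; the short timeouts matter because the whole request must still finish within the platform's time limit.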


One potential problem -- no threads and no classes using threads

A Java application cannot create a new java.lang.ThreadGroup nor a new java.lang.Thread. These restrictions also apply to JRE classes that make use of threads. For example, an application cannot create a new java.util.concurrent.ThreadPoolExecutor, or a java.util.Timer. An application can perform operations against the current thread, such as Thread.currentThread().dumpStack().

 

 

Another potential problem --- the time your app takes to respond is limited

  • All requests (including tasks) in App Engine have a time limit (30 seconds, but see the current limits in the Google documentation). If your calculations will take longer than that, you will need to figure out a way to break them down into smaller chunks. App Engine's sweet spot is web apps, not number crunching.
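One way to break a long calculation into smaller chunks, sketched under assumptions: each request (or task) processes a bounded slice of the work and hands back a small progress record that would be saved between requests. The names `ChunkedSum`, `Progress`, and `processChunk` are hypothetical, and the summing loop is only a stand-in for real work; how the progress record is persisted and how the next request is triggered are left out.

```java
/** Hypothetical illustration of splitting one long job across several short requests. */
final class ChunkedSum {

    /** Progress that would be persisted (e.g. in the datastore) between requests. */
    static final class Progress {
        final long nextIndex;   // first index not yet processed
        final long partialSum;  // result accumulated so far

        Progress(long nextIndex, long partialSum) {
            this.nextIndex = nextIndex;
            this.partialSum = partialSum;
        }
    }

    /** Processes at most chunkSize items of the range [0, total) and returns updated progress. */
    static Progress processChunk(Progress p, long total, long chunkSize) {
        long end = Math.min(total, p.nextIndex + chunkSize);
        long sum = p.partialSum;
        for (long i = p.nextIndex; i < end; i++) {
            sum += i; // stand-in for the real per-item computation
        }
        return new Progress(end, sum);
    }

    public static void main(String[] args) {
        long total = 1000;
        Progress p = new Progress(0, 0);
        int requests = 0;
        while (p.nextIndex < total) { // each iteration stands for one short request or task
            p = processChunk(p, total, 300);
            requests++;
        }
        System.out.println(requests + " requests, sum = " + p.partialSum);
    }
}
```

Because `processChunk` takes all of its state as arguments and returns the new state, any server can run the next chunk, which fits the stateless style the platform expects.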

 

More on Java on Google AppEngine --data storage, caching, etc.

  • http://code.google.com/appengine/docs/java/overview.html

 

When you have depleted your budgeted (or free) resources for an app

  • When an application consumes all of an allocated resource, the resource becomes unavailable until the quota is replenished.

  • This may mean that your application will not work until the quota is replenished.

  • For resources that are required to initiate a request, when the resource is depleted, App Engine by default returns an HTTP 403 Forbidden status code for the request instead of calling a request handler. The following resources have this behavior:

    • Bandwidth, incoming and outgoing

  • For all other resources, when the resource is depleted, an attempt in the application to consume the resource results in an exception. This exception can be caught by the application and handled, such as by displaying a friendly error message to the user. In the Python API, this exception is apiproxy_errors.OverQuotaError. In the Java API, this exception is com.google.apphosting.api.ApiProxy.OverQuotaException.

 

 

How Google App Engine achieves its scalability --- is in its restrictions

  • In some senses, an AppEngine app looks just like a traditional web app, answering HTTP requests, and rendering HTTP responses. But the subtle difference (and what makes it all scalable) is that each request handler must be entirely stateless.

  • This means no web session state on the server side - no data specific to this user can be stored in memory between requests

  • This is good because it allows requests to be routed to any server in the world that has the code loaded (and can access BigTable) - this is what enables infinite scalability.

  • But because of this subtle little change, a developer may have to rethink what they are doing in each request.

A QUOTE: "Do your processing, modify your data store, fire off other behavior (other events), and return. You don't worry about threading (mostly), you don't worry about hanging the server (there is no "the server"), you don't worry about resources beyond those which you consume in handling the immediate request."
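The stateless pattern described above can be sketched as follows. The `Store` interface and `InMemoryStore` class are hypothetical stand-ins for a shared datastore, written only so the example runs on its own; they are not the App Engine datastore API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of a stateless request handler backed by a shared store. */
final class StatelessCounter {

    /** Stand-in for the shared datastore; not the App Engine API. */
    interface Store {
        int get(String key);
        void put(String key, int value);
    }

    /** In-memory stand-in so the sketch runs locally; a real app would use the datastore. */
    static final class InMemoryStore implements Store {
        private final Map<String, Integer> data = new HashMap<>();

        public int get(String key) {
            Integer value = data.get(key);
            return value == null ? 0 : value;
        }

        public void put(String key, int value) {
            data.put(key, value);
        }
    }

    /**
     * Handles one request: load state, update it, save it, return.
     * Nothing about the user is kept in handler fields between requests.
     * Note: this read-modify-write is not atomic; a real app would need a transaction.
     */
    static String handleVisit(Store store, String userId) {
        int visits = store.get(userId) + 1;
        store.put(userId, visits);
        return "Hello " + userId + ", visit #" + visits;
    }
}
```

Since `handleVisit` reads everything it needs from the store and writes everything back before returning, two consecutive requests from the same user can land on different servers and still see a consistent visit count.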

 

Why Not GAE?

  • when you are going to exceed the free usage and have no budget --- but what can you do with no budget anyway? Nothing!

  • when you have a budget but your application(s), or some set of them, do not need scalability --- maybe they can be served by cheaper hosting solutions or by buying and maintaining your own app server(s).

  • possibly you are a mid to large size company and it might make sense to have your own servers (farm, architecture for scalability, etc.) --- you don't want to share.

  • well --- even for things that need scalability you might consider competitors like Amazon (even more flexible, in my opinion) ... but no free options

What are the alternatives? (Well, they're not equivalents but potential alternatives --- some are not as scalable, some have different kinds of restrictions ... the most famous is Amazon.)

GAE growth


© Lynne Grewe