Part 2: In-Memory Services That Own Their Data, i.e., Object-Oriented Service Development (SOA 2.0+)
Tuesday, July 30th, 2013
In our last segment we introduced the concept of ProjectX, our next-generation model for rapidly developing fast, scalable services that are exposed as REST and WebSocket services. Please remember that ProjectX is a placeholder name; the final name is yet to be determined.
Feeling creative? Please feel free to send name suggestions to sales@caucho.com.
Think Differently About Service-Oriented Development
In the same way NoSQL is an alternative way of thinking about data storage, ProjectX is a different way to think about writing services.
When you develop with ProjectX, you develop in a service-first manner. You spend your time writing objects rather than on repeated cycles of complex schema design and schema migration. You should also see a significant decrease in cache coherency issues.
How does it work?
If you recall, objects are data and logic. Your data is in your objects, and the objects that you have in memory hold the data that you need. ProjectX just makes sure that those objects are backed up to disk. This allows you to write services that are in Java and only Java.
ProjectX and Non-Blocking RPC Services
Often when people think about services, they think about blocking RPC services like REST. ProjectX allows you to easily develop non-blocking RPC services, as well as allowing you to register callbacks and/or use one-way method calls. ProjectX allows these services to be consumed over HTTP/REST/JSON or WebSocket/JSON.
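The three call styles mentioned above (non-blocking call, callback, and one-way call) can be sketched in plain Java. This is only an illustration under stated assumptions: `GreetingService` and all of its methods are hypothetical names invented for this example, and a single-threaded executor stands in for whatever dispatch mechanism ProjectX actually uses. It is not the ProjectX API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical service used only for illustration; not the ProjectX API.
class GreetingService {
    // Assumption: one worker thread handles every call, in arrival order.
    private final ExecutorService inbox = Executors.newSingleThreadExecutor();
    private final List<String> events = new ArrayList<>();

    // Callback style: returns immediately; the reply is handed to the callback later.
    void greet(String name, Consumer<String> callback) {
        inbox.execute(() -> callback.accept("Hello, " + name));
    }

    // One-way style: the caller neither waits for nor receives a result.
    void record(String event) {
        inbox.execute(() -> events.add(event));
    }

    // Non-blocking query: the caller gets a future and decides when to look at it.
    CompletableFuture<Integer> eventCount() {
        return CompletableFuture.supplyAsync(events::size, inbox);
    }

    // Stops the worker thread so the JVM can exit.
    void shutdown() {
        inbox.shutdown();
    }
}
```

A caller might invoke `svc.greet("Ada", reply -> System.out.println(reply))` and carry on with other work; none of the three methods blocks the calling thread.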
ProjectX has an open wire protocol, based on JSON, called JAMP that is easy to implement. You can call into any ProjectX service from any language. All that is required is for that language to have HTTP support and JSON support (so Ruby, Go, C#, Java, Python, JavaScript). If the language also has WebSocket support, then the conversation can be bi-directional and very efficient. (We also have HAMP, which is Hessian based. Hessian is a binary protocol which has been ported to many languages including ActionScript, Python, Java, C#, and others.)
Object-Oriented Development Is Logic and Data Together
One of the original complaints about J2EE was that it pulled developers away from the object-oriented model. You ended up writing procedural code, and all of the data existed in the database. All services were stateless and held only logic. No service maintained its own state; instead, services relied on third-party frameworks to map objects to a relational database.
Frameworks like Spring, CDI, Hibernate and Guice mitigated some of the early issues with J2EE and its lack of OO. The NoSQL movement also made mapping objects to a database easier. But most modern service development still pushes the data into a database or some other data store, which separates the data from the service. In this common model, services do not own their data.
Object-oriented programming means objects owning their data and logic. Stateless services are more akin to procedural programming. ProjectX allows true and productive OO development of your services.
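To make the contrast concrete, here is a minimal sketch of a service that owns its data, written as a plain Java object. `AccountService` and its methods are hypothetical names chosen for this example; nothing here is ProjectX-specific, and persistence is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: the service object holds its own state and the rules
// that guard it, instead of delegating both to a database.
class AccountService {
    // The operational data lives in the object itself.
    private final Map<String, Long> balancesInCents = new HashMap<>();

    // Adds money to an account, creating the account on first use.
    void deposit(String accountId, long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balancesInCents.merge(accountId, cents, Long::sum);
    }

    // Returns false (and changes nothing) when the balance would go negative.
    boolean withdraw(String accountId, long cents) {
        long balance = balance(accountId);
        if (cents <= 0 || cents > balance) {
            return false;
        }
        balancesInCents.put(accountId, balance - cents);
        return true;
    }

    // Unknown accounts report a zero balance.
    long balance(String accountId) {
        return balancesInCents.getOrDefault(accountId, 0L);
    }
}
```

The overdraft rule sits next to the data it protects, which is the point of the object-oriented model described above. In a stateless design the same rule would typically be split between service code and a database query.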
Don’t Misapply Distributed Caches and DataGrids
When you start using caches and DataGrids to speed up storage and retrieval to databases, you are trading the problem of latency for a new, more complex set of problems, cache coherency and split brain to name just two. These are not easy issues to handle.
In the ProjectX model, services own their data and the objects are in memory. ProjectX enables service developers to back those objects to disk in the most efficient manner possible. You no longer need a database merely for data safety. You can still use a database for reporting, but now your operational data can exist purely in Java.
Did you know that a modern commodity hard disk can read/write up to 300 MB per second? If you are using SSD, the sequential reads are up to 500 MB per second. Phase-change memory and advances in Flash mean that this speed will increase. If you add RAID level 0 support, this speed can increase by several multiples. ProjectX's journaling and data store take advantage of sequential writes to ensure data safety at top speeds. More details about this are in subsequent posts.
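The general idea of journaling with sequential writes can be sketched as follows: every change is appended to the end of a file before it is applied in memory, and the file is replayed on startup to rebuild the in-memory state. This is a simplified illustration, not ProjectX's journal. The class name `JournaledCounters` and the "key delta" line format are assumptions made for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: in-memory counters backed by an append-only journal.
class JournaledCounters {
    private final Path journal;
    private final Map<String, Long> counters = new HashMap<>();

    // Rebuilds the in-memory state from the journal, if one exists.
    JournaledCounters(Path journal) {
        this.journal = journal;
        replay();
    }

    // Appends the change to the journal first, then applies it in memory.
    long add(String key, long delta) {
        if (key.contains("\n") || key.contains("\r")) {
            throw new IllegalArgumentException("key must not contain line breaks");
        }
        append(key + " " + delta + "\n");
        return counters.merge(key, delta, Long::sum);
    }

    // Reads are served from memory only.
    long get(String key) {
        return counters.getOrDefault(key, 0L);
    }

    // Sequential write: records are only ever added at the end of the file.
    // A production journal would also batch writes and force them to disk.
    private void append(String record) {
        try {
            Files.write(journal, record.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Re-applies every journaled change in the order it was written.
    private void replay() {
        if (!Files.exists(journal)) {
            return;
        }
        try {
            for (String line : Files.readAllLines(journal, StandardCharsets.UTF_8)) {
                int space = line.lastIndexOf(' ');
                if (space < 0) {
                    continue; // skip lines that are not "key delta"
                }
                long delta = Long.parseLong(line.substring(space + 1));
                counters.merge(line.substring(0, space), delta, Long::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Creating a second `JournaledCounters` on the same path after a restart yields the same counter values, which is the crash-recovery property the journal exists to provide. The sketch omits checksums, torn-write detection, and compaction, all of which a real journal would need.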
Using ProjectX is as easy as just using a few simple annotations. Your code will look like code written for a typical service in EJB 3, Spring or Guice. But with ProjectX you can avoid the common mistake of using the database as a synchronization mechanism.
Using the database as a synchronization mechanism is an anti-pattern that causes many performance and scalability problems in service development. Rest assured, ProjectX is a Java POJO approach to development. Your code can be completely annotation free, or, if you choose to use Java EE/CDI, you can use a few annotations for productivity. Your code base has very little to no direct tie to ProjectX. It is just Java. We don’t try to tie you to our platform.
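As an illustration of the alternative to database-based synchronization, the sketch below lets one service object own a piece of state and apply every update on a single thread, so no row locks or `SELECT ... FOR UPDATE` round trips are needed to keep concurrent requests consistent. `SeatService` and its methods are hypothetical names, and the single-threaded executor is an assumption made for this example rather than a description of how ProjectX schedules work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: the service, not the database, serializes updates.
class SeatService {
    // Assumption: every read and write of seatsLeft runs on this one thread.
    private final ExecutorService owner = Executors.newSingleThreadExecutor();
    private int seatsLeft;

    SeatService(int seats) {
        this.seatsLeft = seats;
    }

    // Check-then-update is safe without locks because requests run one at a time.
    CompletableFuture<Boolean> reserve(int count) {
        return CompletableFuture.supplyAsync(() -> {
            if (count <= 0 || count > seatsLeft) {
                return false;
            }
            seatsLeft -= count;
            return true;
        }, owner);
    }

    // Reads go through the same thread, so they always see a consistent value.
    CompletableFuture<Integer> remaining() {
        return CompletableFuture.supplyAsync(() -> seatsLeft, owner);
    }

    // Stops the owner thread so the JVM can exit.
    void shutdown() {
        owner.shutdown();
    }
}
```

If ten callers each try to reserve one of five seats, exactly five of the returned futures complete with `true`, regardless of how the requests interleave.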
The Real Expense of Abusing Caching
Using ProjectX also enables you to avoid the anti-pattern of duplicating all the data in the database and every possible query of that data in a data grid or data cache. By using a cache or adding a lot of complexity to your application, you may incur problems of cache coherency and split brain. If all you know is horizontal scaling and caching, every large-scale system looks like a nail. ProjectX can be the hammer.
It is very easy and, from my experience, very common to paint a project into a corner by abusing caching. Caching is the equivalent of applying a quick and dirty (as in dirty read) Band-Aid solution that can cause many operational and development issues down the road. Many have worked on projects that had 80 GB of data, but the same data existed in many cache layers to the tune of 12 TB of RAM, roughly 150 times the size of the data itself. There are projects that solve all these issues with more horizontal scale out and more caching, and these projects can quickly become a vast waste of hardware and developer productivity, not to mention the near impossibility of properly invalidating a cache. Misapplying horizontal scale out and caching has wasted countless developer and operations-engineering years.
Using ProjectX does not preclude horizontal scale out and caching. But when you have services that are up to 10x, or as much as 100x, more efficient and don’t require a cache for all of their data, you reduce cache coherency issues and you need fewer server instances. It would not be uncommon to replace 10 to 100 hardware servers running services written the traditional way with six to 12 servers using ProjectX. Development with the ProjectX approach should also be 2x to 10x faster than normal service development (database, cache, Java REST lib, JPA, local cache and distributed cache). Also, since you have fewer servers and fewer things to worry about (like cache coherency issues, which are some of the least fun things in the world to chase down and debug), your operations costs should be 2x to 10x lower as well.
ProjectX fully supports horizontal scaling. You can service many more requests/connections from your services. ProjectX is, in fact, a distributed system for service development. More about this will be covered in the next post.
Services Should Own Their Own Data
ProjectX allows the service to own its data, and ProjectX provides a fast storage mechanism for crash recovery. ProjectX allows your objects to be served out of memory.
In the ProjectX approach your operational data is your Java objects.
ProjectX provides journaling, replication and fast persistence. The emphasis is not on the persistence. The persistence is a foregone conclusion managed mostly by ProjectX for data safety. This feature allows you to focus on your business logic and derive real value from your services.
Do you want to focus on enhancing the business value of your service or on managing database mapping and cache coherency issues?
The Real Win: The Ability To Develop Faster and Streamline Your System
Just as NoSQL was built for horizontal scaling but found a home in the hearts of developers who wanted to avoid schema migration and wanted a more productive, dynamic schema, ProjectX has big productivity wins as well. You don’t have to be the next Internet sensation to get benefits out of ProjectX. If you want to focus on providing business value instead of feeding complexity, then ProjectX is a good fit for you.
We feel that once you start developing services with ProjectX, you will not want to develop them any other way. Instead of dumbing down distributed service development, we put the engineering rigor and computer science back into service development. You get to take full advantage of your distributed system. Ultimately, and most importantly, you get to focus on writing more collaborative, richer applications. Features that were once cost prohibitive, or could never be squeezed into the budget, are now easy to develop. ProjectX is a very practical, user-friendly way to create massively collaborative and rich applications. It makes nearly impossible development easy.
ProjectX makes sense for both enterprise applications and mobile applications that need to send six million requests per second. ProjectX is simply a more productive way to build services.
Tune in next time when we show you some basic code examples from ProjectX.
If you would like to learn more about ProjectX or become an early adopter or early evaluator, please contact sales@caucho.com.
Check out part 3 with code examples.