Application scaling
Scaling application in cloud is a burning issue. The
very concept of cloud technology implicates on-demand application scaling. Any decent
cloud provider supports respective functions.
Why is scaling out so important? We have several options
regarding to increasing application performance. We can buy new server
hardware, add more RAM, and so on. This is vertical scaling, or scaling up. However,
this option can be pretty costly, time-consuming, and has its limits. You can
go for cutting-edge technology, but it can still be insufficient for your
application.
The second option is horizontal scaling, or
scaling out. It implies extending the computing power of the application by
increasing the number of servers where the application is located, or
application instances in case of PaaS. In other words, if the application is
located on one server, and at some point it stops "pulling its
weight", we can simply buy another server, identical to the first one, and
install our application on it. This way, the queries to the application will be
divided between the two servers.
This is the fundamental principle of scaling out
applications hosted in a "cloud", but instead of real physical
servers, we have the notion of a virtual machine. When one copy of a virtual
machine becomes insufficient for our application, we can increase the
performance by distributing the workload between several virtual machines.
The capabilities of Microsoft cloud platform are
quite broad. They include auto-scaling and on-demand scaling, which are available
through a user interface as well as through SDK, REST API and PowerShell.
Although scaling applications (PaaS) or virtual
machines (IaaS) is simple – just define the necessary number of instances and
you got them – more questions will arise if your application uses MS SQL
databases. The first thing that comes to mind is to create a cluster of virtual
machines with SQL Server. This solution is relatively simple and widely known.
But what if the application uses the database as a service (SaaS)? And what if
we don't want to fiddle with setting up an SQL Server cluster?
Since we
are dealing with Windows Azure, the SQL database should be SQL Azure. This
database supports a scaling out technology (sharding) called SQL Azure
Federations. It is very simple: logically independent lines of one table are
stored in separate databases. A simple example:
What do we get from all of this? First of all,
data isolation. There is no connection between data from one account and data
from another. Thereafter, sharding allows us not only to scale out databases,
but also to implement multi-tenant scenarios. Since the amount of data stored
in a specific shard is significantly smaller than the total size of the
database, data retrieval is much faster.
SQL Azure limitations
Every technology has its drawbacks, and sharding
is not an exception. Let's consider what drawbacks we may encounter with the
sharding technology, using the database described in the example above.
As specified by the SQL Azure architecture, the
service does not support data retrieval from several databases simultaneously.
One database equals one connection. The shards are not an exception. We would
have to run a separate query to return the number of clients in the database
for every shard.
Initial query:
SELECT Count(*) FROM Account
Query for a
specific shard:
USE FEDERATION Accounts(AccountId = 4) WITH RESET, FILTERING = OFF
GO
SELECT Count(*) FROM Account
The logic
of summing up the values returned by this query must be implemented in the application. As a result of using federations, part
of the code will "go into" the application, as some of the
database-level capabilities of regular SQL Server are limited.
Foreword
SQL Azure
Federations is not a silver bullet, of course. You can always develop and
implement your own principle for scaling out your database. Let's assume that
the multi-tenant approach is a kind of database scaling out, as one user’s data
is separated from the data of another not only "logically", but also
"physically".
If we need
to add new users, we configure separate databases for them. The problem is that
the application logic must contain a "routing" mechanism, meaning the
application must know which database it is currently working with.
But let's return to SQL Azure Federations.
The idea
offered by Microsoft idea itself is worthy to be praised. It would be great to
have a tool for scaling out databases quickly and easily. Add to that
auto-scaling based on the results of specific queries (but that would be pure
sci-fi)...
As a rule, before you make the final decision to
start using SQL Azure Federations, you must perform a detailed analysis of the
existing database (because you usually port existing databases), or plan out in
minute detail the architecture of the database that the application will be
using, as well as the logic of the application that will be working with the
database.
That’s all for now with the theory, but when it
comes to practice, there is usually a significant number of obstacles. Instead
of learning how easy it is to start using SQL Azure Federations from scratch,
we will try to migrate an existing SQL Azure database to using federations. We
will observe the steps performed by the DBA, as well as the problems that can
be encountered during the migration.
So stay tuned and have a great day!
Thank you for the article!
ReplyDeleteYour advice here is very important for my work on project connected with Ideals virtual data rooms.