Black Friday Madness – Musings of an E-Commerce Developer

As I sit here typing this post, somewhere close to 30 million Americans are pushing, shoving and otherwise cramming themselves into retail stores after a day of gluttony to partake in the ritual of Black Friday. It’s the one magical day when many retailers post the largest profits of the year, offering great deals to pull more shoppers into their stores for the frenzy of frivolity. Whether it’s brick-and-mortar stores or their online counterparts, the overarching goal is a common one: throughput. The premise is simple: the more people you can get into your store looking at merchandise, and the more checkout lanes you have open, the more customers you can process in a given period of time and the more money you can potentially make.

For your local retail stores, this means many things. Product placement is critical: highly popular products are usually placed deeper in the store to raise the odds that you’ll make an impulse purchase of some fantastic deal on an item you normally wouldn’t buy at all. It’s also important for stores to have a flow that can accommodate a mass of customers desperate for retail therapy, so aisles that contain popular products are wide, and a conduit that circumnavigates the store is kept clear of obstructions at all times. I personally think it would be an interesting and amusing application of the Ford-Fulkerson algorithm to maximize the flow of customers through a store, assigning higher capacities to aisles with more tempting goodies or doorbuster sales.
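
Purely for amusement, here’s what that might look like: a toy sketch of the Edmonds-Karp variant of Ford-Fulkerson (breadth-first search to find augmenting paths), where nodes are areas of the store and edge capacities are how many shoppers per minute an aisle can carry. The layout and all of the numbers are invented for illustration:

    using System;
    using System.Collections.Generic;

    class StoreFlow
    {
        //Edmonds-Karp (BFS-based Ford-Fulkerson): nodes are areas of the store,
        //capacities are how many shoppers per minute an aisle can carry
        static int MaxFlow(int[,] capacity, int source, int sink)
        {
            int n = capacity.GetLength(0);
            var residual = (int[,])capacity.Clone();
            int maxFlow = 0;

            while (true)
            {
                //breadth-first search for a path with remaining capacity
                var parent = new int[n];
                for (int i = 0; i < n; i++) parent[i] = -1;
                parent[source] = source;

                var queue = new Queue<int>();
                queue.Enqueue(source);
                while (queue.Count > 0 && parent[sink] == -1)
                {
                    int u = queue.Dequeue();
                    for (int v = 0; v < n; v++)
                    {
                        if (parent[v] == -1 && residual[u, v] > 0)
                        {
                            parent[v] = u;
                            queue.Enqueue(v);
                        }
                    }
                }

                if (parent[sink] == -1)
                    return maxFlow; //no augmenting path left; the flow is maximal

                //find the bottleneck capacity along the path, then push flow through
                int bottleneck = int.MaxValue;
                for (int v = sink; v != source; v = parent[v])
                    bottleneck = Math.Min(bottleneck, residual[parent[v], v]);

                for (int v = sink; v != source; v = parent[v])
                {
                    residual[parent[v], v] -= bottleneck;
                    residual[v, parent[v]] += bottleneck; //allow flow to be "undone"
                }

                maxFlow += bottleneck;
            }
        }

        static void Main()
        {
            //0 = entrance, 1 = electronics, 2 = housewares, 3 = checkout
            var aisles = new int[4, 4];
            aisles[0, 1] = 10; //entrance -> electronics
            aisles[0, 2] = 5;  //entrance -> housewares
            aisles[1, 3] = 7;  //electronics -> checkout
            aisles[2, 3] = 8;  //housewares -> checkout

            Console.WriteLine(MaxFlow(aisles, 0, 3)); //prints 12
        }
    }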

The Reckoning Approaches

Online stores take a different approach to throughput. For these virtual storefronts, more people hitting the site translates into more sales, which means the site needs to support intense loads at key sale times. So how can developers for online retailers prepare their sites for the digital onslaught of the Black Friday/Cyber Monday one-two punch? In this article I will focus on the async and await keywords as a way to improve throughput on websites. These keywords, introduced with C# 5.0, are the language support for the Task-based Asynchronous Pattern (TAP). It’s not that async/await allow you to do things that weren’t possible before: asynchronous patterns have existed since the earliest versions of the .NET Framework in the form of the Asynchronous Programming Model (APM), which brought forth methods like BeginInvoke/EndInvoke and the infamous IAsyncResult. The new keywords just make asynchronous code much easier to implement and much more maintainable as a result. I also want to make it clear that async/await is not a magic bullet; it is only part of the total solution. Other throughput-improving techniques like output caching and load balancing are still just as important in squeezing the most juice out of your servers.
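
To see the maintainability difference, here’s a minimal sketch of the same HTTP call written both ways; the payment endpoint URL is made up for illustration:

    using System;
    using System.Net;
    using System.Threading.Tasks;

    class ApmVersusTap
    {
        //the old APM style: logic is split between the Begin call and its callback
        static void FetchWithApm()
        {
            var request = (HttpWebRequest)WebRequest.Create("https://payments.example.com/");
            request.BeginGetResponse(asyncResult =>
            {
                //this callback runs on a thread pool thread when the response arrives
                using (var response = (HttpWebResponse)request.EndGetResponse(asyncResult))
                {
                    Console.WriteLine(response.StatusCode);
                }
            }, null);
        }

        //the TAP style with async/await: the same logic reads top to bottom
        static async Task FetchWithTapAsync()
        {
            var request = (HttpWebRequest)WebRequest.Create("https://payments.example.com/");
            using (var response = (HttpWebResponse)await request.GetResponseAsync())
            {
                Console.WriteLine(response.StatusCode);
            }
        }
    }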

Let’s analyze a typical ASP.NET web request. When a page request is received, a thread from the worker process’s thread pool is dispatched to handle it. That thread is held while the server processes and builds the response, and once the response has been returned to the client, the thread goes back to the pool to be used for another request. It’s mind-numbingly simple, but there are some gotchas. A worker process has a finite number of threads it can dispatch, because machine resources are limited and, as the thread count grows too high, the cost of context switching begins to hurt performance as well. If more requests come in than there are threads to process them, requests begin to queue up, users can’t get to the site, and frustrated shoppers mean lower sales numbers. The problem is that some requests take longer to process because they simply do more stuff, and the stuff I’m talking about here is specifically non-CPU-bound work. Consider that you’re shopping at your favorite online store and you’re in the process of checking out. After you fill in all of the pertinent information and hit the purchase button, you incur at least one database call, a service call for payment processing and maybe the sending of an order confirmation email, all before the response is sent back to the client. In a traditional ASP.NET MVC website, this can look something like the following:

    public class CheckoutController : Controller
    {
        private readonly IPaymentProcessor _paymentProcessor;
        private readonly IOrderRepository _orderRepository;
        private readonly IEmailGenerator _emailGenerator;

        public CheckoutController(IPaymentProcessor paymentProcessor, IOrderRepository orderRepository, IEmailGenerator emailGenerator)
        {
            _paymentProcessor = paymentProcessor;
            _orderRepository = orderRepository;
            _emailGenerator = emailGenerator;
        }

        public ActionResult ProcessPurchase(OrderViewModel orderData)
        {
            orderData.ValidateData();

            ProcessPaymentAndSave(orderData); //make various service calls (this can take some time)

            return View("Confirmation", orderData);
        }

        private void ProcessPaymentAndSave(OrderViewModel orderData)
        {
            //collect the payment information
            orderData.PaymentDetails.AuthorizationCode = _paymentProcessor.ProcessPayment(orderData.PaymentDetails);

            //save the order to the database
            orderData.OrderId = _orderRepository.Save(orderData);

            //generate and send a confirmation email.
            _emailGenerator.SendConfirmationEmail(orderData);
        }
    }

These operations are I/O-bound and can take a fair amount of time to complete, and the thread processing the request sits idle waiting for all of them to finish before it can serve another request.

Defending against the Horde

As I mentioned above, asynchronous patterns in .NET have been around for quite a while, and they center on the idea that a call can be broken into two parts: the beginning (call) portion and the end (callback) portion. Under this model the web server still dispatches a thread to process the incoming request, but when the asynchronous call is made, the thread is returned to the pool to process other incoming requests. Once the call has completed, ASP.NET is notified and the callback is queued onto the thread pool (quite possibly on a different thread than the one the call was sent on, I might add) to pick up where it left off. This allows the web server to be more efficient with thread management, and more requests can be handled simultaneously as a result. It must be noted that to get the benefits of truly asynchronous calls, it’s not sufficient to simply make the entry point asynchronous; the underlying long-running I/O operations must also be refactored to use their asynchronous counterparts. As of .NET 4.5, many classes in the framework have been augmented with asynchronous methods that return Task or Task<TResult> objects, as opposed to the earlier asynchronous model of calling into the Begin/End methods provided on those classes. This means that if we are sending info to a payment gateway using HttpWebRequest, we use the GetResponseAsync method rather than the synchronous GetResponse or the BeginGetResponse and EndGetResponse combination. MVC 3 and MVC 4 provide a convention for enabling asynchronous actions, but it involves some significant refactoring and can be detrimental to readability and to the flow of what’s actually happening because of all of the separation. Here’s our checkout process example written out in MVC 4 using that convention:

    public class CheckoutController : AsyncController
    {
        private readonly IPaymentProcessorAsync _paymentProcessor;
        private readonly IOrderRepositoryAsync _orderRepository;
        private readonly IEmailGeneratorAsync _emailGenerator;

        public CheckoutController(IPaymentProcessorAsync paymentProcessor, IOrderRepositoryAsync orderRepository, IEmailGeneratorAsync emailGenerator)
        {
            _paymentProcessor = paymentProcessor;
            _orderRepository = orderRepository;
            _emailGenerator = emailGenerator;
        }

        public void ProcessPurchaseAsync(OrderViewModel orderData)
        {
            orderData.ValidateData();

            ProcessPaymentAndSaveAsync(orderData); //make various service calls (this can take some time)
        }

        public ActionResult ProcessPurchaseCompleted(OrderViewModel orderData)
        {
            return View("Confirmation", orderData);
        }

        private void ProcessPaymentAndSaveAsync(OrderViewModel orderData)
        {
            AsyncManager.OutstandingOperations.Increment(3);

            //collect the payment information
            _paymentProcessor.ProcessPaymentAsync(orderData.PaymentDetails)
                .ContinueWith(paymentCollectionResult => 
                    {
                        orderData.PaymentDetails.AuthorizationCode = paymentCollectionResult.Result;
                        AsyncManager.OutstandingOperations.Decrement();

                        //save the order to the database
                        _orderRepository.SaveAsync(orderData)
                            .ContinueWith(saveResult =>
                            {
                                orderData.OrderId = saveResult.Result;
                                AsyncManager.OutstandingOperations.Decrement();

                                //generate and send a confirmation email.
                                _emailGenerator.SendConfirmationEmailAsync(orderData)
                                    .ContinueWith(emailResult =>
                                    {
                                        AsyncManager.OutstandingOperations.Decrement();
                                    });
                            });

                        AsyncManager.Parameters["orderData"] = orderData;
                    });
        }
    }

    internal class SampleOrderRepository : IOrderRepositoryAsync
    {
        public Task<object> SaveAsync(OrderViewModel orderData)
        {
            var connectionString = ConfigurationManager.ConnectionStrings["orderDatabase"].ConnectionString;
            var connection = new SqlConnection(connectionString);
            var command = new SqlCommand("spInsertOrderData", connection);
            connection.Open();

            //...  building parameters from the orderData not shown here for simplicity

            //no using blocks here: they would dispose the connection before the
            //asynchronous call completed, so cleanup happens in a continuation
            return command.ExecuteScalarAsync()
                .ContinueWith(insertTask =>
                {
                    command.Dispose();
                    connection.Dispose();
                    return insertTask.Result;
                });
        }
    }

As you can see, the code is very different from its original form. MVC has a special controller type to enable asynchronous processing, and each action that will be called asynchronously has to be split into methods following the Async and Completed naming conventions so that MVC knows how to handle the flow of the asynchronous call. The AsyncManager class is used to track the outstanding operations and to collect the outputs of the long-running portions of code so that they can be passed into the callback once all of the data is aggregated. We’ve also converted all of our service interfaces to asynchronous versions of their calls. An example of that is shown in the SaveAsync method of the SampleOrderRepository class: instead of using the ExecuteScalar method, we call ExecuteScalarAsync, which returns a Task<object> instead of just an object. The Task class serves as a representation of an asynchronous operation, but it also carries information about the operation taking place, like its current status and its result, and it provides a set of continuation methods that allow you to specify logic that should run when the operation completes. Task is the centerpiece of the Task Parallel Library and makes use of the improved thread pool implementation from .NET 4.0 to efficiently schedule when the work takes place.
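
As a toy illustration of what a Task exposes, using Task.Run as a stand-in for real I/O work:

    using System;
    using System.Threading.Tasks;

    class TaskBasics
    {
        static void Main()
        {
            //start an operation; the Task<int> is a handle to its eventual result
            Task<int> orderCount = Task.Run(() => 42); //stand-in for a real query

            //the Task reports the operation's current status
            Console.WriteLine(orderCount.Status); //e.g. Running, RanToCompletion

            //continuations specify logic to run once the operation completes
            orderCount.ContinueWith(t => Console.WriteLine("Orders: " + t.Result))
                      .Wait(); //block only so this console sample doesn't exit early
        }
    }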

With the advent of the async and await keywords in combination with the Task class, the simplified syntax tells the compiler that you want an asynchronous call with a callback, and the compiler does the heavy lifting on its own, which makes your code significantly simpler and easier to read and understand. Let’s take a look at how to accomplish the same asynchronous checkout process using the async/await keywords:

    public class CheckoutController : Controller
    {
        private readonly IPaymentProcessorAsync _paymentProcessor;
        private readonly IOrderRepositoryAsync _orderRepository;
        private readonly IEmailGeneratorAsync _emailGenerator;

        public CheckoutController(IPaymentProcessorAsync paymentProcessor, IOrderRepositoryAsync orderRepository, IEmailGeneratorAsync emailGenerator)
        {
            _paymentProcessor = paymentProcessor;
            _orderRepository = orderRepository;
            _emailGenerator = emailGenerator;
        }

        public async Task<ActionResult> ProcessPurchase(OrderViewModel orderData)
        {
            orderData.ValidateData();

            await ProcessPaymentAndSaveAsync(orderData); //make various service calls (this can take some time)

            return View("Confirmation", orderData);
        }

        private async Task ProcessPaymentAndSaveAsync(OrderViewModel orderData)
        {
            //collect the payment information
            orderData.PaymentDetails.AuthorizationCode = await _paymentProcessor.ProcessPaymentAsync(orderData.PaymentDetails);

            //save the order to the database
            orderData.OrderId = await _orderRepository.SaveAsync(orderData);

            //generate and send a confirmation email.
            await _emailGenerator.SendConfirmationEmailAsync(orderData);
        }
    }

    internal class SampleOrderRepository : IOrderRepositoryAsync
    {
        public async Task<object> SaveAsync(OrderViewModel orderData)
        {
            var connectionString = ConfigurationManager.ConnectionStrings["orderDatabase"].ConnectionString;
            using(var connection = new SqlConnection(connectionString))
            using(var command = new SqlCommand("spInsertOrderData", connection))
            {
                await connection.OpenAsync();

                //...  building parameters from the orderData not shown here for simplicity

                //awaiting here keeps the connection and command alive until the
                //database call completes; the using blocks then dispose them
                return await command.ExecuteScalarAsync();
            }
        }
    }

The first thing to notice here is that our MVC code doesn’t look so fragmented anymore; it reads much like the synchronous version. As before, we’re using the asynchronous implementations of the services that perform our long-running work. There are three changes that work together to make the call happen smoothly. First, the action method now returns Task<ActionResult> instead of just ActionResult; as mentioned before, the Task represents an asynchronous operation that, in this case, promises to eventually produce an ActionResult. Second, the async keyword is added to the method signature, which allows the await keyword to be used within the body of the method; it’s a way of telling the compiler that we will be making asynchronous calls there. Finally, the await keyword separates the calling portion of the code from the callback. When the compiler encounters an await, the method returns to its caller at that point (if the awaited task hasn’t already completed), and everything after the await up to the end of the method is treated as code that will run after the asynchronous call has finished. Think of everything below an await as an in-line callback, because behind the scenes the compiler builds a continuation out of exactly that logic. These simple changes allow our checkout process to scale further, process more orders and therefore bring in more money.
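
To make that transformation concrete, here’s a rough sketch of the continuation the compiler conceptually builds for an await. The actual generated code is a full state machine, and GetAuthCodeAsync here is a made-up stand-in for a real payment call:

    using System;
    using System.Threading.Tasks;

    class AwaitLowering
    {
        //what we write with await:
        static async Task ProcessAsync()
        {
            var authCode = await GetAuthCodeAsync();
            Console.WriteLine(authCode); //everything below the await is the "callback"
        }

        //roughly what the compiler produces for the method above:
        static Task ProcessLowered()
        {
            return GetAuthCodeAsync().ContinueWith(t =>
            {
                var authCode = t.Result;
                Console.WriteLine(authCode); //the same "rest of the method" logic
            });
        }

        static Task<string> GetAuthCodeAsync()
        {
            return Task.FromResult("AUTH-12345"); //stand-in for a real async payment call
        }
    }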

That’s A Wrap

The Taskmatics Scheduler makes judicious use of the new async and await keywords in its administration website. To keep the UI responsive when retrieving a lot of data, we make data retrieval calls asynchronously so that we can retrieve more data concurrently and display the screen to users as fast as possible. For us, using async and await translates into a better user experience when administering tasks through the website, and if you’re a developer for a major retailer’s online store this year, it could mean enjoying more delicious Thanksgiving leftovers knowing that your site has an upper hand against the throngs of shoppers looking to cash in on the savings.

Why use Taskmatics Scheduler?

Taskmatics is preparing for the first major release of its flagship application, Taskmatics Scheduler. If you’re a .NET developer, you should be excited about this application. To understand why, it helps to understand the motivation behind creating the system in the first place. We believe that the same reasons that compelled us to create Taskmatics Scheduler will drive other .NET developers to adopt it.

The Problem

As .NET developers, most of us have been involved with the development of enterprise-class systems. These are generally large, often complex applications that:

- Encapsulate business logic in code via services, assemblies, etc. (e.g., a ‘Customer’ object that encapsulates all rules for managing a customer)
- Encapsulate complex business processes involving multiple objects (e.g., onboarding/ingesting product, creating an order)
- Use one or multiple data stores for persistence
- Have one or multiple UI layers (e.g., an administrative app, a consumer-facing app)

All of these components work together to form our respective ‘systems’. However, problems begin to arise when we need to support batch or offline operations that reuse the business rules already built into these systems. For example:

- At regular intervals we need to ingest new product into our catalog
- At specified times we need to check with a 3rd party to see if there are new orders to add to our system
- ETL activities (import/export data)
- Rebuild indices, aggregate data

Thus, a centralized job management solution becomes crucial in this type of environment.

A Pseudo Solution

Like many .NET developers, we turned to Windows Task Scheduler for a solution to our job management needs. Windows Task Scheduler can start batch or .exe files, has a multitude of scheduling options, and is available on any flavor of Windows Server installation.

Issues with This Approach

Windows Task Scheduler views each task under its management as an independent entity.  This approach has some benefits, but also some severe drawbacks that quickly become management headaches as the underlying job infrastructure evolves.  Notably:

- No common framework; the job infrastructure quickly becomes the wild west
- No centralized logging solution
- No remote management
- Very difficult/clumsy mechanism for using shared files (assemblies)
- No extensibility
- Single-server solution
- Laborious to update/maintain existing jobs

(Note: We evaluated SQL Server, Quartz, and ActiveBatch as well. Stay tuned for upcoming discussions of these products’ shortcomings and why we were compelled to create Taskmatics Scheduler.)

Finally, a Solution!

We concluded that there was not a suitable task management system available that addressed the needs of an enterprise .NET developer. From our experience, we were certain that a task management solution needed to have at least the following:

- Job isolation (a poorly performing job must NOT be able to bring down the entire job infrastructure)
- Remote management
- Extensibility
- Common framework
- Common logging
- Ability to update jobs while jobs are running
- Ability to leverage common code (assemblies)
- Reporting
- Resource utilization by job
- Security (access and authorization)
- Ability to scale out to multiple job servers
- High availability configuration

There simply wasn’t an application in the marketplace that supported this feature set, so the only way to get what we needed was to create our own. So, we did just that.

Our initial version of the application was created for our own internal use. It was a bare-bones application that lacked an administrative console and had a very crude configuration system. However, our internal adoption of, and reliance on, that crude system convinced us of the need for this feature set, so after two years of evolution and internal use we re-developed the application with the goal of releasing it to the public.

And thus, Taskmatics Scheduler was born! Taskmatics Scheduler represents all of the knowledge gained from our experience with the initial system and, most importantly, addresses the areas that we found lacking (configuration, installation). The result is a full-featured task management system that is a must-have for any .NET developer.

Dials and Switches – Building Configurable Tasks

As a developer, it’s rare to see an application that doesn’t make use of external configuration settings. Whether it’s setting the connection string to a database or storing custom keys with the credentials for a consumed service, configuration settings are a ubiquitous tool, and it’s no mystery why: they bring many advantages to development. Here are some of the key benefits that configuration settings can add to an application.

Reconfiguration without Recompilation

An application often needs values that can change after deployment. A prime example is the connection string for data storage: connection parameters frequently differ between the environments to which an application is deployed. Hard-coding these values into the program introduces the need to recompile and redeploy the code each time any of them changes. This leads to poor maintainability of the code base and an inability to easily deploy the application across multiple environments. External configuration allows us to centralize and maintain all of the parameters of an application independently of the build and deployment process.

Reusability of code

Another frequent use for configuration settings can be seen in applications that share common behavior while differing only in the data they work with. For example, if I’m writing an application that zips up a directory and sends the zipped contents to a backup drive, I wouldn’t want to duplicate the application each time that a new folder needed to be backed up. External configuration settings allow for reuse of common logic while providing a way to introduce non-static data elements.

Basic Data Validation

Configuration settings not only allow you to specify parameters for use within your applications; they also define a general syntax for creating rules that control the validity of the values entered for those parameters. The ability to restrict data types, specify string patterns and control a host of other criteria over the values being set increases the integrity of the data and of the application itself, because developers know what to expect when referencing these parameters in the code.

Configuration Options in Taskmatics Scheduler

There are two ways to specify configuration values in the scheduler. Standard .NET configuration files are supported and are managed in the file system. In addition, the scheduler API allows developers to create their own custom parameters objects that describe input or output values for their custom scheduler components; these objects are dynamically translated into a form in the administration website, where the values can be set when creating or modifying scheduler components. The two mechanisms can also be used in conjunction with one another.

.NET Configuration files

Tasks for the Taskmatics Scheduler are written in .NET, so they support standard .NET configuration files. When developing a custom scheduler component, simply adding an app.config to your project will allow you to access that configuration from your code (a small sketch follows the list below). Here are some key things to know about using .NET configuration files when you create scheduler components:

  • Using .NET configuration files does not preclude you from also using a custom parameters object; the two mechanisms work in conjunction. You might, for example, use .NET configuration to define database connection strings while using a custom parameters object to specify service address URIs.
  • In the scheduler system, updating a configuration file requires locating the folder where the file is stored and updating the file in place. This can be done from the administration website through the built-in file management screens. It can also be done directly in the file system, but production deployment folders often have restricted permissions, making that strategy less than ideal.
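
As a minimal sketch, reading values from that app.config uses the standard ConfigurationManager API; the key and connection string names below are made up for illustration:

using System.Configuration; //requires a reference to System.Configuration.dll

public static class CopyTaskSettings
{
    //reads a connection string defined in the component's app.config
    public static string OrderDatabaseConnectionString
    {
        get { return ConfigurationManager.ConnectionStrings["orderDatabase"].ConnectionString; }
    }

    //reads a custom appSettings key, falling back to a default when it's absent
    public static int BatchSize
    {
        get
        {
            var raw = ConfigurationManager.AppSettings["batchSize"];
            return raw == null ? 100 : int.Parse(raw);
        }
    }
}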

Custom Parameters Object

The scheduler API allows developers to create their own objects that can define the input or output parameters of the components they create. These objects are nothing more than classes that are linked to the components that they are creating. Consider a sample configuration for a task that copies files from a source folder to a destination folder:

public class CopyTaskParameters
{
    public string SourceFolderPath { get; set; }
    public string DestinationFolderPath { get; set; }
}


This class gives us a basic parameters object for copying files between two folders, but the API also lets us specify some basic validation criteria on these properties using the well-known attributes from .NET’s System.ComponentModel.DataAnnotations namespace. Adding that to our example, we have something like:

public class CopyTaskParameters
{
    [Required]
    [Display(Name = "Source Copy Path", Description = "The folder that files will be copied from.")]
    public string SourceFolderPath { get; set; }

    [Required]
    [Display(Name = "Destination Copy Path", Description = "The folder that files will be copied to.")]
    [RegularExpression(@"^[a-zA-Z]:\\.{1,100}$")] //require a rooted Windows path, e.g. C:\backups
    public string DestinationFolderPath { get; set; }
}


As you can see from the above, it’s fairly trivial to describe the parameters object and even set some basic validation. As a last step, we need to link this object to the task we’re developing so that the scheduler can associate the task with its configuration object when we administer the task later on.

[InputParameters(typeof(CopyTaskParameters))]
public class CopyTask : Taskmatics.Scheduler.Core.TaskBase
{
    protected override void Execute()
    {
        var copyParameters = (CopyTaskParameters)Context.Parameters;

        //your task implementation goes here
    }
}


In the code above, we link the CopyTaskParameters object to the CopyTask by using the InputParameters attribute from the scheduler API. When a scheduler component is linked to a custom parameters object, the administration website uses that object to display an entry form where users can configure the parameters directly from the website. It also provides validation feedback to ensure that the data being entered is what the parameters object expects. Once the task is initiated by the system, an instance of the CopyTaskParameters object is created and passed in as part of the Context object that’s accessible from the task. Here’s a shot of what our configuration form looks like when we’re configuring the task from the website:

[Screenshot: copy-task-sample-config-ui, the configuration form generated for CopyTaskParameters]

Here are some key points to keep in mind when using custom parameter objects in the scheduler system:

  • Like .NET configuration files, custom parameter objects allow you to avoid hard-coding data elements that change frequently or that don’t warrant a recompile of the code.
  • Custom parameter objects are integrated with the administration website, making it easy to manage the configuration for custom tasks. They remove the need to know the .NET configuration file schema, so non-developers can tweak configuration when desired.
  • Currently, developers can only use scalar and enumeration properties in a custom parameters object (see the sketch below). List and complex object support are to be added in future releases.
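
For example, a parameters object that stays within those limits might look like the following; the OverwriteMode enumeration is invented for illustration:

using System.ComponentModel.DataAnnotations;

public enum OverwriteMode
{
    Skip,
    Replace
}

public class CopyTaskParameters
{
    [Required]
    public string SourceFolderPath { get; set; }      //scalar (string)

    [Required]
    public string DestinationFolderPath { get; set; } //scalar (string)

    public int RetryCount { get; set; }               //scalar (numeric)

    public OverwriteMode Overwrite { get; set; }      //enumeration
}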

Summing It Up

Regardless of the complexity of the application you write, using external configuration settings will go a long way toward making your code more versatile and easing maintenance after deployment. Taskmatics Scheduler lets you use the methods you already know and love, and also introduces a new, simple way to manage external settings that integrates seamlessly with the administration website. For a fully working sample containing the code used in this article, head over to our GitHub repository, and for more information and more advanced configuration scenarios, check out the online documentation.