Modern application development relies extensively on REST APIs. You can hardly find a client application that doesn’t require backend services, and REST is a popular choice because of its simplicity and wide platform support. Things start to get complicated when you deploy a REST API to the public domain. Now you have to worry about maintenance, scalability, security, and other responsibilities that come with hosting a publicly accessible web service. Often these APIs aren’t very complex and don’t require much business logic, so the maintenance overhead can be very significant relative to the overall service functionality. A combination of the Amazon API Gateway and AWS Lambda services can significantly reduce the complexity typically associated with hosting and managing your REST APIs.
AWS Lambda service introduction
AWS Lambda is a managed compute service that runs units of your application code (referred to as Lambda functions), which can be invoked programmatically or triggered by events raised by other AWS services. Some of the key features of AWS Lambda are:
Fully managed – there’s no infrastructure to manage. Simply upload the code and let AWS Lambda take care of the rest.
Scalability and high availability – AWS Lambda automatically scales and manages compute resources across multiple Availability Zones.
Cost efficiency – only pay for the time your code actually runs, in 100ms increments.
Compatibility – currently supports Node.js and Java programming languages.
Amazon API Gateway service introduction
Amazon API Gateway and AWS Lambda
As you can see, these services are already very useful on their own, but they also complement each other well. Amazon API Gateway integrates tightly with AWS Lambda and allows developers to implement truly serverless REST APIs. Amazon API Gateway endpoints can be configured to invoke AWS Lambda functions, which makes it possible to build and deploy publicly accessible, secure, scalable, and reliable REST APIs backed by Node.js or Java code of practically any complexity without having to worry about the infrastructure.
Scalable and reliable real-time event processing has never been a simple task. Software architects and developers involved in such projects often spend the majority of their time building custom solutions to satisfy the reliability and performance requirements of their applications, and significantly less time on the actual business logic. The introduction of cloud services made it easier to scale real-time event processing applications, but it still takes complex solutions to properly distribute the application load across multiple worker nodes. For example, the EventProcessorHost class provides a robust way to process Event Hubs data in a thread-safe and multi-process safe manner, but you are still responsible for hosting and managing the worker instances. Wouldn’t it be nice not to have to worry about the event processing infrastructure and to focus on the business logic instead?
Azure Stream Analytics service introduction
Azure Stream Analytics is a fully managed, scalable, and highly available real-time event processing service capable of handling millions of events per second. It works exceptionally well in combination with Azure Event Hubs and enables two main scenarios:
Perform real-time data analytics and immediately detect and react to special conditions.
Save event data to persistent storage for archival or further analysis.
Each stream analytics job consists of one or more inputs, a query, and one or more outputs. At the time of writing the available input sources are Event Hubs and Blob storage, and the output sink options consist of SQL Database, Blob storage, Event Hubs, Power BI and Table storage. Pay special attention to the Power BI output option (currently in preview), which allows you to easily build real-time Power BI dashboards. The diagram below provides a graphical representation of the input and output options available.
The queries are written in the Stream Analytics Query Language, a SQL-like query language specifically designed for performing transformations and computations over streams of events.
Step-by-step Event Hub data archival to Table storage using Azure Stream Analytics
For this walkthrough, let’s assume that there’s an Event Hub called MyEventHub and it receives location data from connected devices in the following JSON format:
Building an application for a connected device like a mobile phone, tablet, or a microcomputer is easier than ever these days. A variety of tools and supported programming languages allow developers to quickly build applications that collect sensor data, capture telemetry information or location data, and send the collected information to backend services for processing.
Connected devices communication challenges
A functional prototype can often be built in a matter of days or even hours, but a number of challenges arise and must be addressed to prepare the application for wider distribution. Some of the most common challenges are:
Platform support – applications running on various platforms must be able to send information.
Reliability – the backend service must be able to reliably accept information.
Security – only authorized devices should be able to send information and the information must be protected from unauthorized access and manipulation.
Latency – the network calls must be as quick as possible to avoid network interruptions and to preserve system resources and battery power.
Scalability – in many cases, the backend service must be able to handle massive amounts of data.
It is obviously possible to build a custom solution that satisfies all of your application-specific requirements but would it bring you any business value or would you rather focus on the actual application functionality? Luckily, there are cloud services specifically designed for these types of scenarios, and they can significantly simplify your development efforts.
Azure Event Hubs service introduction
Meet Azure Event Hubs, a highly scalable, low-latency, and highly available event ingestor service capable of reliably handling millions of events per second and buffering them for further processing for anywhere from 24 hours up to 7 days. The service supports the HTTP and AMQP protocols, which makes it an attractive option for a wide variety of devices and platforms.
The Event Hubs REST API is straightforward and easy to use. Simply submit an HTTP POST request to your event hub endpoint and set the request body to a JSON-encoded string that contains one or more messages. For example, below is a sample Send Event request, courtesy of the Event Hubs REST API Reference on MSDN.
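For readers who prefer code to raw HTTP, here is a minimal Python sketch of how such a request might be assembled. It is not the MSDN sample. The namespace, event hub name, token value, event fields, and the helper name build_send_event_request are all placeholders invented for illustration. The Content-Type value is the one I believe the Event Hubs REST reference uses for a single event, and query parameters such as an API version are left out, so verify both against the current documentation. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_send_event_request(namespace, event_hub, sas_token, event):
    """Build (but do not send) an HTTP POST request carrying a single event.

    All arguments are illustrative placeholders, not values from the article.
    """
    # Assumed endpoint shape: https://{namespace}.servicebus.windows.net/{hub}/messages
    url = f"https://{namespace}.servicebus.windows.net/{event_hub}/messages"

    # The request body is a JSON-encoded string containing the event.
    body = json.dumps(event).encode("utf-8")

    headers = {
        # The SAS token goes into the Authorization header as-is.
        "Authorization": sas_token,
        # Assumption: content type for a single (non-batched) event.
        "Content-Type": "application/atom+xml;type=entry;charset=utf-8",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_send_event_request(
    namespace="my-namespace",            # hypothetical namespace
    event_hub="myeventhub",              # hypothetical event hub name
    sas_token="SharedAccessSignature sr=...&sig=...&se=...&skn=...",  # placeholder
    event={"DeviceId": "device-1", "Temperature": 37.0},  # hypothetical payload
)
print(request.get_method(), request.full_url)
```

To actually send the event you would pass the request object to urllib.request.urlopen, with a real token in place of the placeholder.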
Generating a request like this should not require much effort in any programming language on any platform, be it a mobile app or a Node.js application running on a Raspberry Pi. The only part that deserves special attention is the Authorization request header. The Authorization header contains a Shared Access Signature (SAS) token that can be generated by any client that has access to the signing key specified in the shared access authorization rule. Below are some of the best practices to follow when generating SAS tokens for connected devices.
Azure Event Hubs SAS token generation best practices
Never use the RootManageSharedAccessKey shared access policy to generate SAS tokens for your connected devices since it’s a highly privileged policy. Follow the principle of least privilege and always create a new policy with Send-only permission for each event hub.
Never send or store your policy keys on connected devices, as doing so may expose your keys to unauthorized access and makes it difficult to rotate or revoke them. Always generate your SAS tokens on the server and only store the generated tokens on the devices.
Never generate SAS tokens for, or submit HTTP POST requests to, the common event hub URI, since the same SAS token would then be valid for any device. Always target unique device-specific event hub endpoints when generating tokens and publishing events.
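The practices above can be sketched as a small server-side helper in Python. Treat it as an illustration under stated assumptions rather than a reference implementation. It assumes the commonly documented SAS scheme (an HMAC-SHA256 signature over the URL-encoded resource URI and the expiry time) and a device-specific publisher endpoint of the form {event hub}/publishers/{device id}/messages. The function names, the policy name SendOnlyPolicy, the namespace, and the key value are all made up for the example.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def device_publisher_uri(namespace, event_hub, device_id):
    """Return the (assumed) device-specific endpoint for one publisher."""
    return (
        f"https://{namespace}.servicebus.windows.net/"
        f"{event_hub}/publishers/{device_id}/messages"
    )


def generate_sas_token(resource_uri, key_name, key, ttl_seconds=3600, now=None):
    """Return a SAS token scoped to resource_uri, valid for ttl_seconds.

    Meant to run on the server, so the policy key never reaches the device.
    `now` exists only to make the expiry deterministic in tests.
    """
    issued_at = int(time.time()) if now is None else int(now)
    expiry = issued_at + ttl_seconds

    # The string to sign is the URL-encoded URI, a newline, and the expiry.
    encoded_uri = urllib.parse.quote(resource_uri, safe="")
    string_to_sign = f"{encoded_uri}\n{expiry}".encode("utf-8")

    # Sign with HMAC-SHA256, then Base64-encode and URL-encode the digest.
    digest = hmac.new(key.encode("utf-8"), string_to_sign, hashlib.sha256).digest()
    signature = urllib.parse.quote(base64.b64encode(digest).decode("ascii"), safe="")

    return (
        f"SharedAccessSignature sr={encoded_uri}"
        f"&sig={signature}&se={expiry}&skn={key_name}"
    )


# Hypothetical values: a Send-only policy and a per-device endpoint.
uri = device_publisher_uri("my-namespace", "myeventhub", "device-1")
token = generate_sas_token(uri, "SendOnlyPolicy", "not-a-real-key")
print(token)
```

Because the device ID is part of the signed URI, a token issued for device-1 is not valid for another device's endpoint, and the device only ever receives the token, never the key that produced it.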