4 Pillars of the “Infrastructure from Code”

Asher Sterkin
15 min read · Nov 25, 2022

Part Two: Types of Services, Vendors, and APIs

In this series, I’m trying to answer the question: “Precisely what kind of problems is IfC trying to solve?”

Acknowledgements

Many thanks to Shruti Kukkar for the valuable feedback on the early draft of this paper.

Introduction

In Part One of this series I suggested the following mission statement for Infrastructure from Code solutions: “To ensure an automatic conversion of cloud-neutral application code into a coordinated set of cloud-specific assets using operational feedback as an input wherever appropriate”.

In an attempt to clarify what exactly this means, the following diagram was introduced:

Fig1: 4 Pillars of Infrastructure From Code

In Part One I analyzed 4 different types of interactions with cloud services (aka four pillars of IfC): Acquisition, Configuration, Consumption, and Operation.

In this Part Two, I’m going to take a closer look at different types of cloud services, vendors, and different types of APIs to communicate with cloud services.

This is the second part of a three-part series organized as follows:

  1. Part One: Types of Communication with Cloud Services
  2. Part Two: Types of Services, Vendors, and APIs
  3. Part Three: Deployment Locations, IfC Mission Elaborated

Cloud Services

This topic is apparently a can of worms. Indeed, which cloud services are we talking about? Every cloud platform provides some basic services such as compute, storage, and networking. More specifically, a cloud platform operates its own large datacenters, sometimes called availability zones, separated from each other by tens of miles. Each availability zone is filled with compute (CPU), memory and storage (RAM, NVRAM, Flash, magnetic disks, tapes), networking (switches, routers), and security (HSM) hardware devices connected by superfast copper and/or fiber cables. Complex management software makes it possible to create virtual machines and virtual networking devices on top of this hardware and to connect them to each other (the exact details are irrelevant for this discussion). What is important to realize is that this constitutes the basic level of cloud services, unless the providers also function as hosting services and just rent out hardware.

On top of these basic services, higher-level services, such as database clusters, cloud storage buckets, container instances, API gateways, and many more, can be built and provided. I tried to provide a historical perspective on this evolution in my previous article. These higher-level services could be supplied by the same cloud platform provider or by a 3rd party (more on that later), and they could be partially or fully managed. The latter, so-called “serverless” services, do not require overly detailed specifications or upfront capacity reservation, and they normally scale down unused capacity and do not charge for it. Even though most of the considerations apply equally to both types, I will mostly focus on the serverless branch, since the IfC approach was initially conceived as a way of unleashing the true potential of the Serverless Promise.

Initially, I did not plan to write this section. As I argued in one of my early articles, at least within the scope of serverless applications we may consider one region of one account of one cloud platform as just a cloud super-computer, treating individual cloud services as hardware modules and platform-native SDKs as a kind of driver.

Therefore, one might argue that going beneath this level is not required, for the same reason that we normally do not analyze the exact structure of a traditional computing device chipset. As with chipsets, we do know that internally cloud services are not monolithic and have their own hierarchy, so why not ignore it and simplify the overall analysis, which is complex enough without this extra dimension?

There are two reasons why we cannot. First, cloud services at different levels are not completely invisible to IfC configuration. Most of them are, but some are not. A good example would be the CPU type: is it x86-64 (also known as amd64) or arm64? See the more detailed discussion below.

Second, IfC does need to support drop-in replacement of cloud services with comparable 3rd party alternatives. During the last couple of years, this trend seems to have been growing: one may choose to use either Amazon Keyspaces (serverless) or DataStax Astra. In the application code it is going to be the same Cassandra Python Client; the whole difference lies in specifying via configuration which vendor to use, plus some automatic IfC code generation.
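To make the “same client, different vendor” idea more concrete, here is a minimal sketch of how vendor-neutral configuration might be translated into vendor-specific connection settings. Everything in it is an assumption made for illustration: the CassandraTarget class, the vendor names, and the shape of the returned settings are hypothetical and do not describe any particular IfC product. The sketch deliberately stops short of importing a Cassandra driver; it only shows where the vendor choice would be confined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CassandraTarget:
    """Hypothetical IfC configuration entry selecting a Cassandra vendor."""
    vendor: str             # "aws-keyspaces" or "datastax-astra" (illustrative names)
    keyspace: str
    region: str = ""        # only used by the AWS-hosted option
    bundle_path: str = ""   # only used by the Astra option


def connection_settings(target: CassandraTarget) -> dict:
    """Translate vendor-neutral configuration into vendor-specific settings.

    Generated glue code of this kind would keep the application code
    unaware of which vendor actually serves the Cassandra API.
    """
    if target.vendor == "aws-keyspaces":
        if not target.region:
            raise ValueError("aws-keyspaces requires a region")
        return {
            # Assumed regional endpoint naming and TLS port of the managed service.
            "contact_points": [f"cassandra.{target.region}.amazonaws.com"],
            "port": 9142,
            "tls": True,
            "keyspace": target.keyspace,
        }
    if target.vendor == "datastax-astra":
        if not target.bundle_path:
            raise ValueError("datastax-astra requires a secure connect bundle path")
        return {
            "secure_connect_bundle": target.bundle_path,
            "keyspace": target.keyspace,
        }
    raise ValueError(f"unsupported vendor: {target.vendor}")


if __name__ == "__main__":
    target = CassandraTarget(vendor="aws-keyspaces", keyspace="orders", region="us-east-1")
    print(connection_settings(target))
```

Switching vendors then means changing one configuration entry, while the code that issues Cassandra queries stays untouched.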

We, therefore, need to understand, albeit at a very high level, the inner structure of cloud services.

This could be modeled as the following, surprisingly very traditional, structure:

  • Hardware:
    - CPU
    - GPU
    - HSM
    - FPGA
    - Memory
    - …
  • Infrastructure as a Service (IaaS):
    - Virtual Machines
    - Software-Defined Networks
  • Platform as a Service (PaaS):
    - Containers
    - Functions
    - Virtual Network Functions
    - Network Storage
    - Blob Storage
    - Databases (SQL and NoSQL)
    - Message Brokers (point-to-point, pub/sub, push/pull)
    - API Gateways
    - Websites
    - Security
    - Map-Reduce
    - Workflow Orchestration
    - …
  • Software as a Service (SaaS) / Cloud Software Products (CSP):
    - Full code IDE
    - No Code Studio
    - Console
    - Shell
    - Dashboards
    - Packaged Digital Businesses Capabilities
    - …

Hardware

We exclude hardware renting from consideration, since otherwise we would be talking about hosting rather than cloud service providers. It’s true that some cloud service providers also offer dedicated hardware for rent, but this means that they combine two business models.

While we do not deal with hardware allocation or management (a very complex topic by itself), some hardware-related issues need to be taken into account by IfC solutions. The first is CPU and GPU type: at least the final code generation will be affected by this choice, and so, most likely, will the cloud-based Integrated Development Environment, if any. At the moment, we can ignore specific types of memory and, for sure, specific types of disks, tapes, various hardware accelerators, and network equipment. This, however, might change in the future. The last two hardware types, namely the Hardware Security Module (HSM) and the Field-Programmable Gate Array (FPGA), might be relevant for very advanced security and performance optimization use cases.

Infrastructure as a Service (IaaS)

The next level, IaaS, offers two major types of virtual resources, which constitute the most basic level of what any cloud service provider really offers: virtual machines equipped with particular hardware elements (CPU, GPU, Memory, Elastic Block Storage, etc.), potentially organized in the form of auto-scaling groups, and connected to each other and to the external world through a software-defined network (private/public subnets, security groups, etc.). Anything else is built on top of them (e.g. Platform as a Service (PaaS) or Software as a Service (SaaS) offerings).

The traditional IaaS definition includes a significantly larger number of services, but I prefer this one, since anything else could be defined in various forms on top of this really basic infrastructure and heavily depends on many factors, such as the orchestration engine.

This IaaS level is partially visible to the IfC infrastructure. First, virtual machines might be relevant for providing a cloud-based development environment (e.g. Microsoft VSCode Remote). Second, and this is more serious, software-defined network specifications will have to be visible for many production deployments that require strong network isolation of computation and data (e.g. AWS VPC). Big players, such as AWS, very often permit themselves terrible terminological inconsistency, being pretty confident that the rest of the industry will follow. In fact, what they call a Virtual Private Cloud would be better called a Private Software-Defined Network. But however bad and inconsistent the terminology is, IfC faces a more serious problem. On the one hand, it cannot ignore network security; on the other hand, it is incapable of deriving all possible network security configurations solely from the application code. This leads to two more fundamental rules of IfC system design:

IfC Rule #2: If you cannot provide a higher level abstraction which will simplify configuration by an order of magnitude, don’t do it — leaky abstraction is always worse than original model.

IfC Rule #3: Support the most common use case as a fully automated default case and leave anything more sophisticated to be manually configured as external resources through traditional Infrastructure as Code solutions.

In the case of software-defined networks, we may support a simple VPC configuration in the form of address ranges per public and private subnet, leaving more complex VPC configurations to be specified externally using, for example, an AWS CloudFormation Template, which in the general case is indeed very complex.
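As an illustration of what such a simple default might look like, the following sketch carves one VPC address range into a public and a private subnet per availability zone using only the Python standard library. The function name plan_subnets, the zone labels, and the default /24 subnet size are assumptions made for this example, not a description of any existing IfC tool.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, zones: list[str], subnet_prefix: int = 24) -> dict:
    """Assign one public and one private address range per zone.

    Subnets are taken in order from the VPC range: first all public
    ones, then all private ones.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=subnet_prefix)  # lazily yields consecutive subnets
    plan = {"vpc": str(vpc), "public": {}, "private": {}}
    for tier in ("public", "private"):
        for zone in zones:
            block = next(blocks, None)
            if block is None:
                raise ValueError(f"{vpc_cidr} is too small for the requested subnets")
            plan[tier][zone] = str(block)
    return plan


if __name__ == "__main__":
    # Hypothetical zone labels; real zone names depend on the cloud platform.
    print(plan_subnets("10.0.0.0/16", ["zone-a", "zone-b"]))
```

Anything beyond this, such as custom routing, peering, or fine-grained security groups, would fall under Rule #3 and stay in an externally maintained template.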

Platform as a Service (PaaS)

The next level, PaaS, is more versatile and, at least in principle, more open to competition (see the discussion about vendors below). The traditional PaaS definition is slightly different and more focused on higher-level services such as Database as a Service (DBaaS), but, as in the IaaS case, I prefer this extended definition since it IMHO better reflects the potential choices between different vendors’ offerings running on top of the same cloud platform.

Internally, PaaS offerings could be split into two subgroups: higher-level infrastructure services and true platform services. The first subgroup, indeed, extends the basic infrastructure with more convenient and/or cost-efficient offerings: containers (e.g. AWS ECS or EKS), functions (e.g. AWS Lambda), virtual network functions (e.g. AWS Internet Gateway), and network storage (e.g. AWS EFS).

The second subgroup is composed of more traditional platform services such as Blob Storage (e.g. AWS S3), SQL (e.g. AWS RDS), NoSQL (e.g. AWS DynamoDB) databases, data warehouses (e.g. AWS Redshift), various message brokers (e.g. AWS SQS and AWS SNS), API Gateways, security services (e.g. AWS KMS), Map-Reduce (e.g. AWS EMR), workflow orchestration (e.g. AWS StepFunctions), etc. The complete list is large and ever-growing.

What is common to all PaaS offerings is that they all run on the top of basic IaaS elements, namely virtual machines, usually combined in auto-scaling groups, and connected via software-defined networks.

The PaaS group of cloud services is the most visible to IfC. It is where most of the effort is going to be concentrated (see more on this in Types of Activities) and where the competition between cloud platform vendors and independent 3rd party vendors is most intense (see more on this in Vendors).

The very name “Infrastructure From Code” is quite misleading, since it’s mostly about automatic management of platform services with some touches of IaaS and SaaS here and there. It would be more appropriate to call such an approach “Platform From Code”, that is PfC, but since IfC is the accepted name, we will stay with it.

Many, but not all, platform services have a fully managed, so-called serverless, flavor, which leads us to the next rule.

IfC Rule #4: For every platform service, if available, opt for fully automated support to a serverless version of this service, if not — provide an automated management of minimal configuration suitable for development mode; leave more advanced cases to externally defined resources, presumably using some manually-crafted Infrastructure as Code (IaC) templates.

Corollary of IfC Rule #4: for the foreseeable future, both the IaC and IfC approaches will need to co-exist side by side.
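Rule #4 can be read as a small decision procedure. The sketch below is one possible reading of it, under the assumption that “more advanced cases” means services without a serverless flavor whose requirements exceed a minimal development setup. The class and function names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Provisioning(Enum):
    SERVERLESS = "fully automated, serverless flavor"
    MINIMAL_DEV = "automated minimal configuration for development"
    EXTERNAL_IAC = "externally defined resource (hand-written IaC template)"


@dataclass(frozen=True)
class ServiceRequest:
    """Hypothetical description of a platform service an application needs."""
    name: str
    has_serverless_flavor: bool
    needs_advanced_config: bool = False


def choose_provisioning(request: ServiceRequest) -> Provisioning:
    """Apply one reading of IfC Rule #4 to a single platform service."""
    if request.has_serverless_flavor:
        return Provisioning.SERVERLESS
    if not request.needs_advanced_config:
        return Provisioning.MINIMAL_DEV
    # Anything more sophisticated is left to manually crafted IaC templates.
    return Provisioning.EXTERNAL_IAC


if __name__ == "__main__":
    for request in (
        ServiceRequest("key-value store", has_serverless_flavor=True),
        ServiceRequest("search cluster", has_serverless_flavor=False),
        ServiceRequest("search cluster (production)", False, needs_advanced_config=True),
    ):
        print(request.name, "->", choose_provisioning(request).value)
```

The third branch is exactly where the corollary applies: the IfC tool hands over to an IaC template instead of trying to generate it.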

Software as a Service/Cloud Software Products (SaaS/CSP)

The last group, SaaS/CSP, is virtually unlimited and includes any software product that could run in the cloud, either within the customer’s cloud environment or as a remote service. Some of these products are related to cloud-based software development and operation, such as Cloud IDEs, Low-Code/No-Code Studios, Consoles, Dashboards, and Shells. Others are related to some particular domain, such as Weather. Perhaps at the top of the pyramid reside the so-called Packaged Digital Business Capabilities, such as AWS Connect or Google Consumer Packaged Goods.

The long-term IfC strategy needs to keep an eye on the SaaS/CSP area in three respects: cloud-based integrated development environments, integrations, and plug-in development.

Cloud Service Vendors

Only at the IaaS level is the competition solely between cloud service providers: AWS vs GCP vs Azure, etc. Indeed, by definition, cloud platform vendors own and operate hardware and rent it out, at the lowest level, in the form of virtual machines and software-defined networks. Anything else is built on top of these basic services by either the same cloud platform vendor (e.g. AWS) or a 3rd party competitor (e.g. Snowflake). Therefore, at the PaaS and SaaS/CSP levels the competition is both among cloud platform vendors and between them and their own customers.

Generally speaking, cloud platform vendors might have an edge due to intimate knowledge of the underlying IaaS platform, and might even use some undocumented features. While for proprietary services of a cloud platform vendor, such as AWS DynamoDB, that would usually be true, it’s more complicated for 3rd party and especially Open Source services such as Apache Cassandra or MongoDB.

First, regulations will sooner or later start insisting on an equal level of IaaS capabilities being available to cloud platform vendors and 3rd party alternatives. Second, cloud platform vendors might lack intimate knowledge of the product itself and thus end up with an inferior offering.

This is especially visible in the last group, SaaS/CSP, where cloud platform vendors are trying to move up the stack in order to avoid the “dumb infrastructure” destiny. On average, they are even less successful here than at the PaaS level. Providing a superior user experience requires a quite different company DNA than providing cost-efficient, secure, and bullet-proof reliable infrastructure. Probably nobody manages to do both equally well. While Google G Suite and Microsoft Office 365 are currently leading in some SaaS areas, their general-purpose clouds are lagging seriously behind AWS.

Within the PaaS and, especially, the SaaS/CSP space, we should anticipate that competition, including mutual lawsuits, will only intensify, and the long-term IfC strategy should take this trend into account.

Another strategically important consideration is that many software vendors will opt for a mixed SaaS/CSP strategy, making their products available via Web APIs but also launched (e.g. from a Marketplace) and run in the customer’s own cloud environment. There are many reasons for such a strategy, a full analysis of which deserves a separate publication. Here, it suffices to say that in order to serve SaaS/CSP vendors properly, the IfC solution needs to take this challenge seriously.

Cloud Services APIs

Cloud services, whether IaaS, PaaS or SaaS/CSP, could be interacted with at different levels of abstraction, namely:

  • Web API
  • Low-level Software Development Kit (SDK) for a particular programming language (e.g. Python)
  • High-level SDK
  • Command Line Interface (CLI)
  • Console
  • IaC Engine Template
  • IaC Engine Template Macro

Each API abstraction level is characterized by a certain distance from what the corresponding cloud service can really do and by its level of automation and usability: the closer to the original cloud service Web API, the more confident one can be about what one gets; normally, however, the less convenient and more error-prone the application code will be.

Web API

Every cloud service has some form of Web API. This is the very nature of a cloud service as opposed to something else. Examples include AWS APIs, Azure REST API, and Google Cloud APIs. This is the most reliable “source of truth” about what services are available. Everything else is built on top of this level.

It is important to note that the Web API of a cloud service is always supplied by its service vendor, but above the IaaS level this does not have to be the cloud platform vendor.

Software Development Kit (SDK)

While cloud service Web APIs are the most authentic and dependable, they are usually the most complex and hardest to use. Therefore, many cloud service vendors wrap them with Software Development Kit (SDK) libraries for a particular programming language such as Python. Very often, but not always, cloud service vendors provide SDKs at two levels: low-level, such as AWS botocore or MSRest, and high-level, such as AWS boto3 or Azure SDK for Python. Normally, SDK libraries are provided by the same cloud service vendor, which has full knowledge of the API’s intricate details and can update the libraries in the shortest possible time. There are, however, some notable exceptions, such as those provided by aio-libs.

Command Line Interface (CLI)

The next level, the Command Line Interface (CLI), is usually intended for system administrators who do not like programming in anything except bash. Every cloud service vendor has one, for example AWS CLI or Azure CLI. One would expect a one-to-one correspondence between the high-level SDK and the CLI, which proves not to be true. For example, the AWS CLI is implemented on top of AWS botocore rather than AWS boto3, thus violating the “eat your own dog food” principle. It also provides some high-level operations either directly (e.g. aws s3 sync) or via a special plugin (e.g. aws ssm start-session).

Many cloud CLI tools are implemented in Python, which adds some extra headache, since the client computer might not have the required version of Python. Some vendors are trying to alleviate the problem by using Docker or by rewriting CLI tools completely in compiled languages such as Golang.

There are some attempts to implement 3rd party CLI tools, which provide a coherent interface across multiple clouds such as pyclvm or rclone.

Console

Another administrator-oriented tool is, of course, a Console, where cloud services are interacted with manually. Whether the Console UI is implemented using the same SDK libraries or not, nobody knows except the Console App developers. What we do know is that some new cloud services or capabilities might be available first through the Console UI and CLI only. Other means, including language-specific SDKs, might lag behind.

IaC Engine Template and Template Macro

For any cloud platform, there is usually one or more solutions for specifying a coordinated set of cloud resources provided by various services. Often, such a solution comes in two flavors: a basic engine template (e.g. AWS CloudFormation) and a macro generator that simplifies template generation. The macro generator could be embedded into a semi-declarative format such as YAML or JSON (e.g. AWS SAM) or into a mainstream programming language (e.g. AWS CDK).
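The relationship between an engine template and a macro generator can be shown with a toy example: a few lines of Python that expand a short, intention-revealing call into a basic AWS CloudFormation template. This is only a sketch of the idea, not how AWS SAM or AWS CDK work internally; the function name bucket_template and the logical ID are made up for the illustration.

```python
import json


def bucket_template(logical_id: str, versioned: bool = False) -> dict:
    """Expand a one-line request into a basic CloudFormation template."""
    resource = {"Type": "AWS::S3::Bucket"}
    if versioned:
        resource["Properties"] = {
            "VersioningConfiguration": {"Status": "Enabled"}
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {logical_id: resource},
    }


if __name__ == "__main__":
    # The generated JSON is what the basic IaC engine would actually consume.
    print(json.dumps(bucket_template("UploadsBucket", versioned=True), indent=2))
```

An IfC code generator sits one step further up: it would derive the call itself from the application code rather than ask the developer to write it.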

Every cloud platform has its native IaC solution: AWS CloudFormation, Azure ARM, or GCP Deployment Manager, with one or more macro extensions. There are, however, 3rd party competitors such as HashiCorp Terraform, Pulumi, or Serverless Framework.

What is important to keep in mind is that every IaC or IaC Macro engine comes with its own console and operational environment. Therefore, while the exact IaC template language should be invisible to the IfC end user, support for a particular native or 3rd party tool by the IfC code generator might be required for operational reasons, especially in brownfield deployments.

There is another catch with IaC templates: they depend on how the computation is deployed and orchestrated. For example, if AWS Lambda is used, the corresponding cloud resources could be allocated via AWS CloudFormation, its macro derivatives, or competitors. However, if the computation is running on top of the AWS EKS service, then there might be a need to use AWS Controllers for Kubernetes to simplify resource allocation and access control directly from Kubernetes.

IaC engines too often lag even more seriously behind the introduction of new cloud service capabilities: first these are made available via the Console and CLI, and only later via IaC templates. Interestingly enough, 3rd party IaC services, such as Terraform, could support a capability even before the cloud platform-native IaC service, such as AWS CloudFormation. This is made possible in part by the generic Cloud Control API.

IaC Engine Extensions and Modularity

Every IaC engine provides multiple, sometimes competing, ways to extend its basic capabilities and support large scale structures. For example, AWS CloudFormation provides the following advanced features:

Comparing all these capabilities is beyond the scope of this paper (hopefully to be covered in a future one). From the overall IfC perspective, it’s important to formulate the next rule:

IfC Rule #5: Use the underlying IaC engine’s extension and modularity mechanisms to support pre-defined and bespoke specifications of cloud resources, standard or custom alike, that could not be automatically generated by the IfC compiler.

Using such mechanisms is quite similar to using libraries developed in machine code language (assembler) from a high-level programming language (C/C++ or above).

References

Publications

Here is the list of all IfC publications I’m aware of, including my own. If there is anything else not included here, drop me a line.

Products

Here is the list of pure IfC products I’m aware of, including my own CAIOS. If there is anything else not included here, drop me a line.

About

The author, Asher Sterkin, is an SVP Engineering and GM at BST LABS. BST LABS is breaking the cloud barrier — making it easier for organizations to realize the full potential of cloud computing through a range of open-source and commercial offerings. We are best known for CAIOS, the Cloud AI Operating System, a development platform featuring Infrastructure-from-Code technology. BST LABS is a software engineering unit of BlackSwan Technologies.
