We need platforms for models in the cloud!
Much of the discussion in the language engineering and modeling community revolves around editing models within the browser. While I think this is a good direction, I also think it does not go far enough.
In this post, I argue that it’s not enough to put the editor into the browser and that we need to go a step further to provide value to organizations and users alike.
The discussion about the next generation of modeling tools is often dominated by the possibilities opened up by using the browser as a frontend. Google Docs-like real-time collaboration or good-looking, browser-native diagram and graphical editors are features that get a lot of attention. There is nothing wrong with this per se; these features are easy to sell and provide real-world benefits to certain user groups. However, it is often overlooked that the next generation of modeling tools can do more, reaching far beyond the frontend. I think two key aspects are even more critical than the frontend: scalability and accessibility.
Before I explain in detail what I mean by those two aspects, let’s first explore what I mean by the “cloud” part of “models in the cloud.”
What is the Cloud?
By cloud, I don’t mean “the internet” or a public cloud offering like AWS, Google Cloud, Azure, or others. I mean a platform that runs on top of an Infrastructure as a Service offering and is built to leverage it. For smaller organizations, the infrastructure might be provided by external service providers like AWS, Google, or others. Larger organizations might be able to provide such infrastructure from within the organization. No matter who provides the infrastructure, it needs to offer three fundamental properties:
Automatability: Interaction with the infrastructure needs to be automatable. Automation allows the modeling platform to change the underlying infrastructure dynamically. New compute resources can be commissioned and decommissioned based on the current demand. Changes to the underlying infrastructure might be purely based on non-functional properties like the number of active users but could also be triggered by events from the domain itself. Think about changing the storage of models based on their lifecycle: models under active development are put into fast, and hence expensive, storage, while models that aren’t actively developed are stored in cheaper, less performant storage.
Scalability: Infrastructure needs to be able to scale with the modeling platform. Modeling platforms will often produce bursts of load on the infrastructure based on the user interaction with the system. For example, complex model checking, data exports, or other operations can require vast amounts of resources but only for a short amount of time. These workloads often don’t care about predictable performance. Still, the fact that the infrastructure can provide resources for a short period when required opens up new use cases that aren’t possible with today’s tools.
Pay-As-You-Go: Costs need to be coupled to the consumption of resources. The teams building the modeling system on top of the platform should not need extensive upfront investments to have infrastructure at their disposal. The risk of providing and maintaining the infrastructure shifts from the modeling team to the specialized infrastructure provider. Cost should grow with the application, which gives the developers of the modeling system the freedom to consume resources as required and provides an incentive to optimize the workload down the line to keep costs reasonable.
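To make the lifecycle-driven storage example from the automatability property a bit more concrete, here is a minimal sketch in Python. The names `StorageTier` and `choose_tier` and the 30-day window are assumptions made for illustration; a real platform would call the storage API of its infrastructure provider instead of returning an enum value.

```python
from datetime import datetime, timedelta
from enum import Enum


class StorageTier(Enum):
    FAST = "fast"        # expensive, low latency
    ARCHIVE = "archive"  # cheap, less performant


def choose_tier(last_modified: datetime, now: datetime,
                active_window: timedelta = timedelta(days=30)) -> StorageTier:
    """Pick a storage tier based on how recently the model was edited."""
    # The 30-day default is an arbitrary, hypothetical threshold.
    if now - last_modified <= active_window:
        return StorageTier.FAST
    return StorageTier.ARCHIVE


now = datetime(2024, 6, 1)
print(choose_tier(datetime(2024, 5, 20), now))
print(choose_tier(datetime(2023, 1, 1), now))
```

The point is not the rule itself but that the decision is made by code and can therefore react to domain events without a human in the loop.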
While we as a community like to think otherwise, the impact of modeling within an organization is often confined to silos. The trend in “classical” software engineering is to make source code available to as many people as possible. Open source is an established practice even for large products or services. Large organizations adopt open source-like models where every member of the organization gets access to the source code and can contribute as desired. With models, we aren’t quite there yet. Models are often locked into the specific tools required to access them. Even if the models are available as text, the tools to parse and consume them are still needed. Accessibility of models for humans and machines alike should be one of the major goals for “a platform for models in the cloud.”
Once data becomes available, people will start using it. Therefore, providing as few technical and organizational barriers to the information stored in models as possible is vital.
Accessibility for Humans
Taking away barriers for humans to access models is essential. Removing the requirement to install any software on the users’ computer is a huge step forward. If one is allowed to access a model, one should be able to do so with the tools present on the computer: the browser. There is no need to talk to the IT department to get the software installed or keep the software updated.
The platform should not require the user to choose which version of the tool is needed to work with a particular model. Migrating models to a newer version of the meta-model must be transparent and taken care of by the platform. Removing this “noise” allows users to focus on working with the models instead of dealing with details of the tooling, potentially opening modeling up to less tech-savvy users.
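One way such transparent migration could work is a chain of small upgrade steps that the platform applies when a model is loaded. The following is only a sketch under assumptions: the `version` field, the registry `MIGRATIONS`, and both example migrations are hypothetical and not taken from any existing tool.

```python
from typing import Callable, Dict

Model = dict

# Hypothetical registry: maps a version n to the step that upgrades n -> n + 1.
MIGRATIONS: Dict[int, Callable[[Model], Model]] = {}
LATEST = 3


def migration(from_version: int):
    """Register a function as the upgrade step for `from_version`."""
    def register(fn: Callable[[Model], Model]) -> Callable[[Model], Model]:
        MIGRATIONS[from_version] = fn
        return fn
    return register


@migration(1)
def rename_title(model: Model) -> Model:
    # v1 -> v2: the attribute "title" becomes "name".
    migrated = dict(model)
    migrated["name"] = migrated.pop("title")
    return migrated


@migration(2)
def add_tags(model: Model) -> Model:
    # v2 -> v3: every model gets an (initially empty) list of tags.
    return {**model, "tags": []}


def load(model: Model) -> Model:
    """Upgrade a stored model step by step until it reaches LATEST."""
    version = model.get("version", 1)
    while version < LATEST:
        model = MIGRATIONS[version](model)
        version += 1
        model["version"] = version
    return model


print(load({"version": 1, "title": "Brake controller"}))
```

The user only ever calls `load`; which steps ran in between stays an implementation detail of the platform.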
Accessibility for Machines
Similar to humans, models need to be accessible for machines. The possibility to work with models in programs or scripts creates new value for the user. For me, accessibility for machines has three distinct levels: Basic, Advanced, and Integrated.
The main property of basic access to the models is that access has to be as simple as possible and work across as many technologies as possible. The primary purpose of basic access is to consume models, not to modify them. Models are served in easy-to-consume formats like XML or JSON without all the bells and whistles of a meta-model.
Why is basic access significant? Because it allows exploring the models with little overhead and gives teams a low barrier to entry. Since the format is simple and consumable in virtually any programming language or tool, people can easily experiment with the models. Scripting is helpful to quickly confirm or refute a theory about the models. Data scientists can use their preferred tools like Python to pull statistics out of the models. Embedded engineers can use Rust or Go to generate configuration files or code from the models.
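As an illustration of how little is needed at this level, the sketch below counts element types in a JSON export using nothing but the Python standard library. The export and its field names (`elements`, `type`) are made up for this example; a real platform would define its own format.

```python
import json
from collections import Counter

# Hypothetical JSON export of a model; the structure is an assumption.
EXPORT = """
{
  "name": "BrakeSystem",
  "elements": [
    {"type": "Component", "name": "Controller"},
    {"type": "Component", "name": "Sensor"},
    {"type": "Connection", "from": "Sensor", "to": "Controller"}
  ]
}
"""


def element_counts(raw: str) -> Counter:
    """Count how often each element type occurs in the exported model."""
    model = json.loads(raw)
    return Counter(element["type"] for element in model["elements"])


counts = element_counts(EXPORT)
print(counts["Component"], counts["Connection"])  # prints: 2 1
```

No meta-model, no generated code, and no modeling tool is involved, which is exactly what makes this kind of access attractive for quick experiments.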
While some of these use cases will always stay on the simple level, others will evolve into the advanced level over time. Making it easy to use the models lowers the upfront barrier for new use cases.
In my opinion, basic accessibility for machines is vastly underrated at the moment. We often don’t think about it because it looks too limited to be helpful from a language engineering perspective. Yet making it easy to consume models gets people interested in models and in how they can use them.
Advanced access is where things get more refined than on the basic level. Advanced access involves the meta-model of the models. While the models may still be exchanged via XML or JSON, the data format is very different from the one on the basic level. Data is exchanged via formats that are specifically tailored for model exchange, like the XMI serialization used by EMF or similar.
The knowledge about the meta-model allows the generation of data structures to consume and edit the models. Constraints imposed by the meta-model can be part of such generated code; therefore, clients can construct structurally sound models. The downside of this approach is that the complexity in the client requires code generators and/or sophisticated client libraries for the desired target language or tool.
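To give an idea of what such generated data structures might look like, here is a hand-written sketch of what a generator could emit for a tiny meta-model. The classes `Component` and `Port` and both constraints are assumptions for illustration, not the output of any real generator.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Port:
    name: str
    direction: str

    def __post_init__(self) -> None:
        # Assumed meta-model constraint: direction is either "in" or "out".
        if self.direction not in ("in", "out"):
            raise ValueError(f"invalid direction: {self.direction!r}")


@dataclass
class Component:
    name: str
    ports: List[Port] = field(default_factory=list)

    def add_port(self, port: Port) -> None:
        # Assumed meta-model constraint: port names are unique per component.
        if any(existing.name == port.name for existing in self.ports):
            raise ValueError(f"duplicate port: {port.name!r}")
        self.ports.append(port)


controller = Component("Controller")
controller.add_port(Port("speed", "in"))
print(controller)
```

Because the constraints live in the client-side types, a structurally unsound model is rejected before it is ever sent back to the platform.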
Advanced access is not only used to consume the models but also to change them. Example use cases are integration with simulation tools or data import/export.
The integrated level differs from the previous two: instead of consuming and changing models at the edge of the platform, it must be possible to integrate deeply with the platform itself. Monolithic language workbenches have shown their strength in a couple of use cases on the desktop, but for a platform that provides models in the cloud, a monolithic design is undesirable.
An open modeling platform in the cloud provides ways to extend its functionality as a first-class citizen. Contributed extensions feel the same to the user as those features provided by the platform itself.
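A common way to get this kind of extensibility is a registry that built-in and contributed features share. The sketch below uses model checks as the extension point; `register_check`, `run_checks`, and both checks are hypothetical names chosen for illustration only.

```python
from typing import Callable, Dict, List

Check = Callable[[dict], List[str]]

# Single registry for all checks, no matter who provides them.
CHECKS: Dict[str, Check] = {}


def register_check(name: str) -> Callable[[Check], Check]:
    """Decorator that adds a check to the shared registry."""
    def decorator(fn: Check) -> Check:
        CHECKS[name] = fn
        return fn
    return decorator


@register_check("non-empty-name")  # imagined as shipped with the platform
def non_empty_name(model: dict) -> List[str]:
    return [] if model.get("name") else ["model has no name"]


@register_check("max-elements")  # imagined as contributed by an extension
def max_elements(model: dict) -> List[str]:
    count = len(model.get("elements", []))
    return [f"too many elements: {count}"] if count > 2 else []


def run_checks(model: dict) -> Dict[str, List[str]]:
    """Run every registered check; the caller cannot tell their origin."""
    return {name: check(model) for name, check in CHECKS.items()}


print(run_checks({"name": "", "elements": [1, 2, 3]}))
```

Since `run_checks` treats all entries in the registry alike, a contributed check shows up for the user in the same way as a built-in one.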
A platform that hosts models in the cloud and is built on top of scalable infrastructure can leverage this property to enable new use cases.
For example, particular use cases can be too resource-heavy to be executed on a single machine. Specific analyses on the models can require too much memory to be executed on the users’ devices at all. A global check that helps the user reason about a refactoring might be possible to run but takes too long to be of value to the user. Getting notified that a change done 10 minutes ago introduces a problem somewhere else is often too late to be acceptable. A platform that puts models in the cloud can allocate the required resources when necessary and free them once no longer needed.
Besides the technical limits of the user’s computer, organizational complexity is a factor to consider as well. When modeling tools are used throughout an organization, many different departments or business units can be involved. In a classical environment where the modeling tool runs on the user’s machine, the modeling tool needs to work within the constraints of that machine. These machines will, at some point, be limited in memory or compute resources. New features and an evolving tool can increase the resource requirements over time. Once the requirements outgrow the available resources on the users’ machines, investment in new hardware is required. In some organizations, this can create a conflict where the business unit using the modeling tool must cover the investment in new hardware. With a platform largely independent of the resources available to an individual user, we can avoid this conflict. Because the resources are provided by the modeling platform, which is owned by the part of the organization responsible for building the modeling tool, strategic decisions can focus on the modeling tool.
A platform for models in the cloud also provides much better resource utilization. A user’s machine would always need to be sized for the worst case. Even if a user runs the very costly analyses that require all the resources of their machine only 1% of the time, those resources need to be available 100% of the time. A platform doesn’t need to have the resources available to run the most expensive task for every user at the same time because it’s unlikely that this will ever happen. Even if many users request the same expensive task simultaneously, the platform can decide to pool these tasks, execute them only once, and deliver the result to multiple users. The platform can also cache the results of expensive tasks and serve subsequent requests from that cache.
The last reason why scalability is essential is a mix of all of the above. It gives the team creating the modeling tool much more autonomy. The scalable nature of the platform allows the exploration of new features largely unconstrained. A feature can be prototyped and refined with the users and then later optimized for resource consumption.
When modeling tools embrace the fact that they need to be more than an editor for models, they can become platforms. These platforms can help organizations transform into more model-centric organizations. Not because such platforms enable use cases that aren’t technically possible with today’s tools, but because they remove technical and organizational barriers. Eliminating these barriers allows for gradual change in the organizations and is an enabler for exploring the value of models in a much leaner way than possible today.
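The pooling and caching idea can be sketched with Python’s standard `concurrent.futures` module. Everything below is a simplified assumption: the class `TaskPool`, the key of model id, model revision, and task name, and the fact that nothing is ever evicted (which also means a failed task stays cached).

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Tuple

Key = Tuple[str, int, str]  # (model id, model revision, task name)


class TaskPool:
    """Runs each distinct task once and shares the result with all callers."""

    def __init__(self, workers: int = 4) -> None:
        self._executor = ThreadPoolExecutor(max_workers=workers)
        self._futures: Dict[Key, Future] = {}
        self._lock = threading.Lock()

    def submit(self, key: Key, task: Callable[[], object]) -> Future:
        with self._lock:
            future = self._futures.get(key)
            if future is None:
                # First request for this key: actually schedule the work.
                future = self._executor.submit(task)
                self._futures[key] = future
            # Later requests get the same future, running or already done.
            return future


def global_check() -> str:
    return "no problems found"  # stand-in for an expensive analysis


pool = TaskPool()
key = ("brake-system", 7, "global-check")
first = pool.submit(key, global_check)
second = pool.submit(key, global_check)
print(first is second, first.result())
```

Including the model revision in the key means a changed model gets a fresh run, while repeated requests against the same revision are answered from the stored future.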
For feedback on the topic, feel free to reach out to me. You can find me on Twitter @dumdidum or write a mail to firstname.lastname@example.org.