Modern IT architecture, with all its layers of abstraction, is difficult. It also seems that every couple of months a brand new shiny thing pops up out of Silicon Valley, Seattle, Raleigh, Austin or any of the myriad other tech hubs, further complicating our technology stacks.
For many years computing was green screens and a centralized mainframe. Then we had the minicomputer (here we are talking about AS/400s and DECs, not Intel NUCs), then IBM brought out the microcomputer, and the rest, as they say, is history. Fast forward several years and we got virtualization. We dabbled with zones but the time was not right; later we reinvented zones and called them containers, then decided that on-site infrastructure was bad and everything had to be in the Cloud. We then changed our minds and decided that perhaps keeping everything local, or working in a hybrid way, was actually cheaper and operationally more sensible. We changed our minds again and thought that multi-cloud was the answer. Now we are looking at serverless and functions, which from a nebulous overview appear to imply no need for a server at all.
There is, however, one common theme with these paradigm shifts, and that is the shrinkage of the unit being worked on, coupled with an incessant move up the infrastructure stack.
Why am I saying this? It is simple really, and it all boils down to one statement: architecture is now harder than ever. It seems that every improvement in paradigm, every abstraction that is made to make our lives easier, has the opposite effect of complicating the overall design process.
People assume that architecture is simple. We have all read those white papers with the pretty Visio design and thought, “I can do that.” I know I did before I became an infrastructure architect focused on VMware. That first project was a baptism of fire. Your first design or architecture gig will introduce you to the concept of functional and non-functional requirements.
The terms functional and non-functional are very important, as they are the key inputs to the design process. The core difference can be put simply: functional requirements describe what the system should do, while non-functional requirements describe how the system works. There are many advantages to creating a requirements document containing functional and non-functional needs as part of a design process; below we have highlighted some of the core reasons.
Functional Requirement documentation advantages
- It ensures that all parties are on the same page and have a single source of truth.
- It reduces misunderstandings, conflicts, and rework by delineating the system’s or software’s functionality and expectations precisely.
- By facilitating communication, collaboration, and testing among developers, designers, and QA evaluators, it increases productivity and transparency.
- It aids in determining whether the application satisfies user requirements and provides all the features listed in the document.
- It provides a basis for estimating project costs, resources, and timelines.
- It increases consumer satisfaction by delivering a system or piece of software that meets their needs and resolves their issues.
Non-Functional Requirement documentation advantages
Some advantages of writing a non-functional requirements document are:
- It defines quality attributes of the software, such as reliability, availability, scalability, security, usability, etc.
- It ensures performance in key areas and helps to avoid bottlenecks, errors, or failures.
- It captures any legal or regulatory rules that apply to the system or software, such as compliance obligations, standards, or regulations.
- It focuses on the user experience and makes the system easy to operate and maintain.
- It provides greater visibility to the non-functional aspects of the system or software and allows teams to review and refine them.
- It saves time and money by minimizing the risk of potential rework, defects, or dissatisfaction.
- The non-functional requirements ensure the software system follows legal and compliance rules.
- They ensure the reliability, availability, and performance of the software system.
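To make the split between the two kinds of requirement concrete, here is a minimal sketch of a requirements register. Everything in it is an assumption made for illustration: the `Requirement` fields, the `FR-`/`NFR-` identifier scheme, and the sample entries are hypothetical, not part of any standard or tool.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FUNCTIONAL = "functional"          # what the system should do
    NON_FUNCTIONAL = "non-functional"  # how the system should work


@dataclass(frozen=True)
class Requirement:
    ref: str        # hypothetical identifier scheme, e.g. "FR-001"
    kind: Kind
    statement: str  # the need, in the business owner's own words
    source: str     # who asked for it


def split_by_kind(requirements):
    """Group requirements so each kind can be reviewed separately."""
    grouped = {kind: [] for kind in Kind}
    for req in requirements:
        grouped[req.kind].append(req)
    return grouped


# Illustrative entries only; real ones come from talking to the business.
register = [
    Requirement("FR-001", Kind.FUNCTIONAL,
                "Customers can order widgets online", "Sales director"),
    Requirement("NFR-001", Kind.NON_FUNCTIONAL,
                "Order pages respond within 2 seconds", "Sales director"),
]
grouped = split_by_kind(register)
```

Even a register this small gives every party one place to check what was asked for and by whom.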
Traditionally, the gathering of business requirements was the purview of a Business Analyst. These have, unfortunately, by and large gone the way of the dodo, and the gathering of requirements has now fallen into the remit of the Architect. This is not a natural fit, as the majority of Architects are technology-focused, not business-focused, and requirement gathering is not a task that comes naturally to them. Unfortunately, gathering requirements means talking up the stack to non-technical business process owners, asking questions of them in their language to pull together their needs.
These questions are used to paint the picture of the needs of the business and the problem to be solved. This is not a technical discussion, but a business one. This is not the time to be considering technologies. I have heard a great quote by Epictetus: “We have two ears and one mouth so that we can listen twice as much as we speak.” Those of you who know me personally can attest to how hard it is for me to follow that advice; I am known to like the sound of my own voice. But it is great advice, as the design process starts before you even open up Visio or OmniGraffle. It starts with questions. Ask your questions, and wait for the answer.
This leads on to a great quote by Wilson Mizner, who said, “A good listener is not only popular everywhere, but after a while he knows something.” Listen to the answer. Do not try to jump in with a second question before you have received the full answer to your first. Listen and, where appropriate, dig deeper. You are building up your design requirements, the functional and non-functional needs of the business that your design needs to satisfy. Sometimes needs conflict; this is where clarification questions are asked:
- Why (why do you need this?)
- What (what problem are you aiming to solve?)
- Where (where do you envision the solution being: Public Cloud or locally on-site?)
- When (when is this needed by?)
- Who (who holds the budget and who is your business sponsor?)
Notice I leave how to last; this is the least important of the questions to be asked.
- How (How do we do it?)
That said, “how do you currently answer this business problem?” is a valid question. Architecture is like design, but at a strategic level rather than a tactical one. An Architect needs to see the big picture, not just the echo box of their particular project. It is no good building infrastructure or an application to deliver widgets and then having the project fail because you did not consider WAN connectivity, or the ability to recover data after a catastrophic failure, because that was not part of your scope or design. It is here that the differences between an Architect and a Designer become apparent. But surely in today’s cloud world these things don’t matter? We are not dealing with physical devices anymore; that is the cloud provider’s consideration. Not so. Today’s world is actually more complicated, with more moving parts than ever before. The list below is taken from VMware’s design methodology, and the concepts outlined in it are as relevant in today’s landscape as ever, if not more so.
- AMPRS (Availability, Manageability, Performance, Recoverability, and Security)
- RCAR (Requirement, Constraint, Assumption, and Risk)
- Non-functional and functional requirements
- RPO (Recovery Point Objective)
- RTO (Recovery Time Objective)
- SLA (Service Level Agreement)
- Conceptual, Logical and Physical Designs
These are all the things that need to be brought to bear on any design. We have already covered non-functional and functional requirements; these are the raw materials of our design, the bricks and mortar so to speak. We now move on to the creation of the architecture building blocks: AMPRS and RCAR.
AMPRS and RCAR are the foundation of your design. These concepts, after your requirements, are the most important part of your design. They guide your decisions. Every decision you make needs to consider these tenets.
Availability. This includes everything related to making sure the system is up and running and doesn’t fail as a whole, for example preventing single points of failure. Availability non-functional requirements also typically include the “number of nines” of availability, a mythical number that relates to the amount of downtime a business can endure for a particular service. Remember to consider organizational availability as well as technical availability. A service is not available if the only person who knows anything about it has fallen under the proverbial bus.
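The “number of nines” becomes less mythical once it is turned into a downtime budget. The sketch below is only an illustration: it assumes a 365-day year and treats the availability target as a simple percentage of total time, with no allowance for planned maintenance windows. The function name is made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a service may be down and still meet the target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


# Print the yearly downtime budget for two to five nines.
for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows {minutes:.1f} minutes of downtime per year")
```

Three nines works out to roughly 8.8 hours a year, while five nines leaves a little over five minutes, which is why each extra nine is a conversation about cost as much as technology.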
Manageability. All requirements that are related to HOW a system must be managed, for example providing single-pane-of-glass management, a term often used to denote a Sauron app: the manager of managers, or a single source of truth.
Performance. Performance NFRs (non-functional requirements) relate to how well a system should respond and perform. This can include specific IOPS requirements, response times for end-users, the number of transactions needed on a database, response time of the web interface, etc.
Recoverability. Whenever a system or component fails and causes an outage, recoverability requirements dictate how a system should recover. Recovery Point Objective (RPO) and Recovery Time Objective (RTO) requirements are typical recoverability non-functional requirements.
- RTO: how long the service can be down for, or how quickly you need to get it back.
- RPO: how much data loss the business can handle in a catastrophe.
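RPO and RTO are easiest to reason about when they are compared against what the design actually delivers. The following is a minimal sketch under stated assumptions: it assumes the worst-case data loss equals one full backup interval, and that you have a measured or estimated restore time. The function names and the figures are hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a failure just before the next backup loses one full interval."""
    return backup_interval <= rpo


def meets_rto(estimated_restore_time: timedelta, rto: timedelta) -> bool:
    """The service must be back within the agreed recovery time."""
    return estimated_restore_time <= rto


# Illustrative figures: nightly backups against a four-hour RPO,
# and a two-hour restore against a four-hour RTO.
nightly_ok = meets_rpo(timedelta(hours=24), rpo=timedelta(hours=4))
restore_ok = meets_rto(timedelta(hours=2), rto=timedelta(hours=4))
print(f"RPO met: {nightly_ok}, RTO met: {restore_ok}")
```

In this example the restore time is acceptable but the nightly backup is not, which is exactly the kind of gap that should surface as a risk in the design.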
Security. Everything related to how a system is protected and secured maps to the security design quality. Single sign-on and Role-Based Access Control (RBAC) are common examples of security NFRs. These describe HOW a user should log on to the system and how they can use it. Compliance and audit requirements, such as PCI and GDPR, also fall under this remit.
Requirements. We have already visited Requirements, so there is no need to revisit this.
Constraints are lines in the sand: things that need to be worked around. Perhaps a better way to view them is to consider them non-negotiable non-functional requirements. Every constraint is a non-functional requirement (but not all non-functional requirements are constraints). One other way to look at constraints is that they limit you as an architect in your design possibilities and design freedom. If a constraint introduces a risk, you should always include it as a risk in your design!
Assumptions are misused a lot; people use them as a get-out-of-jail-free card: “It is assumed that the WAN connectivity is sufficient to provide adequate bandwidth for the application.” Taken at face value, they appear to offer a very appealing escape from every tough architectural challenge: just assume the problem isn’t there and off you go. This is just plain wrong. Consider assumptions your known unknowns. You KNOW the problem and its associated risk are there. They are a reminder for you, the Architect, to delve deeper and mitigate or, better yet, remove them. Assumptions are also misused to place certain elements of the design out of scope because it is easier: “Proper DNS/NTP/AD services are provided by the customer.” Again, absolving yourself of responsibility in this manner is wrong. Go back and validate the proper working of these external services, verify that there are enough resources, and highlight the discrepancies as risks. They may be beyond your span of control, but validation is your responsibility. An Architect’s goal is to have no assumptions by the end of the design cycle. If that’s not possible and there is still a portion of uncertainty in the design, there must be an accompanying risk explaining what will go wrong if your assumption turns out to be wrong, coupled with how the design mitigates against that scenario.
Risks in design mainly originate from constraints and assumptions; however, they can also be introduced by external factors. A common method of classifying risk is to score it against both the probability of it occurring and the impact that will result. The risks that are highly probable and have a high impact should receive the most work to mitigate. Risk should always be documented to explain what can go wrong, what has been done to prevent it, and, if it has not been entirely removed, how the blast radius has been reduced if it does happen.
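Scoring risk by probability and impact can be sketched in a few lines. This is an illustration, not a prescribed method: the 1 to 5 scales, the use of a simple product as the score, and the sample entries and their numbers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str  # what can go wrong
    probability: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int       # assumed scale: 1 (negligible) to 5 (severe)
    mitigation: str   # what has been done to prevent or contain it

    @property
    def score(self) -> int:
        # Simple product; other weightings are equally valid.
        return self.probability * self.impact


def prioritise(risks):
    """Highest score first, so the most work goes where it matters most."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


# Illustrative entries with made-up scores.
risks = [
    Risk("Customer-provided DNS/NTP/AD services are misconfigured", 2, 5,
         "Validate the services during the design phase"),
    Risk("WAN bandwidth is insufficient for the application", 3, 4,
         "Measure link utilisation before go-live"),
    Risk("Only one engineer understands the service", 4, 2,
         "Document the service and cross-train the team"),
]
ranked = prioritise(risks)
for risk in ranked:
    print(f"{risk.score:>2}  {risk.description}")
```

Keeping the mitigation alongside the score means the register documents what can go wrong and what has been done about it in one place.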
Architecture is difficult, and has unfortunately become surrounded by the mystery of an architect being a supreme being who thinks on such a nebulous level that they may no longer be considered human. But the fact is that if you follow some common-sense procedures, it’s not rocket science. This may all seem like overkill for a simple Terraform deployment into AWS, but consideration needs to be taken even for what seems to be an innocuous change or addition. There is no need for a full VCDX-style architecture design, but understanding your requirements, and how your deployment has answered them, is just good practice. Thinking about everything as a technical problem is not the correct way to think. All technology deployments are there to solve a business problem or improve business performance. If they do not, then they are not servicing the needs of the business.
Architectural concepts are not just a process for those on a higher pay grade. This is everybody’s responsibility. Creating a script to automate a process will save you time, and may by association improve the performance of the team by releasing time to undertake other tasks. However, consideration needs to be given to the performance of that script. It is no good if running the script requires so much resource that it affects the capacity of the environment. When building out an infrastructure via Terraform, you need to understand your constraints. You need to consider how it will affect other infrastructure already deployed. How will they interact? Everybody needs to start to learn architecture principles, but not everybody needs to be an architect.