The diagram above is one example of how to design Twitter. If you are new to software engineering or have never been involved in large-scale web architecture, the diagram might look complicated, but we will discuss it later.
But you are right: modern architecture has become more complicated. Newcomers to the industry, such as junior software engineers, now need at least the fundamentals of how a modern web application runs. Even better if they can reason about the thinking behind it. And if you are a frontend engineer who is not directly involved in building the infrastructure, a brief understanding of web architecture will help in your day-to-day work, especially when communicating with fellow software engineers or software architects.
The architecture of a web application depends on many things, such as:
- Security and Privacy
If you are building an application like MS Office, where the user interface, backend logic, and database all reside on the same machine, you are building a single-tier application. You can think of a tier as a separation of components in an application. And when I say separation, I mean physical separation at the component level, not at the code level.
The components are the database, application server, user interface, messaging, and caching. Together, these components make up a web service.
A web application is usually built as a three-tier application. Almost all simple websites, such as blogs and news sites, fall into this category. In a three-tier application, the user interface, application logic, and database run on different machines and therefore belong to different tiers, because they are physically separated.
Large-scale modern web applications with a lot of fancy features, such as Facebook, Uber, and Airbnb, are n-tier applications. You might have heard them called by another name: distributed applications.
Why the need for so many tiers? Two critical explanations for this are the Single Responsibility Principle and the Separation of Concerns.
The Single Responsibility Principle simply means giving a component one, and only one, responsibility and letting it do that job well. Separation of Concerns means much the same thing: be concerned about your own job only, and stop worrying about the rest.
These principles apply at every level of a service, be it the tier level or the code level. This approach makes scaling the service easier in the future, when things grow beyond a certain level.
The architecture usually grows little by little, depending on business needs. Generally, a developer notices that something is going to become a problem and starts considering solutions long before the application becomes unmaintainable. It's all driven by what the application needs.
Now we know that web architecture involves multiple components, such as a database, message queue, cache, and user interface, all running in conjunction with each other to form an online service.
Let's back up a little bit and think about a website as big as Twitter and its architecture. What happens each time you visit twitter.com in your web browser?
Let's say a user opens a web browser and types www.twitter.com/oianas_ (yes, this is my Twitter account 😼). Under the hood, the user's browser sends a request to a DNS server to look up how to contact Twitter, and then sends the request.
The request hits the load balancer, which randomly chooses one of the 100 or so web servers running the site at the time to process the request. The web server looks up some information about the profile image from a caching service and fetches the remaining data from the databases. It notices that the theme profile for the page has not been computed yet, so it sends a "theme profile" job to a job queue, which the job servers will process asynchronously, updating the databases with the results.
Next, the web server attempts to find similar Twitter profiles by sending a request to the full-text search service, using the username of the profile as input. The user happens to be logged in to Twitter, so the server looks up their account information from the account service. Finally, it fires off a page view event to the data firehose to be recorded on the cloud storage system and eventually loaded into the data warehouse, which analysts use to help answer questions about the business.
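To make the job queue step a bit more concrete, here is a minimal sketch in Python. It is not how Twitter does it: the queue and the database are in-memory stand-ins, the worker is a thread rather than a separate job server, and every name (`job_queue`, `database`, `compute_theme`, `handle_profile_request`) is made up for illustration.

```python
import queue
import threading

# Hypothetical in-memory stand-ins for the job queue and the database.
job_queue = queue.Queue()
database = {"oianas_": {"display_name": "oianas_"}}


def compute_theme(username):
    """Placeholder for slow work that should not block the web request."""
    return f"theme-for-{username}"


def job_worker():
    """Job server loop: process jobs until a None sentinel arrives."""
    while True:
        job = job_queue.get()
        if job is None:
            job_queue.task_done()
            break
        if job["type"] == "theme_profile":
            database[job["username"]]["theme"] = compute_theme(job["username"])
        job_queue.task_done()


def handle_profile_request(username):
    """Web server handler: answer with what exists now, defer the slow part."""
    profile = dict(database[username])  # snapshot taken before enqueueing
    if "theme" not in profile:
        job_queue.put({"type": "theme_profile", "username": username})
    return profile


worker = threading.Thread(target=job_worker, daemon=True)
worker.start()

response = handle_profile_request("oianas_")  # returns without the theme
job_queue.join()     # only so this demo is deterministic
job_queue.put(None)  # tell the worker to stop
worker.join()
```

The point of the design is that the user gets a response right away, and the database is updated a little later by the job server. In a real system the queue would typically be a separate service and the workers separate machines.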
The flow described above is what the diagram at the top of this article represents.
At the most basic level, DNS provides a key/value lookup from a domain name (e.g., google.com) to an IP address (e.g., 126.96.36.199), which is required for your computer to route a request to the appropriate server.
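As a rough illustration of the key/value idea, here is a toy lookup table in Python. The table `DNS_RECORDS` and the function `resolve` are hypothetical and only mimic the shape of a lookup; real DNS is a distributed, hierarchical system with caching at several levels. The second function shows how you would ask the operating system's resolver instead, using the standard library's `socket.gethostbyname` (it needs network access, so it is defined here but not called).

```python
import socket

# Toy record table; the address is just the example from the text above.
DNS_RECORDS = {"google.com": "126.96.36.199"}


def resolve(domain):
    """Look up a domain in the toy table (case-insensitive, trailing dot allowed)."""
    return DNS_RECORDS[domain.lower().rstrip(".")]


def resolve_via_system(domain):
    """Ask the operating system's resolver for an IPv4 address (needs network)."""
    return socket.gethostbyname(domain)


ip_address = resolve("Google.com.")
```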
Load balancers are the magic sauce that makes horizontal scaling possible. Horizontal scaling means that you scale by adding more machines, whereas vertical scaling means that you scale by adding more power (e.g., CPU, RAM) to an existing machine.
In web development, you (almost) always want to scale horizontally because, to keep it simple, stuff breaks. Servers crash randomly. Entire data centers occasionally go offline. Having more than one server allows you to plan for outages so that your application continues running.
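Here is a small sketch of what "pick a server, but skip the broken ones" could look like. It assumes random selection, as in the Twitter walkthrough earlier; real load balancers offer other strategies (round robin, least connections, and so on) and run their own health checks. The `LoadBalancer` class and the server names are invented for this example.

```python
import random


class LoadBalancer:
    """Routes each request to a randomly chosen healthy server."""

    def __init__(self, servers):
        # Every server starts out healthy.
        self.healthy = {server: True for server in servers}

    def mark_down(self, server):
        self.healthy[server] = False

    def mark_up(self, server):
        self.healthy[server] = True

    def pick(self):
        candidates = [s for s, ok in self.healthy.items() if ok]
        if not candidates:
            raise RuntimeError("no healthy servers available")
        return random.choice(candidates)


lb = LoadBalancer(["web-1", "web-2", "web-3"])
lb.mark_down("web-2")  # simulate a crashed server

# Requests keep being served by the remaining machines.
picks = {lb.pick() for _ in range(100)}
```

Because there is more than one server behind the balancer, losing `web-2` does not take the site down; traffic simply goes to the others.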
Every modern web application leverages one or more databases to store information. In most cases, the web app servers talk directly to one, as will the job servers. Additionally, each backend service may have its own database that’s isolated from the rest of the application.
Caching is key to the performance of almost any application. It helps keep latency low and throughput high. For data that is read often, an application with caching will usually respond faster than one without, simply because it can return a stored result instead of recomputing it or fetching it from a slower source.
A caching service provides a simple key/value data store that makes it possible to save and look up information in close to constant time.
The two most widespread caching server technologies are Redis and Memcached.
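To show the idea without depending on Redis or Memcached, here is a minimal in-process sketch of the common "check the cache first, fall back to the database" pattern (often called cache-aside). The `TTLCache` class, `load_profile_from_db`, and the `profile:` key format are all assumptions made for this example, not the API of any real caching server.

```python
import time


class TTLCache:
    """A tiny key/value cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # entry is stale, drop it
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


db_reads = 0  # counts how often we hit the (pretend) database


def load_profile_from_db(username):
    """Stand-in for a slow database query."""
    global db_reads
    db_reads += 1
    return {"username": username}


cache = TTLCache(ttl_seconds=60)


def get_profile(username):
    key = f"profile:{username}"
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: no database call
    profile = load_profile_from_db(username)  # cache miss
    cache.set(key, profile)
    return profile


first = get_profile("oianas_")   # miss, reads the database
second = get_profile("oianas_")  # hit, served from the cache
```

The lookup is a dictionary access, which is where the near-constant-time behavior comes from. The expiry time is the usual trade-off: a longer TTL means fewer database reads but staler data.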
Full-text Search Service
Many, if not most, web apps support some sort of search feature where a user provides a text input (often called a “query”), and the app returns the most “relevant” results. The technology powering this functionality is typically referred to as “full-text search”, which leverages an inverted index to quickly look up documents that contain the query keywords.
The most popular full-text search platform today is Elasticsearch, though there are other options such as Sphinx or Apache Solr.
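If "inverted index" sounds abstract, the sketch below builds one in a few lines of Python. It maps each word to the set of documents containing it and answers a query by intersecting those sets. This is only the core idea: it does no relevance ranking, stemming, or typo tolerance, all of which real engines like Elasticsearch add on top. The function names and sample documents are made up for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


documents = {
    1: "Load balancers make horizontal scaling possible",
    2: "Caching keeps latency low",
    3: "Horizontal scaling means adding more machines",
}
index = build_index(documents)
hits = search(index, "horizontal scaling")  # documents 1 and 3
```

The index is built once (and updated as documents change), so a query only touches the entries for its own keywords instead of scanning every document.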
According to AWS, Cloud storage is "a simple and scalable way to store, access, and share data over the Internet". You can use it to store and access more or less anything you’d store on a local file system with the benefits of being able to interact with it via a RESTful API over HTTP. Amazon’s S3 offering is by far the most popular cloud storage available today.
In the industry, architects, developers, and product owners spend a lot of time studying and discussing business requirements. In software engineering jargon, this is known as Requirements Gathering and Analysis.
This requirement analysis leads to picking the appropriate stack, and after that, you start writing a POC (Proof of Concept).
A POC helps you get a closer, more hands-on view of the technology & the primary use case implementation. You would get an insight into the pros and cons of the tech, performance, or other technical limitations, if any.
Now, this is only for industrial-scale production. If you are a solo indie developer or a small group, you can always skip the POC part and start with the main code.
You don't want to delve into redesigning things later, because it eats up your time, or your engineers' time, like a black hole. It has the potential to push your shipping date back by months, if not longer. And that's not even counting the engineering and financial resources wasted along the way.
But having worked in the software engineering industry in Indonesia for more than four years, I have often heard of companies redesigning their architecture as they grow. So this might be an everyday thing if your web application grows exponentially and is looking to be the next unicorn.
I recently listened to an interview with mas Xinuc on a podcast. He talked about his experience building Bukalapak and the decision to build it as a monolith first before re-architecting it to adopt microservices (if you understand Bahasa Indonesia, I recommend giving it a listen). Over time, Bukalapak apparently grew a lot faster than he thought it would. So fast that if he were starting again, he might choose a more modular architecture.
I hope this article helps you understand application architecture a little better, so you won’t be left in the dark whenever you find yourself in a conversation with software architects or reading your company's architecture diagram.
How about on the Front End side?
A web application also includes a frontend. For modern large web applications, you might also want to look into Micro Frontend architecture. Micro Frontends are more of an architectural design decision and development approach than a technology. But that's a topic for another time.