A constant flow of document updates can bring an Elasticsearch cluster to its knees. Fortunately, there are ways to avoid that scenario.
As we’ve seen in my previous article, Elasticsearch doesn’t really support updates. In Elasticsearch, an update always means delete+create.
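To make the delete+create semantics concrete, here is a toy append-only "segment" model in plain Python. This is a deliberately simplified sketch of the idea, not Elasticsearch's actual Lucene internals: documents are only ever appended, and a "delete" is just a tombstone marker.

```python
# Toy append-only segment illustrating delete+create update semantics.
# Assumption: simplified model for illustration, not real Lucene behavior.
segment = []     # documents are only ever appended, never modified in place
deleted = set()  # positions marked as deleted (tombstones)

def index_doc(doc_id, body):
    segment.append((doc_id, body))

def update_doc(doc_id, body):
    # An "update" tombstones every older version, then appends a new one.
    for pos, (existing_id, _) in enumerate(segment):
        if existing_id == doc_id:
            deleted.add(pos)
    segment.append((doc_id, body))

def get_doc(doc_id):
    # The latest non-deleted version wins.
    for pos in range(len(segment) - 1, -1, -1):
        if pos not in deleted and segment[pos][0] == doc_id:
            return segment[pos][1]
    return None

index_doc("u1", {"followers": 10})
update_doc("u1", {"followers": 11})
```

Note that the old version is still physically present in the segment until a merge reclaims it, which is exactly why frequent updates to the same document are expensive.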
In a previous project, we were using Elasticsearch for full-text search and needed to save some signals, like new followers, along with the user document.
This was a serious problem: thousands of new signals could be generated for a single user within seconds, which meant thousands of sequential updates to the same document.
Going for the naive…
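One way to avoid hammering the same document is to coalesce signals in memory and flush a single update per user. The sketch below is illustrative only: the names and shapes are not from the original project, and in production the flushed dict would feed something like a bulk request.

```python
from collections import defaultdict

# Hypothetical signal buffer (illustrative names): collapses thousands of
# per-user signals into one pending update per user.
class SignalBuffer:
    def __init__(self):
        self.pending = defaultdict(int)  # user_id -> accumulated delta

    def record(self, user_id, delta=1):
        # Thousands of calls per user collapse into a single counter.
        self.pending[user_id] += delta

    def flush(self):
        # One update per user instead of one per signal; the caller would
        # then send these to Elasticsearch, e.g. in a single bulk request.
        updates, self.pending = dict(self.pending), defaultdict(int)
        return updates

buf = SignalBuffer()
for _ in range(5000):
    buf.record("user-1")   # 5000 "new follower" signals
buf.record("user-2", 3)
updates = buf.flush()      # only two updates leave the buffer
```

The trade-off is a window of data loss if the process dies before a flush, which is often acceptable for counters like follower totals.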
Multiple strategies you can use to increase Elasticsearch write capacity for batch jobs and/or online transactions.
Over the last few years, I’ve faced bottlenecks and made many mistakes with different ES clusters when it comes to write capacity, especially when one of the requirements is to write into a live index that has strict SLAs for read operations.
If you use Elasticsearch in production environments, chances are, you’ve faced these issues too and maybe even made some of the same mistakes I did in the past!
I think having a clear high-level picture of how…
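One of the simplest write-capacity wins is batching documents into bulk requests instead of indexing them one by one. The helper below is a minimal sketch; the batch size of 500 is an illustrative default, not an Elasticsearch recommendation — tune it against your own cluster.

```python
from itertools import islice

def chunked(docs, batch_size=500):
    """Yield lists of at most batch_size docs, e.g. for bulk indexing.

    batch_size=500 is only an illustrative default; the right value
    depends on document size and cluster capacity.
    """
    it = iter(docs)
    while batch := list(islice(it, batch_size)):
        yield batch

# 1200 documents become three bulk requests instead of 1200 single writes.
batches = list(chunked(range(1200), batch_size=500))
```

Each batch would then be handed to the bulk API, amortizing the per-request overhead across hundreds of documents.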
How to build a backend that can handle millions of concurrent users efficiently and consistently.
Brands like Nike, Adidas, or Supreme have created a new trend in the market called “drops”, where they release a finite amount of items. It’s usually a limited run, or a small pre-release offer before the general release.
This poses some special challenges, since every sale is basically a “Black Friday” and you have thousands (or millions) of users trying to buy a very limited number of items at the exact same instant.
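The core invariant in a drop is "never sell more items than you have". Here is a minimal sketch of oversell protection using a lock-guarded counter; in a real backend the same atomic check-and-decrement would live in the datastore (for example a conditional SQL UPDATE or a Redis decrement), and all names here are illustrative.

```python
import threading

# Illustrative sketch: a lock stands in for whatever atomic primitive
# your datastore provides in production.
class DropInventory:
    def __init__(self, stock):
        self.stock = stock
        self.sold = 0
        self.lock = threading.Lock()

    def try_buy(self):
        # The check and the decrement must happen atomically; otherwise
        # two buyers can both see stock == 1 and both "win" the last item.
        with self.lock:
            if self.stock > 0:
                self.stock -= 1
                self.sold += 1
                return True
            return False

# 1000 concurrent buyers compete for 100 items.
inventory = DropInventory(stock=100)
threads = [threading.Thread(target=inventory.try_buy) for _ in range(1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

However many buyers race, exactly 100 purchases succeed and the stock never goes negative.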
How you can make the most out of this powerful database
This post will focus on lowering your memory usage and increasing your IPC at the same time
This blog post will focus on POSIX-compliant operating systems like Linux or macOS
To avoid the GIL bottleneck, you might have already used multiprocessing with Python, be it using a pre-fork worker model (more on that here), or just using the multiprocessing package.
What that does, under the hood, is use the OS fork() function, which creates a child process with an exact virtual copy of the parent’s memory. The OS is really clever while doing this, since it doesn’t copy the…
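You can see the "virtual copy" behavior directly with os.fork(): after the fork, the child mutates a variable, yet the parent still sees the original value, because the child's write landed on its own copy-on-write pages. A minimal POSIX-only sketch (Linux/macOS):

```python
import os

data = ["parent-value"]  # allocated before fork; child gets a virtual copy

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child process: this write only touches the child's copy of the page.
    os.close(r)
    data[0] = "child-value"
    os.write(w, data[0].encode())
    os._exit(0)
else:
    # Parent process: read what the child saw, then check our own copy.
    os.close(w)
    child_view = os.read(r, 1024).decode()
    os.waitpid(pid, 0)
```

The parent's `data[0]` is untouched even though the child overwrote its own copy, which is exactly the copy-on-write isolation fork() gives you.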
How to choose the right worker type to scale your wsgi project to the next level by leveraging everything Gunicorn has to offer.
This article assumes you’re using a sync framework like Flask or Django and won’t explore the possibility of using the async/await pattern.
First, let’s briefly discuss how Python handles concurrency and parallelism.
Because of the GIL, Python never runs more than one thread at a time per process.
Even if you have 100 threads inside your process, the GIL will only allow a single thread to run at any given moment. That means that, at any time, 99 of those…
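A rough sketch of what this means for CPU-bound work: splitting a pure-Python loop across a thread pool produces the same results as running it serially, but because only one thread holds the GIL at a time, it runs no faster (timings are machine-dependent, so the sketch only checks correctness, not speed).

```python
from concurrent.futures import ThreadPoolExecutor

def cpu_bound(n):
    # Pure-Python loop: the running thread holds the GIL the whole time,
    # apart from periodic switches to other threads.
    total = 0
    for i in range(n):
        total += i * i
    return total

# Four threads, but the GIL lets only one execute Python bytecode at a
# time, so CPU-bound work like this is effectively serialized.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(cpu_bound, [100_000] * 4))

serial = [cpu_bound(100_000) for _ in range(4)]
```

This is why, for CPU-bound WSGI workloads, adding threads to a Gunicorn worker buys little, while threads remain useful for I/O-bound work where the GIL is released during waits.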