Posts

Showing posts from May, 2022

The Data Mesh - should you adapt?

In practice, not every firm is a good fit for a Data Mesh. Its primary audience is larger enterprises that face uncertainty and change in their operations and environment. If your organization's data requirements are modest and remain constant over time, a Data Mesh is likely an unnecessary expense. What is a "Data Mesh"? Data Mesh is a strategic approach to modern data management that focuses on delivering useful and safe data products and supports an organization's journey toward digital transformation. Its major goal is to advance beyond the established centralized data management techniques built on data warehouses and data lakes. By letting data producers and data consumers access and handle data without having to involve the data lake or data warehouse team, Data Mesh emphasizes organizational agility. Data Mesh's dec

Thinking of Switching to Vue? A Lightweight but Powerful Framework

Several years have passed since I started developing user interfaces with GWT, and I have worked with different technologies since then, such as Angular V1.x and V2.x. Now, arriving at Databloom, I have met Vue (v3.x). I have been lucky to see browsers grow from displaying informative content into a multipurpose box, capable of offering functionality of all kinds that was previously available only in specialised standalone applications developed in Java, C, VB, etc. Each of these technologies, and others that have come and gone, allowed us developers to add various functionalities to the content displayed in the browser, limited only by imagination and the requirements at hand. However, unlike years ago, we now have powerful front-end tools and frameworks that not only allow us to build web applications, but also leave us with the job (and the choice) of knowing which one to pick according to our needs. In this decisio

Towards a Learning-based Query Optimizer

Query optimization is at the core of any data management/analytics system. It is the process of determining the best way to execute an input query or task (i.e., the execution plan). Query optimization is composed of three sub-processes: (i) the enumeration of the different execution plans, (ii) the cost estimation of each subplan, required to determine which one is the best, and (iii) the cardinality estimation of subplans (i.e., how many tuples a subplan will output), which is crucial because it affects the cost of the plan. Recent research in the field of data management has begun to leverage the power of machine learning (ML) to solve these tasks more effectively and efficiently. In this blog post, we will focus on using ML for estimating the cost of subplans. Traditional optimizers come with a cost model, that is, a set of mathematical formulas that encode the cost of each operator and aggregate these costs to estimate the cost of a query plan. However, coming up with a cost model in a federated s
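To make the idea of a traditional cost model concrete, here is a minimal sketch in Python. It is not the cost model of any particular optimizer: the `PlanNode` class, the operator names, and the per-tuple constants in `COST_PER_TUPLE` are all hypothetical and chosen only for illustration. It shows a per-operator formula (a constant times the input cardinality) and an aggregation step that sums operator costs over the plan tree.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical per-tuple cost constants (illustrative values only;
# a real optimizer would calibrate or hand-tune these).
COST_PER_TUPLE = {"scan": 1.0, "filter": 0.5, "join": 2.0}


@dataclass
class PlanNode:
    """One operator in an execution plan (assumed structure)."""
    operator: str
    input_cardinality: int  # estimated number of input tuples
    children: List["PlanNode"] = field(default_factory=list)


def operator_cost(node: PlanNode) -> float:
    """Per-operator formula: constant cost per input tuple."""
    return COST_PER_TUPLE[node.operator] * node.input_cardinality


def plan_cost(node: PlanNode) -> float:
    """Aggregate: cost of this operator plus the cost of its subplans."""
    return operator_cost(node) + sum(plan_cost(c) for c in node.children)


# A small plan: join(filter(scan A), scan B)
scan_a = PlanNode("scan", 1000)
scan_b = PlanNode("scan", 200)
filter_a = PlanNode("filter", 1000, [scan_a])
join = PlanNode("join", 300, [filter_a, scan_b])

print(plan_cost(join))  # 2300.0
```

Note how the result depends directly on the cardinality estimates fed into each node, which is why cardinality estimation affects the cost of the plan.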