热门标签 | HotTags
当前位置:  开发笔记 > 后端 > 正文

ACIDinHBase

ByLarsHofhanslAsweknow,ACIDstandsforAtomicity,Consistency,Isolation,andDurability.HBasesupportsACIDinlimitedways,namelyPutstothesamerowprovideallACIDguarantees.(HBASE-3584addsmultioptransactionsandHBASE-

By Lars Hofhansl As we know, ACID stands for Atomicity, Consistency, Isolation, and Durability. HBase supports ACID in limited ways, namely Puts to the same row provide all ACID guarantees. (HBASE-3584 adds multi op transactions and HBASE-

By Lars Hofhansl

As we know, ACID stands for Atomicity, Consistency, Isolation, and Durability.

HBase supports ACID in limited ways, namely Puts to the same row provide all ACID guarantees. (HBASE-3584 adds multi op transactions and HBASE-5229 adds multi row transactions, but the principle remains the same)

So how does ACID work in HBase?

HBase employs a kind of MVCC. And HBase has no mixed read/write transactions.

The nomenclature in HBase is bit strange for historical reasons. In a nutshell each RegionServer maintains what I will call "strictly monotonically increasing transaction numbers".

When a write transaction (a set of puts or deletes) starts it retrieves the next highest transaction number. In HBase this is called a WriteNumber.
When a read transaction (a Scan or Get) starts it retrieves the transaction number of the last committed transaction. HBase calls this the ReadPoint.

Each created KeyValue is tagged with its transaction's WriteNumber (this tag, for historical reasons, is called the memstore timestamp in HBase. Note that this is separate from the application-visible timestamp.)

The highlevel flow of a write transaction in HBase looks like this:
  1. lock the row(s), to guard against concurrent writes to the same row(s)
  2. retrieve the current writenumber
  3. apply changes to the WAL (Write Ahead Log)
  4. apply the changes to the Memstore (using the acquired writenumber to tag the KeyValues)
  5. commit the transaction, i.e. attempt to roll the Readpoint forward to the acquired Writenumber.
  6. unlock the row(s)
The highlevel flow of a read transaction looks like this:
  1. open the scanner
  2. get the current readpoint
  3. filter all scanned KeyValues with memstore timestamp > the readpoint
  4. close the scanner (this is initiated by the client)
In reality it is a bit more complicated, but this is enough to illustrate the point. Note that a reader acquires no locks at all, but we still get all of ACID.

It is important to realize that this only works if transactions are committed strictly serially; otherwise an earlier uncommitted transaction could become visible when one that started later commits first. In HBase transaction are typically short, so this is not a problem.

HBase does exactly that: All transactions are committed serially.

Committing a transaction in HBase means settting the current ReadPoint to the transaction's WriteNumber, and hence make its changes visible to all new Scans.
HBase keeps a list of all unfinished transactions. A transaction's commit is delayed until all prior transactions committed. Note that HBase can still make all changes immediately and concurrently, only the commits are serial.

Since HBase does not guarantee any consistency between regions (and each region is hosted at exactly one RegionServer) all MVCC data structures only need to be kept in memory on every region server.

The next interesting question is what happens during compactions.

In HBase compactions are used to join multiple small store files (create by flushes of the MemStore to disk) into a larger ones and also to remove "garbage" in the process.
Garbage here are KeyValues that either expired due to a column family's TTL or VERSION settings or were marked for deletion. See here and here for more details.

Now imagine a compaction happening while a scanner is still scanning through the KeyValues. It would now be possible see a partial row (see here for how HBase defines a "row") - a row comprised of versions of KeyValues that do not reflect the outcome of any serializable transaction schedule.

The solution in HBase is to keep track of the earliest readpoint used by any open scanner and never collect any KeyValues with a memstore timestamp larger than that readpoint. That logic was - among other enhancements - added with HBASE-2856, which allowed HBase to support ACID guarantees even with concurrent flushes.
HBASE-5569 finally enables the same logic for the delete markers (and hence deleted KeyValues).

Lastly, note that a KeyValue's memstore timestamp can be cleared (set to 0) when it is older than the oldest scanner. I.e. it is known to be visible to every scanner, since all earlier scanner are finished.

Update Thursday, March 22:
A couple of extra points:
  • The readpoint is rolled forward even if the transaction failed in order to not stall later transactions that waiting to be committed (since this is all in the same process, that just mean the roll forward happens in a Java finally block).
  • When updates are written to the WAL a single record is created for the all changes. There is no separate commit record.
  • When a RegionServer crashes, all in flight transaction are eventually replayed on another RegionServer if the WAL record was written completely or discarded otherwise.
推荐阅读
  • MySQL中的MVVC多版本并发控制机制的应用及实现
    本文介绍了MySQL中MVCC的应用及实现机制。MVCC是一种提高并发性能的技术,通过对事务内读取的内存进行处理,避免写操作堵塞读操作的并发问题。与其他数据库系统的MVCC实现机制不尽相同,MySQL的MVCC是在undolog中实现的。通过undolog可以找回数据的历史版本,提供给用户读取或在回滚时覆盖数据页上的数据。MySQL的大多数事务型存储引擎都实现了MVCC,但各自的实现机制有所不同。 ... [详细]
  • 本文介绍了Sencha Touch的学习使用心得,主要包括搭建项目框架的过程。作者强调了使用MVC模式的重要性,并提供了一个干净的引用示例。文章还介绍了Index.html页面的作用,以及如何通过链接样式表来改变全局风格。 ... [详细]
  • 使用J2SE模拟MVC模式开发桌面应用程序的工程包的介绍
    以我开发过的一个娱乐管理系统为例:下图为我系统的业务逻辑的MVC流程:下图为以Eclipse开发中各包的说明:转载于:https:blog ... [详细]
  • wpf+mvvm代码组织结构及实现方式
    本文介绍了wpf+mvvm代码组织结构的由来和实现方式。作者回顾了自己大学时期接触wpf开发和mvvm模式的经历,认为mvvm模式使得开发更加专注于业务且高效。与此同时,作者指出mvvm模式相较于mvc模式的优势。文章还提到了当没有mvvm时处理数据和UI交互的例子,以及前后端分离和组件化的概念。作者希望能够只关注原始数据结构,将数据交给UI自行改变,从而解放劳动力,避免加班。 ... [详细]
  • 本文讨论了在shiro java配置中加入Shiro listener后启动失败的问题。作者引入了一系列jar包,并在web.xml中配置了相关内容,但启动后却无法正常运行。文章提供了具体引入的jar包和web.xml的配置内容,并指出可能的错误原因。该问题可能与jar包版本不兼容、web.xml配置错误等有关。 ... [详细]
  • 本文讨论了在ASP中创建RazorFunctions.cshtml文件时出现的问题,即ASP.global_asax不存在于命名空间ASP中。文章提供了解决该问题的代码示例,并详细解释了代码中涉及的关键概念,如HttpContext、Request和RouteData等。通过阅读本文,读者可以了解如何解决该问题并理解相关的ASP概念。 ... [详细]
  • SpringMVC工作流程概述
    SpringMVC工作流程概述 ... [详细]
  • 数据库锁的分类和应用
    本文介绍了数据库锁的分类和应用,包括并发控制中的读-读、写-写、读-写/写-读操作的问题,以及不同的锁类型和粒度分类。同时还介绍了死锁的产生和避免方法,并详细解释了MVCC的原理以及如何解决幻读的问题。最后,给出了一些使用数据库锁的实际场景和建议。 ... [详细]
  • ps:写的第一个,不足之处,欢迎拍砖---只是想用自己的方法一步步去实现一些框架看似高大上的小功能(比如说模型中的toArraytoJsonsetAtt ... [详细]
  • 今天写一篇blog,已经多长时间没有更了,两个月了吧,没办法,现在银行开发,不能连外网,天天用虚拟机,真烦今天随手写点东西,主要是这两天对于springboot启动的分析,有所领悟 ... [详细]
  • Django + Ansible 主机管理(有源码)
    本文给大家介绍如何利用DjangoAnsible进行Web项目管理。Django介绍一个可以使Web开发工作愉快并且高效的Web开发框架,能够以最小的代价构建和维护高 ... [详细]
  • 从壹开始前后端分离【 .NET Core2.0 +Vue2.0 】框架之六 || API项目整体搭建 6.1 仓储模式
    代码已上传Github+Gitee,文末有地址  书接上文:前几回文章中,我们花了三天的时间简单了解了下接口文档Swagger框架,已经完全解放了我们的以前的Word说明文档,并且可以在线进行调 ... [详细]
  • 一、Struts2是一个基于MVC设计模式的Web应用框架在MVC设计模式中,Struts2作为控制器(Controller)来建立模型与视图的数据交互。Struts2优点1、实现 ... [详细]
  • MVC中的自定义控件
    怎么样创建自定义控 ... [详细]
  • Asp.Net MVC 测试应用程序
    建立一个Asp.NetMVC项目的时候,如果选择建立测试项目,那么系统会为我们建立一个项目所对应的测试项目。包含了Controller文件夹中对应的Controller单元测试文件, ... [详细]
author-avatar
121lzg
这个家伙很懒,什么也没留下!
PHP1.CN | 中国最专业的PHP中文社区 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved | 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有