Original: the hbase hbck process
The hbase hbck flow in the HBaseFsck class. hbck is a heavyweight admin tool: it talks to every regionserver, scans the entire meta table, and reads the regioninfo of every table region, so do not run hbck frequently or it will put pressure on hbase. /** * This repair method requires the cluster to be online since it conta...
2015-12-21 16:23:24
548
Original: spark tachyon setup and configuration
mvn -Dhadoop.version=2.3.0 -DskipTests clean package. In spark-env.sh, because hdfs and hive need to be accessed, the lzo compression jar and the mysql driver are required: export SPARK_CLASSPATH=$SPARK_CLASSPATH:/data/hadoop/hadoop-2.3.0-cdh5.1.0/share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar export SPARK_LIBRARY_P...
2015-11-19 18:20:30
396
Original: hive sql optimization
..., which is controlled by the parameters hive.auto.convert.join=true and hive.smalltable.filesize=25000000L (the default is 25M). If the table file is around 25M, this parameter can be tuned so that a map-side join is performed and a reduce-side join is avoided. A map join can also be declared explicitly, which is especially useful when joining a small table to a large one: SELECT /*+ MAPJOIN(b) */ a.key, a.value FROM a join b on a.key = b.key2.
2015-11-13 17:47:40
211
Original: tachyon commands
ata/stg/source/s_emarbox_prod_browse/20151120/17/2015-11-20_17_ec_detail.dat
2015-11-11 13:45:35
163
Original: storm kafka consumer
Preliminaries, some related classes: GlobalPartitionInformation (storm.kafka.trident) records the mapping from partition id to broker. GlobalPartitionInformation info = new GlobalPartitionInformation(); info.addPartition(0, new Broker("10.1.110.24",9092)); info.addPartition(0, n...
2015-11-06 15:58:30
174
Original: viewing Java memory usage (repost)
Reposted from http://mxsfengg.iteye.com/blog/975393. jmap shows how much memory objects occupy inside the JVM, and provides a convenient command to dump the JVM's memory information to a file: jmap -dump:format=b,file=heap.bin <pid>. jhat can parse a Java heap-dump file, generate reports, and start a web server for queries, so the memory information can be browsed from a browser; jhat also offers a SQL-like query language, OQL. Running jhat -J-Xmx512m heap.bin will ...
2015-10-29 14:51:18
112
原创 kafka 获取metadata
数据 NetworkClient类,方法poll,检查metadata是否需要更新方法: /** * Add a metadata request to the list of sends if we can make one */ private void maybeUpdateMetadata(List<NetworkSend> sends, long now) { // Beware that the
2015-10-14 18:48:11
2051
Original: kafka leader balance
Balancing leadership: whenever a broker stops or crashes, leadership for that broker's partitions transfers to other replicas. This means that by default, when the broker is restarted it will only be a follower for all its partitions, meaning it will not be used for client reads and writes. To avoid this imbalance, Kafka has a notion of preferred replicas. If the list of replicas for a partition is 1,5,9 then node 1 is preferred as the leader to...
2015-10-14 13:23:35
176
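A tiny sketch of the rule stated above, assuming a hand-written replica-assignment map (Kafka's controller keeps the real state; the election itself is normally triggered by the preferred-replica election tool or by auto.leader.rebalance.enable):

```scala
// Sketch only: the "preferred replica" is the first broker in a partition's replica list.
object PreferredLeader extends App {
  // invented assignment: partition -> replica list
  val assignment: Map[Int, Seq[Int]] = Map(0 -> Seq(1, 5, 9), 1 -> Seq(5, 9, 1), 2 -> Seq(9, 1, 5))

  def preferredLeader(replicas: Seq[Int]): Option[Int] = replicas.headOption

  assignment.toSeq.sortBy(_._1).foreach { case (p, replicas) =>
    println(s"partition $p: replicas=${replicas.mkString(",")} preferred leader=${preferredLeader(replicas).getOrElse(-1)}")
  }
}
```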
kafka broker crash & leader election
ener() extends IZkChildListener with Logging { this.logIdent = "[BrokerChangeListener on Controller " + controller.config.brokerId + "]: " def handleChildChange(parentPath : String, currentBrokerList : java.util.List[String])
2015-10-09 16:40:50
1343
scala variables and collections
var is mutable and can be reassigned; assigning "_" gives the default value (0, false, null), for example: var d:Double = _ // d = 0.0  var i:Int = _ // i = 0  var s:String = _ // s = null. val is immutable: val (x,y) = (10, "hello"). def re-evaluates its body on every reference and is handy as a method result: def t = System.currentTimeMillis // different each time. Type conversion: 1. forced cast: var i = 10.asInstanceOf[Double] // type coercion; println(i); println(List('A','B','C')...
2015-09-11 17:46:00
108
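A runnable sketch of these rules; the variables live in an object because the "_" default initializer is only allowed on fields, not on locals:

```scala
// Sketch of the rules above: "_" defaults, immutable val, and def re-evaluation.
object VarValDef extends App {
  var d: Double = _   // 0.0
  var i: Int = _      // 0
  var s: String = _   // null

  val (x, y) = (10, "hello")        // val: immutable, destructured pair

  def t = System.currentTimeMillis  // def: body re-evaluated on every reference

  println(s"$d $i $s $x $y")        // 0.0 0 null 10 hello
  println(10.asInstanceOf[Double])  // forced cast: 10.0
  val t1 = t; Thread.sleep(5); val t2 = t
  println(t1 == t2)                 // usually false, because each use of t re-runs the body
}
```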
scala functions
Scala functions: 1. a normal function: def normalReturn(x:Int,y:Int):Double = { return x*y*0.1; } 2. no return value: leave out the equals sign, or declare the return type as Unit: def noRetrun():Unit = { println("1000") } def noRetrun2(x:Any) { println("no return") return x } 3. defined as a mapping from Int to Double: def f:Int=>Double = { case 1 => 0.1 case 2 => 0.2 case _ => 0.0 }
2015-09-11 17:01:29
138
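A minimal sketch that calls each of the three forms (identifiers renamed and trimmed from the post's versions):

```scala
// Sketch: exercising a normal function, a Unit function, and a pattern-matching mapping.
object FunctionForms extends App {
  def normalReturn(x: Int, y: Int): Double = x * y * 0.1
  def noReturn(): Unit = println("1000")
  def f: Int => Double = { case 1 => 0.1; case 2 => 0.2; case _ => 0.0 }

  println(normalReturn(3, 4))           // 1.2 (a Double, so possibly 1.2000000000000002)
  noReturn()                            // prints 1000; the result is ()
  println(s"${f(1)} ${f(2)} ${f(42)}")  // 0.1 0.2 0.0
}
```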
Original: scala eclipse maven environment setup
lns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <
2015-09-11 16:30:57
135
Original: kafka producer, server side
... the KafkaApis class, which calls the handleProducerOrOffsetCommitRequest method: def handle(request: RequestChannel.Request) { try{ trace("Handling request: " + request.requestObj + " from client: " + request.remoteAddress) request.requestId match {...
2015-09-01 15:56:17
175
Original: the kafka KafkaRequestHandlerPool class
KafkaRequestHandlerPool is the pool of KafkaRequestHandler threads; it takes requests off the request queue, and the actual handling is delegated to the KafkaApis class. for(i <- 0 until numThreads) { runnables(i) = new KafkaRequestHandler(i, brokerId, aggregateIdleMeter, numThreads, requestChannel, apis) threads(i) = Utils.daemonThread("kafka-request-handler-" + i, runnables(i)) threads(i).start() } The run method: def run() { while(true) { tr...
2015-09-01 15:12:43
153
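The same pattern as a self-contained sketch: plain JDK daemon threads and a BlockingQueue stand in for Kafka's Utils.daemonThread and RequestChannel, and the class and method names below are invented for illustration.

```scala
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}

// Sketch of the handler-pool pattern: N daemon threads loop forever,
// pulling requests from a shared queue and delegating the real work.
class HandlerPool(numThreads: Int, handle: String => Unit) {
  private val queue = new LinkedBlockingQueue[String]()

  (0 until numThreads).foreach { i =>
    val t = new Thread(() => {
      while (true) {
        val req = queue.poll(300, TimeUnit.MILLISECONDS) // wake up periodically instead of blocking forever
        if (req != null) handle(req)                     // delegate, as the pool delegates to KafkaApis
      }
    }, s"request-handler-$i")
    t.setDaemon(true)
    t.start()
  }

  def submit(request: String): Unit = queue.put(request)
}
```

Each thread plays the part of a KafkaRequestHandler run loop, and the handle function plays the part of KafkaApis.handle.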
Original: the kafka ReplicaManager class
// start ISR expiration thread scheduler.schedule("isr-expiration", maybeShrinkIsr, period = config.replicaLagTimeMaxMs, unit = TimeUnit.MILLISECONDS) } The main method is maybeShrinkIsr: private def maybeShrinkIsr(): Unit = { trace("...
2015-08-27 13:35:17
196
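The periodic scheduling can be sketched with a plain JDK scheduler (Kafka uses its own KafkaScheduler; the period below is a placeholder for config.replicaLagTimeMaxMs and the check body is a stub):

```scala
import java.util.concurrent.{Executors, TimeUnit}

// Sketch: run an "isr-expiration"-style check periodically, the way
// ReplicaManager schedules maybeShrinkIsr at startup.
object PeriodicCheck extends App {
  def maybeShrinkIsr(): Unit = println("checking for lagging replicas...")

  val scheduler = Executors.newSingleThreadScheduledExecutor()
  val periodMs = 10000L // placeholder for config.replicaLagTimeMaxMs
  scheduler.scheduleAtFixedRate(() => maybeShrinkIsr(), periodMs, periodMs, TimeUnit.MILLISECONDS)

  Thread.sleep(periodMs * 3) // let the check fire a few times, then stop
  scheduler.shutdown()
}
```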
the kafka TopicConfigManager class
...log (each topic/partition pair corresponds to one log) configuration. /** * Register the config-change listener * Begin watching for config changes */ def startup() { ZkUtils.makeSurePersistentPathExists(zkClient, ZkUtils.TopicConfigChangesPath) // watch the children of /config/changes; Confi...
2015-08-27 11:24:13
166
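A hedged sketch of the child-watch idea using the raw ZooKeeper client instead of Kafka's ZkUtils/ZkClient; the connect string and timeout are placeholders, and the path is assumed to already exist:

```scala
import org.apache.zookeeper.{WatchedEvent, Watcher, ZooKeeper}
import org.apache.zookeeper.Watcher.Event.EventType

// Sketch: watch the children of /config/changes and react when a new
// change-notification node appears, similar in spirit to TopicConfigManager.
object ConfigChangeWatch extends App {
  val path = "/config/changes"
  val zk = new ZooKeeper("localhost:2181", 30000, new Watcher {
    def process(event: WatchedEvent): Unit = () // connection-state events ignored here
  })

  def watchChildren(): Unit = {
    val children = zk.getChildren(path, new Watcher {
      def process(event: WatchedEvent): Unit =
        if (event.getType == EventType.NodeChildrenChanged) watchChildren() // watches are one-shot, re-register
    })
    println(s"current change notifications: $children")
  }

  watchChildren()
  Thread.sleep(Long.MaxValue) // keep the process alive so the watch can fire
}
```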
Original: the kafka logManager class, kafka's storage mechanism
messageSet: the channel class for each log file. base offset: the absolute offset within the topic partition. offsetIndex: the memory-mapped class for each log index, storing relative offset values and file positions. A topic is split by partition and distributed across machines; a partition has multiple log files, and each log file has an index file. The log file holds the actual data; the index file holds the relative offsets of that data and their positions inside the log file.
2015-08-26 17:31:44
284
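A toy model of the relative-offset lookup described above; the numbers and the in-memory Seq are invented, whereas Kafka's real OffsetIndex is a memory-mapped file of (relative offset, position) pairs:

```scala
// Sketch: a segment starts at baseOffset; the index stores
// (offset - baseOffset) -> byte position in the log file. To locate an
// absolute offset, take the greatest indexed entry at or below it.
case class SegmentIndex(baseOffset: Long, entries: Seq[(Int, Long)]) {
  // entries: (relativeOffset, filePosition), sorted by relativeOffset
  def lookup(absoluteOffset: Long): Option[Long] = {
    val relative = (absoluteOffset - baseOffset).toInt
    entries.takeWhile(_._1 <= relative).lastOption.map(_._2)
  }
}

object SegmentIndexDemo extends App {
  val index = SegmentIndex(baseOffset = 5000L, entries = Seq(0 -> 0L, 100 -> 4096L, 200 -> 8192L))
  println(index.lookup(5150L)) // Some(4096): start scanning the log file at byte 4096
}
```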
Original: scala collections
... apply a series of transformations; the language itself provides many powerful functions over collections. This post uses List as the example type and introduces the common collection transformations. 1. Common operators (operators are themselves functions): ++  ++[B](that: GenTraversableOnce[B]): List[B] appends another list at the tail of the list. ++:  ++:[B >: A, That](that: collection.Traversable[B])(implicit bf: CanBuildFrom[List[A], B, That]): That ...
2015-08-14 17:25:00
102
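A quick check of the two operators (standard library behavior):

```scala
// ++ appends the right operand after the left. ++: keeps the same element
// order but is right-associative, i.e. a call on the right operand, so the
// result type follows the right-hand collection.
object ConcatOps extends App {
  println(List(1, 2) ++ List(3, 4))     // List(1, 2, 3, 4)
  println(List(1, 2) ++: Vector(3, 4))  // Vector(1, 2, 3, 4)
}
```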
Original: java FileChannel
FileChannel in Java NIO is a channel connected to a file; a file can be read and written through the channel. FileChannel cannot be put into non-blocking mode; it always runs in blocking mode. The FileChannel map method: the index in kafka is implemented with a MappedByteBuffer (mbb), keeping file and memory in sync. public static MappedByteBuffer generateChannelMap(String filepath) throws IOException{ File f = new File(filepath); boolean isnew = f.createNewFile(); System.out.println(isnew); RandomAccessFile raf=null; ...
2015-08-14 15:42:32
115
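Since the excerpt cuts off, here is a hedged Scala sketch of the same idea using FileChannel.map; the path and size are placeholders, and this is not the post's original generateChannelMap:

```scala
import java.io.RandomAccessFile
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel

// Sketch: map the first 4 KB of a file into memory; writes to the buffer are
// reflected in the file, which is how an mmap-backed index stays in sync.
object MapFile extends App {
  val raf = new RandomAccessFile("/tmp/demo.idx", "rw") // created if it does not exist
  try {
    val channel: FileChannel = raf.getChannel
    val buf: MappedByteBuffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096)
    buf.putLong(0L) // write a first entry directly into the mapped region
    buf.force()     // flush the mapped region to disk
  } finally raf.close()
}
```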
Original: hbase Number of empty REGIONINFO_QUALIFIER rows in hbase:meta: 1
A machine-room power outage left holes in hbase; following the post at http://blackproof.iteye.com/blog/2052898, the extra regions in meta can be deleted. hbase hbck reports: ERROR: Empty REGIONINFO_QUALIFIER found in hbase:meta. You need hbase hbck -details to show the offending rows: ERROR: Empty REGIONINFO_QUALIFIER found in hbase:meta. 2015-07-14 15:23:23,082 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0xd34e8a5d42640f98 2015-0...
2015-08-06 13:49:13
291
Original: storm Async loop died! & reconnect
When a supervisor is restarted, the topology starts throwing errors and all spouts stop consuming: 2015-07-15T09:48:26.470+0800 b.s.util [ERROR] Async loop died! java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.3.jar:0.9.3] at backtype.storm.utils.DisruptorQueue.consumeBat...
2015-08-06 13:48:16
368
Original: ERROR: Found lingering reference file hdfs
ency-in-table-hbase This looks like you had a failed region split, see [HBASE-8052] (https://issues.apache.org/jira/browse/HBASE-8502) for more details.This bug leaves references to parent regions that have been moved in HDFS. To fix, just delet
2015-08-03 18:11:15
465
Original: kafka leader reassignment with kafka-preferred-replica-election.sh
bin/kafka-preferred-replica-election.sh --zookeeper hostzk/kafka-real; bin/kafka-preferred-replica-election.sh --zookeeper localhost:12913/kafka --path-to-json-file topicPartitionList.json. topicPartitionList.json: {"partitions":[{"topic":"topic","partition": 0},{"topic":"topic","partition": 1},{"topic":"topic","partition": 2},{"topic"...
2015-07-17 17:45:04
760
Original: python shallow and deep copy, type conversion, json operations, array operations
... copy and deep copy. Python type conversion functions: int(x [,base]) converts x to an integer; long(x [,base]) converts x to a long integer; float(x) converts x to a floating-point number; complex(real [,imag]) creates a complex number; str(x) converts object x to a string; repr(x...
2015-07-10 14:07:11
1116
Original: kafka wire protocol (2), in detail
Kafka does not route a message to a topic partition by itself, so the producer must send it to the broker that holds the partition. A client can obtain cluster metadata from any broker and find the leader broker of a partition. When the leader broker fails while handling data there are two cases: 1. the broker has died; 2. the broker no longer hosts this partition; so it is necessary to...
2015-07-10 11:12:45
220
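A sketch of the retry loop this implies; fetchMetadata and send below are hypothetical stand-ins for the real metadata and produce requests, not Kafka's client API:

```scala
// Sketch: on a send failure, refresh cluster metadata from any live broker,
// find the partition's new leader, and retry.
object LeaderAwareSend {
  type Broker = String

  def fetchMetadata(anyBroker: Broker, topic: String, partition: Int): Option[Broker] = Some("broker-2:9092") // stub
  def send(leader: Broker, payload: Array[Byte]): Boolean = true // stub: true = acknowledged

  def sendWithRetry(brokers: Seq[Broker], topic: String, partition: Int, payload: Array[Byte], retries: Int = 3): Boolean = {
    var leader = fetchMetadata(brokers.head, topic, partition)
    var attempt = 0
    while (attempt < retries) {
      if (leader.exists(send(_, payload))) return true
      // leader dead or partition moved: ask another broker for fresh metadata
      leader = fetchMetadata(brokers(attempt % brokers.size), topic, partition)
      attempt += 1
    }
    false
  }
}
```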
Hadoop command manual
2012-10-15
findbugs-3.0.1.tar.gz
2015-04-02
Introduction to recommender systems
2014-09-17
Java Linux installation package, part 2
2013-01-31