Original: the hbase hbck process
The hbase hbck flow in the HBaseFsck class. hbck is a heavyweight admin tool: it talks to every regionserver, scans the entire meta table, and reads the regioninfo of every table region, so do not run hbck frequently or it will put pressure on hbase. /** * This repair method requires the cluster to be online since it conta...
2015-12-21 16:23:24
548
Original: spark tachyon setup and configuration
mvn -Dhadoop.version=2.3.0 -DskipTests clean package. In spark-env.sh, because hdfs and hive need to be accessed, the lzo compression jar and the mysql driver are required: export SPARK_CLASSPATH=$SPARK_CLASSPATH:/data/hadoop/hadoop-2.3.0-cdh5.1.0/share/hadoop/common/hadoop-lzo-0.4.20-SNAPSHOT.jar export SPARK_LIBRARY_P...
2015-11-19 18:20:30
396
Original: hive sql optimization
..., which is controlled by the parameters hive.auto.convert.join=true and hive.smalltable.filesize=25000000L (the default is 25M). If the table file is around 25M, this parameter can be tuned so that a map-side join is performed and a reduce-side join is avoided. A map join can also be declared explicitly, which is especially useful when joining a small table to a large one: SELECT /*+ MAPJOIN(b) */ a.key, a.value FROM a join b on a.key = b.key2.
2015-11-13 17:47:40
211
Original: tachyon commands
ata/stg/source/s_emarbox_prod_browse/20151120/17/2015-11-20_17_ec_detail.dat
2015-11-11 13:45:35
163
Original: storm kafka consumer
Preliminaries, some related classes: GlobalPartitionInformation (storm.kafka.trident) records the mapping from partition id to broker. GlobalPartitionInformation info = new GlobalPartitionInformation(); info.addPartition(0, new Broker("10.1.110.24",9092)); info.addPartition(0, n...
2015-11-06 15:58:30
174
Original: viewing Java memory usage (repost)
Reposted from http://mxsfengg.iteye.com/blog/975393. jmap shows how much memory objects occupy inside the JVM, and provides a convenient command to dump the JVM's memory information to a file: jmap -dump:format=b,file=heap.bin <pid>. jhat can parse a Java heap-dump file, generate reports, and start a web server for queries, so the memory information can be browsed from a browser; jhat also offers a SQL-like query language, OQL. Running jhat -J-Xmx512m heap.bin will ...
2015-10-29 14:51:18
112
原创 kafka 获取metadata
数据 NetworkClient类,方法poll,检查metadata是否需要更新方法: /** * Add a metadata request to the list of sends if we can make one */ private void maybeUpdateMetadata(List<NetworkSend> sends, long now) { // Beware that the
2015-10-14 18:48:11
2051
Original: kafka leader balance
Balancing leadership: whenever a broker stops or crashes, leadership for that broker's partitions transfers to other replicas. This means that by default, when the broker is restarted it will only be a follower for all its partitions, meaning it will not be used for client reads and writes. To avoid this imbalance, Kafka has a notion of preferred replicas. If the list of replicas for a partition is 1,5,9 then node 1 is preferred as the leader to...
2015-10-14 13:23:35
176
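A tiny sketch of the rule stated above, assuming a hand-written replica-assignment map (Kafka's controller keeps the real state; the election itself is normally triggered by the preferred-replica election tool or by auto.leader.rebalance.enable):

```scala
// Sketch only: the "preferred replica" is the first broker in a partition's replica list.
object PreferredLeader extends App {
  // invented assignment: partition -> replica list
  val assignment: Map[Int, Seq[Int]] = Map(0 -> Seq(1, 5, 9), 1 -> Seq(5, 9, 1), 2 -> Seq(9, 1, 5))

  def preferredLeader(replicas: Seq[Int]): Option[Int] = replicas.headOption

  assignment.toSeq.sortBy(_._1).foreach { case (p, replicas) =>
    println(s"partition $p: replicas=${replicas.mkString(",")} preferred leader=${preferredLeader(replicas).getOrElse(-1)}")
  }
}
```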
kafka broker crash & leader election
ener() extends IZkChildListener with Logging { this.logIdent = "[BrokerChangeListener on Controller " + controller.config.brokerId + "]: " def handleChildChange(parentPath : String, currentBrokerList : java.util.List[String])
2015-10-09 16:40:50
1343
scala variables and collections
var is mutable and can be reassigned; assigning "_" gives the default value (0, false, null), for example: var d:Double = _ // d = 0.0  var i:Int = _ // i = 0  var s:String = _ // s = null. val is immutable: val (x,y) = (10, "hello"). def re-evaluates its body on every reference and is handy as a method result: def t = System.currentTimeMillis // different each time. Type conversion: 1. forced cast: var i = 10.asInstanceOf[Double] // type coercion; println(i); println(List('A','B','C')...
2015-09-11 17:46:00
108
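A runnable sketch of these rules; the variables live in an object because the "_" default initializer is only allowed on fields, not on locals:

```scala
// Sketch of the rules above: "_" defaults, immutable val, and def re-evaluation.
object VarValDef extends App {
  var d: Double = _   // 0.0
  var i: Int = _      // 0
  var s: String = _   // null

  val (x, y) = (10, "hello")        // val: immutable, destructured pair

  def t = System.currentTimeMillis  // def: body re-evaluated on every reference

  println(s"$d $i $s $x $y")        // 0.0 0 null 10 hello
  println(10.asInstanceOf[Double])  // forced cast: 10.0
  val t1 = t; Thread.sleep(5); val t2 = t
  println(t1 == t2)                 // usually false, because each use of t re-runs the body
}
```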
scala functions
Scala functions: 1. a normal function: def normalReturn(x:Int,y:Int):Double = { return x*y*0.1; } 2. no return value: leave out the equals sign, or declare the return type as Unit: def noRetrun():Unit = { println("1000") } def noRetrun2(x:Any) { println("no return") return x } 3. defined as a mapping from Int to Double: def f:Int=>Double = { case 1 => 0.1 case 2 => 0.2 case _ => 0.0 }
2015-09-11 17:01:29
138
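A minimal sketch that calls each of the three forms (identifiers renamed and trimmed from the post's versions):

```scala
// Sketch: exercising a normal function, a Unit function, and a pattern-matching mapping.
object FunctionForms extends App {
  def normalReturn(x: Int, y: Int): Double = x * y * 0.1
  def noReturn(): Unit = println("1000")
  def f: Int => Double = { case 1 => 0.1; case 2 => 0.2; case _ => 0.0 }

  println(normalReturn(3, 4))           // 1.2 (a Double, so possibly 1.2000000000000002)
  noReturn()                            // prints 1000; the result is ()
  println(s"${f(1)} ${f(2)} ${f(42)}")  // 0.1 0.2 0.0
}
```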
Original: scala eclipse maven environment setup
lns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> <modelVersion>4.0.0</modelVersion> <
2015-09-11 16:30:57
135
Original: kafka producer, server side
... the KafkaApis class, which calls the handleProducerOrOffsetCommitRequest method: def handle(request: RequestChannel.Request) { try{ trace("Handling request: " + request.requestObj + " from client: " + request.remoteAddress) request.requestId match {...
2015-09-01 15:56:17
175
Original: the kafka KafkaRequestHandlerPool class
KafkaRequestHandlerPool is the pool of KafkaRequestHandler threads; it takes requests off the request queue, and the actual handling is delegated to the KafkaApis class. for(i <- 0 until numThreads) { runnables(i) = new KafkaRequestHandler(i, brokerId, aggregateIdleMeter, numThreads, requestChannel, apis) threads(i) = Utils.daemonThread("kafka-request-handler-" + i, runnables(i)) threads(i).start() } The run method: def run() { while(true) { tr...
2015-09-01 15:12:43
153
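The same pattern as a self-contained sketch: plain JDK daemon threads and a BlockingQueue stand in for Kafka's Utils.daemonThread and RequestChannel, and the class and method names below are invented for illustration.

```scala
import java.util.concurrent.{LinkedBlockingQueue, TimeUnit}

// Sketch of the handler-pool pattern: N daemon threads loop forever,
// pulling requests from a shared queue and delegating the real work.
class HandlerPool(numThreads: Int, handle: String => Unit) {
  private val queue = new LinkedBlockingQueue[String]()

  (0 until numThreads).foreach { i =>
    val t = new Thread(() => {
      while (true) {
        val req = queue.poll(300, TimeUnit.MILLISECONDS) // wake up periodically instead of blocking forever
        if (req != null) handle(req)                     // delegate, as the pool delegates to KafkaApis
      }
    }, s"request-handler-$i")
    t.setDaemon(true)
    t.start()
  }

  def submit(request: String): Unit = queue.put(request)
}
```

Each thread plays the part of a KafkaRequestHandler run loop, and the handle function plays the part of KafkaApis.handle.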
Original: the kafka ReplicaManager class
// start ISR expiration thread scheduler.schedule("isr-expiration", maybeShrinkIsr, period = config.replicaLagTimeMaxMs, unit = TimeUnit.MILLISECONDS) } The main method is maybeShrinkIsr: private def maybeShrinkIsr(): Unit = { trace("...
2015-08-27 13:35:17
196
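The periodic scheduling can be sketched with a plain JDK scheduler (Kafka uses its own KafkaScheduler; the period below is a placeholder for config.replicaLagTimeMaxMs and the check body is a stub):

```scala
import java.util.concurrent.{Executors, TimeUnit}

// Sketch: run an "isr-expiration"-style check periodically, the way
// ReplicaManager schedules maybeShrinkIsr at startup.
object PeriodicCheck extends App {
  def maybeShrinkIsr(): Unit = println("checking for lagging replicas...")

  val scheduler = Executors.newSingleThreadScheduledExecutor()
  val periodMs = 10000L // placeholder for config.replicaLagTimeMaxMs
  scheduler.scheduleAtFixedRate(() => maybeShrinkIsr(), periodMs, periodMs, TimeUnit.MILLISECONDS)

  Thread.sleep(periodMs * 3) // let the check fire a few times, then stop
  scheduler.shutdown()
}
```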
the kafka TopicConfigManager class
...log (each topic/partition pair corresponds to one log) configuration. /** * Register the config-change listener * Begin watching for config changes */ def startup() { ZkUtils.makeSurePersistentPathExists(zkClient, ZkUtils.TopicConfigChangesPath) // watch the children of /config/changes; Confi...
2015-08-27 11:24:13
166
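A hedged sketch of the child-watch idea using the raw ZooKeeper client instead of Kafka's ZkUtils/ZkClient; the connect string and timeout are placeholders, and the path is assumed to already exist:

```scala
import org.apache.zookeeper.{WatchedEvent, Watcher, ZooKeeper}
import org.apache.zookeeper.Watcher.Event.EventType

// Sketch: watch the children of /config/changes and react when a new
// change-notification node appears, similar in spirit to TopicConfigManager.
object ConfigChangeWatch extends App {
  val path = "/config/changes"
  val zk = new ZooKeeper("localhost:2181", 30000, new Watcher {
    def process(event: WatchedEvent): Unit = () // connection-state events ignored here
  })

  def watchChildren(): Unit = {
    val children = zk.getChildren(path, new Watcher {
      def process(event: WatchedEvent): Unit =
        if (event.getType == EventType.NodeChildrenChanged) watchChildren() // watches are one-shot, re-register
    })
    println(s"current change notifications: $children")
  }

  watchChildren()
  Thread.sleep(Long.MaxValue) // keep the process alive so the watch can fire
}
```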
Original: the kafka logManager class, kafka's storage mechanism
messageSet: the channel class for each log file. base offset: the absolute offset within the topic partition. offsetIndex: the memory-mapped class for each log index, storing relative offset values and file positions. A topic is split by partition and distributed across machines; a partition has multiple log files, and each log file has an index file. The log file holds the actual data; the index file holds the relative offsets of that data and their positions inside the log file.
2015-08-26 17:31:44
284
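A toy model of the relative-offset lookup described above; the numbers and the in-memory Seq are invented, whereas Kafka's real OffsetIndex is a memory-mapped file of (relative offset, position) pairs:

```scala
// Sketch: a segment starts at baseOffset; the index stores
// (offset - baseOffset) -> byte position in the log file. To locate an
// absolute offset, take the greatest indexed entry at or below it.
case class SegmentIndex(baseOffset: Long, entries: Seq[(Int, Long)]) {
  // entries: (relativeOffset, filePosition), sorted by relativeOffset
  def lookup(absoluteOffset: Long): Option[Long] = {
    val relative = (absoluteOffset - baseOffset).toInt
    entries.takeWhile(_._1 <= relative).lastOption.map(_._2)
  }
}

object SegmentIndexDemo extends App {
  val index = SegmentIndex(baseOffset = 5000L, entries = Seq(0 -> 0L, 100 -> 4096L, 200 -> 8192L))
  println(index.lookup(5150L)) // Some(4096): start scanning the log file at byte 4096
}
```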
Original: scala collections
... apply a series of transformations; the language itself provides many powerful functions over collections. This post uses List as the example type and introduces the common collection transformations. 1. Common operators (operators are themselves functions): ++  ++[B](that: GenTraversableOnce[B]): List[B] appends another list at the tail of the list. ++:  ++:[B >: A, That](that: collection.Traversable[B])(implicit bf: CanBuildFrom[List[A], B, That]): That ...
2015-08-14 17:25:00
102
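A quick check of the two operators (standard library behavior):

```scala
// ++ appends the right operand after the left. ++: keeps the same element
// order but is right-associative, i.e. a call on the right operand, so the
// result type follows the right-hand collection.
object ConcatOps extends App {
  println(List(1, 2) ++ List(3, 4))     // List(1, 2, 3, 4)
  println(List(1, 2) ++: Vector(3, 4))  // Vector(1, 2, 3, 4)
}
```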
Original: java FileChannel
FileChannel in Java NIO is a channel connected to a file; a file can be read and written through the channel. FileChannel cannot be put into non-blocking mode; it always runs in blocking mode. The FileChannel map method: the index in kafka is implemented with a MappedByteBuffer (mbb), keeping file and memory in sync. public static MappedByteBuffer generateChannelMap(String filepath) throws IOException{ File f = new File(filepath); boolean isnew = f.createNewFile(); System.out.println(isnew); RandomAccessFile raf=null; ...
2015-08-14 15:42:32
115
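Since the excerpt cuts off, here is a hedged Scala sketch of the same idea using FileChannel.map; the path and size are placeholders, and this is not the post's original generateChannelMap:

```scala
import java.io.RandomAccessFile
import java.nio.MappedByteBuffer
import java.nio.channels.FileChannel

// Sketch: map the first 4 KB of a file into memory; writes to the buffer are
// reflected in the file, which is how an mmap-backed index stays in sync.
object MapFile extends App {
  val raf = new RandomAccessFile("/tmp/demo.idx", "rw") // created if it does not exist
  try {
    val channel: FileChannel = raf.getChannel
    val buf: MappedByteBuffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, 4096)
    buf.putLong(0L) // write a first entry directly into the mapped region
    buf.force()     // flush the mapped region to disk
  } finally raf.close()
}
```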
Original: hbase Number of empty REGIONINFO_QUALIFIER rows in hbase:meta: 1
A machine-room power outage left holes in hbase; following the post at http://blackproof.iteye.com/blog/2052898, the extra regions in meta can be deleted. hbase hbck reports: ERROR: Empty REGIONINFO_QUALIFIER found in hbase:meta. You need hbase hbck -details to show the offending rows: ERROR: Empty REGIONINFO_QUALIFIER found in hbase:meta. 2015-07-14 15:23:23,082 INFO [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0xd34e8a5d42640f98 2015-0...
2015-08-06 13:49:13
291
Original: storm Async loop died! & reconnect
When a supervisor is restarted, the topology starts throwing errors and all spouts stop consuming: 2015-07-15T09:48:26.470+0800 b.s.util [ERROR] Async loop died! java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.3.jar:0.9.3] at backtype.storm.utils.DisruptorQueue.consumeBat...
2015-08-06 13:48:16
368
Original: ERROR: Found lingering reference file hdfs
ency-in-table-hbase This looks like you had a failed region split, see [HBASE-8052] (https://issues.apache.org/jira/browse/HBASE-8502) for more details.This bug leaves references to parent regions that have been moved in HDFS. To fix, just delet
2015-08-03 18:11:15
465
Original: kafka leader reassignment with kafka-preferred-replica-election.sh
bin/kafka-preferred-replica-election.sh --zookeeper hostzk/kafka-real; bin/kafka-preferred-replica-election.sh --zookeeper localhost:12913/kafka --path-to-json-file topicPartitionList.json. topicPartitionList.json: {"partitions":[{"topic":"topic","partition": 0},{"topic":"topic","partition": 1},{"topic":"topic","partition": 2},{"topic"...
2015-07-17 17:45:04
760
Original: python shallow and deep copy, type conversion, json operations, array operations
... copy and deep copy. Python type conversion functions: int(x [,base]) converts x to an integer; long(x [,base]) converts x to a long integer; float(x) converts x to a floating-point number; complex(real [,imag]) creates a complex number; str(x) converts object x to a string; repr(x...
2015-07-10 14:07:11
1116
Original: kafka wire protocol (2), in detail
Kafka does not route a message to a topic partition by itself, so the producer must send it to the broker that holds the partition. A client can obtain cluster metadata from any broker and find the leader broker of a partition. When the leader broker fails while handling data there are two cases: 1. the broker has died; 2. the broker no longer hosts this partition; so it is necessary to...
2015-07-10 11:12:45
220
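A sketch of the retry loop this implies; fetchMetadata and send below are hypothetical stand-ins for the real metadata and produce requests, not Kafka's client API:

```scala
// Sketch: on a send failure, refresh cluster metadata from any live broker,
// find the partition's new leader, and retry.
object LeaderAwareSend {
  type Broker = String

  def fetchMetadata(anyBroker: Broker, topic: String, partition: Int): Option[Broker] = Some("broker-2:9092") // stub
  def send(leader: Broker, payload: Array[Byte]): Boolean = true // stub: true = acknowledged

  def sendWithRetry(brokers: Seq[Broker], topic: String, partition: Int, payload: Array[Byte], retries: Int = 3): Boolean = {
    var leader = fetchMetadata(brokers.head, topic, partition)
    var attempt = 0
    while (attempt < retries) {
      if (leader.exists(send(_, payload))) return true
      // leader dead or partition moved: ask another broker for fresh metadata
      leader = fetchMetadata(brokers(attempt % brokers.size), topic, partition)
      attempt += 1
    }
    false
  }
}
```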
Hadoop command manual
2012-10-15
findbugs-3.0.1.tar.gz
2015-04-02
Introduction to recommender systems
2014-09-17
Java Linux installation package, part 2
2013-01-31