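1. What happened
The cluster runs CDH 6.2.0 and uses Sentry for authorization. After a Hadoop configuration change and restart, both the Hive client and direct file loads failed with file permission errors, while services that go through HiveServer2 kept working (with HiveServer2, authorization is normally done in Sentry, and once it passes, HDFS is accessed as the hive user). The grants in Sentry looked correct, but the directories and files on HDFS had lost all of their ACL entries.
2. Cause of the failure and how it was tracked down
Searching the Hadoop NameNode log turned up an exception from the task that syncs Sentry permissions to HDFS ACLs: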
2024-11-21 12:31:58,976 ERROR org.apache.sentry.core.common.transport.RetryClientInvocationHandler: failed to execute getAllUpdatesFrom
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor294.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.sentry.core.common.transport.RetryClientInvocationHandler.invokeImpl(RetryClientInvocationHandler.java:95)
at org.apache.sentry.core.common.transport.SentryClientInvocationHandler.invoke(SentryClientInvocationHandler.java:41)
at com.sun.proxy.$Proxy22.getAllUpdatesFrom(Unknown Source)
at org.apache.sentry.hdfs.SentryUpdater.getUpdates(SentryUpdater.java:49)
at org.apache.sentry.hdfs.SentryAuthorizationInfo.update(SentryAuthorizationInfo.java:125)
at org.apache.sentry.hdfs.SentryAuthorizationInfo.run(SentryAuthorizationInfo.java:220)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.sentry.core.common.exception.SentryHdfsServiceException: Thrift Exception occurred !!
at org.apache.sentry.hdfs.SentryHDFSServiceClientDefaultImpl.getAllUpdatesFrom(SentryHDFSServiceClientDefaultImpl.java:140)
... 16 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.protocol.TProtocolDecorator.readMessageBegin(TProtocolDecorator.java:135)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
at org.apache.sentry.hdfs.service.thrift.SentryHDFSService$Client.recv_get_authz_updates(SentryHDFSService.java:171)
at org.apache.sentry.hdfs.service.thrift.SentryHDFSService$Client.get_authz_updates(SentryHDFSService.java:158)
at org.apache.sentry.hdfs.SentryHDFSServiceClientDefaultImpl.getAllUpdatesFrom(SentryHDFSServiceClientDefaultImpl.java:107)
... 16 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 30 more
2024-11-21 12:31:58,977 ERROR org.apache.sentry.core.common.transport.RetryClientInvocationHandler: Thrift call failed
org.apache.thrift.transport.TTransportException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.sentry.core.common.transport.RetryClientInvocationHandler.invokeImpl(RetryClientInvocationHandler.java:110)
at org.apache.sentry.core.common.transport.SentryClientInvocationHandler.invoke(SentryClientInvocationHandler.java:41)
at com.sun.proxy.$Proxy22.getAllUpdatesFrom(Unknown Source)
at org.apache.sentry.hdfs.SentryUpdater.getUpdates(SentryUpdater.java:49)
at org.apache.sentry.hdfs.SentryAuthorizationInfo.update(SentryAuthorizationInfo.java:125)
at org.apache.sentry.hdfs.SentryAuthorizationInfo.run(SentryAuthorizationInfo.java:220)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.protocol.TProtocolDecorator.readMessageBegin(TProtocolDecorator.java:135)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
at org.apache.sentry.hdfs.service.thrift.SentryHDFSService$Client.recv_get_authz_updates(SentryHDFSService.java:171)
at org.apache.sentry.hdfs.service.thrift.SentryHDFSService$Client.get_authz_updates(SentryHDFSService.java:158)
at org.apache.sentry.hdfs.SentryHDFSServiceClientDefaultImpl.getAllUpdatesFrom(SentryHDFSServiceClientDefaultImpl.java:107)
at sun.reflect.GeneratedMethodAccessor294.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.sentry.core.common.transport.RetryClientInvocationHandler.invokeImpl(RetryClientInvocationHandler.java:95)
... 12 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
... 30 more
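The message is fairly clear: the Sentry HDFS plugin timed out while syncing Sentry permissions to HDFS. However, the Sentry project is no longer maintained and there is very little material online, so it was not obvious which parameters to change. The only option left was to read the source code.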
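To make the root cause at the bottom of the trace concrete, here is a minimal sketch of how a blocking socket read ends in `java.net.SocketTimeoutException` when the peer does not answer within the configured timeout. It is not Sentry or Thrift code: the class `ReadTimeoutDemo`, the method `readWithTimeout`, and the 200 ms value are hypothetical names and numbers chosen for illustration, and the "server" is just a local socket that never replies.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a local server socket that never sends any data and
     * returns the message of the resulting timeout exception, or null
     * if the read unexpectedly completed.
     */
    public static String readWithTimeout(int timeoutMillis) throws IOException {
        // Port 0 lets the OS pick a free port; loopback keeps the demo local.
        // The server never calls accept() or writes, so the client read blocks.
        try (ServerSocket silentServer =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(),
                                        silentServer.getLocalPort())) {
            // Upper bound on how long a single read() may block.
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read();
                return null;
            } catch (SocketTimeoutException e) {
                // Same exception type as the innermost "Caused by" in the log.
                return e.getMessage();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readWithTimeout(200));
    }
}
```

In the NameNode log the same thing happens one layer down: the Thrift client is waiting for the reply to `get_authz_updates`, the reply does not arrive within the socket timeout, and the exception is wrapped first in `TTransportException` and then in `SentryHdfsServiceException`.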
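Start with the entry point. On any Hadoop cluster, uncheck the Sentry option and look at which configuration changes, as shown in the figure:
Find the start method of this class: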
public void start() {
  if (started) {
    throw new IllegalStateException("Provider already started");
  }
  started = true;
  try {
    if (!conf.getBoolean(DFSConfigKeys.DFS_NAMENODE_ACLS_ENABLED_KEY,
        false)) {
      throw new RuntimeException("HDFS ACLs must be enabled");
    }
    Configuration conf = new Configuration(this.conf);
    conf.addResource(SentryAuthorizationConstants.CONFIG_FILE, true);
    user = conf.get(SentryAuthorizationConstants.HDFS_USER_KEY,
        SentryAuthorizationConstants.HDFS_USER_DEFAULT);
    group = conf.get(SentryAuthorizationConstants.HDFS_GROUP_KEY,
        SentryAuthorizationConstants.HDFS_GROUP_DEFAULT);
    permission = FsPermission.createImmutable(
        (short) conf.getLong(SentryAuthorizationConstants
            .HDFS_PERMISSION_KEY,
            SentryAuthorizationConstants.HDFS_PERMISSION_DEFAULT)
    );
    originalAuthzAsAcl = conf.getBoolean(
        SentryAuthorizationConstants.INCLUDE_HDFS_AUTHZ_AS_ACL_KEY,
        SentryAuthorizationConstants.INCLUDE_HDFS_AUTHZ_AS_ACL_DEFAULT);
    LOG.info("Starting");
    LOG.info("Config: hdfs-user[{}] hdfs-group[{}] hdfs-permission[{}] " +
        "include-hdfs-authz-as-acl[{}]", new Object[]
        {user, group, permission, originalAuthzAsAcl});
    if (authzInfo == null) {
      authzInfo = new SentryAuthorizationInfo(conf);
    }
    authzInfo.start