I. Problem description

After upgrading the platform store, the service failed to start. The error log is as follows:

Caused by: com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: 网络通信异常
	at com.zaxxer.hikari.pool.HikariPool.throwPoolInitializationException(HikariPool.java:576)
	at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:562)
	at com.zaxxer.hikari.pool.HikariPool.<init>(HikariPool.java:115)
	at com.zaxxer.hikari.HikariDataSource.<init>(HikariDataSource.java:81)
	at com.baomidou.dynamic.datasource.creator.HikariDataSourceCreator.createDataSource(HikariDataSourceCreator.java:90)
	at com.baomidou.dynamic.datasource.creator.DefaultDataSourceCreator.createDataSource(DefaultDataSourceCreator.java:68)
	at com.baomidou.dynamic.datasource.provider.AbstractDataSourceProvider.createDataSourceMap(AbstractDataSourceProvider.java:44)
	at com.baomidou.dynamic.datasource.provider.YmlDynamicDataSourceProvider.loadDataSources(YmlDynamicDataSourceProvider.java:42)
	at com.baomidou.dynamic.datasource.DynamicRoutingDataSource.afterPropertiesSet(DynamicRoutingDataSource.java:229)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1837)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1774)
	... 121 common frames omitted
Caused by: dm.jdbc.driver.DMException: 网络通信异常 Communication error
	at dm.jdbc.driver.DBError.throwException(DBError.java:683)
	at dm.jdbc.c.a.a(DBAccess.java:764)
	at dm.jdbc.c.a.r(DBAccess.java:143)
	at dm.jdbc.driver.DmdbConnection.openConnection(DmdbConnection.java:660)
	at dm.jdbc.driver.DmDriver.do_connect(DmDriver.java:183)
	at dm.jdbc.driver.DmDriver.connect(DmDriver.java:458)
	at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:136)
	at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:369)
	at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:198)
	at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:467)
	at com.zaxxer.hikari.pool.HikariPool.checkFailFast(HikariPool.java:541)
	... 130 common frames omitted
Caused by: java.io.IOException: 你的主机中的软件中止了一个已建立的连接。
	at sun.nio.ch.SocketDispatcher.read0(Native Method)
	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:43)
	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
	at sun.nio.ch.IOUtil.read(IOUtil.java:197)
	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
	at dm.jdbc.util.Buffer$Node.load(Buffer.java:1181)
	at dm.jdbc.util.Buffer$Node.access$6(Buffer.java:1164)
	at dm.jdbc.util.Buffer.load(Buffer.java:351)
	at dm.jdbc.c.a.c(DBAccess.java:855)
	at dm.jdbc.c.a.a(DBAccess.java:757)
	... 139 common frames omitted

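This happened in an on-site multi-tenant environment and was intermittent: some tenants upgraded and started successfully, while others failed to start.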

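A communication error is hard to pin down, especially with a less commonly used database such as Dameng (DM). But once we had confirmed that the Dameng service was running normally and that other tenants had upgraded successfully, the list of suspects became much shorter: a database timeout, or connections being refused because too many were already open.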

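Some background on this upgrade: the store added support for the Shentong (Oscar) and Kingbase databases, and switched to the mybatis-plus dynamic data source.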


    // mybatis-plus
    implementation group: 'com.baomidou', name: 'mybatis-plus-boot-starter', version: '3.4.3.1'
    // multiple data sources
    implementation group: 'com.baomidou', name: 'dynamic-datasource-spring-boot-starter', version: '3.4.0'

II. Troubleshooting

1. Change the data source timeout settings:

Here is the current configuration first. The data source is configured roughly as follows:

spring:
  application:
    name: store
  # profiles:
  #   include: ca
  datasource:
    type: com.zaxxer.hikari.HikariDataSource
    hikari:
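      # minimum number of idle connections
      minimum-idle: 5
      # connections returned from the pool auto-commit by default
      auto-commit: true
      # maximum time a connection may sit idle, 10 seconds
      idle-timeout: 10000
      # connection pool name
      pool-name: StoreHikariCP
      # maximum lifetime of a connection in the pool
      max-lifetime: 1800000
      # database connection timeout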
      connection-timeout: 30000

    # old single data source configuration
#    url: jdbc:dm://localhost
#    username: LALA
#    password: LALALALA123
#    driver-class-name: dm.jdbc.driver.DmDriver

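    # new multi data source configuration
    dynamic:
      primary: dm # default data source or data source group; the default value is master
      strict: true # strict data source matching, default false. true: throw an exception when the requested data source is not found; false: fall back to the default data source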
      datasource:
        dm:
          url: jdbc:dm://localhost
          username: LALA
          password: LALALALA123
          driver-class-name: dm.jdbc.driver.DmDriver
#        kingbasees:
#          url: jdbc:kingbase8://localhost:54321/springbootv2
#          username: LALA
#          password: LALALALA123
#          driver-class-name: com.kingbase8.Driver
#        oscar:
#          url: jdbc:oscar://localhost:2003/OSRDB?serverTimezone=UTC&useSSL=FALSE
#          username: LALA
#          password: LALALALA123
#          driver-class-name: com.oscar.Driver
#        mysql:
#          url: jdbc:mysql://localhost:3306/am?characterEncoding=UTF-8&serverTimezone=UTC
#          username: LALA
#          password: LALALALA123
#          driver-class-name: com.mysql.cj.jdbc.Driver

Doubling the connection and timeout values above did not solve the problem.

2. Check the number of database connections

Dameng is the default database, so check its configuration file, dm.ini:


#IO
		DIRECT_IO                       =  0                    #Flag For Io Mode(Non-Windows Only), 0: Using File System Cache; 1: Without Using File System Cache
		IO_THR_GROUPS                   =  2                    #The Number Of Io Thread Groups(Non-Windows Only)
		HIO_THR_GROUPS                  =  2                    #The Number Of Huge Io Thread Groups(Non-Windows Only)

#database
		MAX_SESSIONS                    =  100                   #Maximum number of concurrent sessions
		MAX_CONCURRENT_TRX              =  0                    #Maximum number of concurrent transactions
		MAX_SESSION_STATEMENT           =  20000                #Maximum number of statement handle of one session
		MAX_CONCURRENT_OLAP_QUERY       =  0                    #Maximum number of concurrent OLAP queries
		BIG_TABLE_THRESHHOLD            =  1000                 #Threshhold value of a big table in 10k
		MAX_EP_SITES                    =  64                   #Maximum number of EP sites for MPP
		PORT_NUM                        =  5236                 #Number Of Database Server Listening Port

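Note that MAX_SESSIONS above, the maximum number of sessions, defaults to 100. Looking back at the application's data source configuration, the minimum number of idle connections is 5.

With 15 tenants, 15 * 5 = 75 sessions, which is below the limit, so the connection count should not be the problem. To make sure the connection count really is normal, run this SQL: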

select * from v$sessions

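The result was a surprise: there were 10 sessions instead of the expected 5. In other words, the hikari settings configured earlier had not taken effect. At 10 sessions per tenant, 15 tenants need 15 * 10 = 150 sessions, which exceeds the MAX_SESSIONS limit of 100 and is consistent with only some tenants being able to start.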
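The session arithmetic behind this finding can be sketched in a few lines of Java. The class and method names are hypothetical and not part of the project; the numbers (15 tenants, MAX_SESSIONS = 100, 5 expected versus 10 observed connections per tenant) come from the text above. The sketch assumes every tenant holds a full pool and that nothing else is connected to the database.

```java
// Hypothetical helper, not part of the project: compares the sessions that
// all tenant pools would hold against the server-side MAX_SESSIONS limit.
public class SessionBudget {

    // Total sessions if every tenant keeps `connectionsPerTenant` connections open.
    public static int requiredSessions(int tenants, int connectionsPerTenant) {
        return Math.multiplyExact(tenants, connectionsPerTenant);
    }

    // True when the required sessions stay within the server limit.
    public static boolean fits(int tenants, int connectionsPerTenant, int maxSessions) {
        return requiredSessions(tenants, connectionsPerTenant) <= maxSessions;
    }

    public static void main(String[] args) {
        int tenants = 15;
        int maxSessions = 100; // MAX_SESSIONS from dm.ini
        // 5 is the expected pool size, 10 is what v$sessions actually showed.
        for (int perTenant : new int[] {5, 10}) {
            System.out.printf("%d tenants x %d = %d sessions, fits in %d: %b%n",
                    tenants, perTenant, requiredSessions(tenants, perTenant),
                    maxSessions, fits(tenants, perTenant, maxSessions));
        }
    }
}
```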

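The connection pool section of the official documentation shows that the hikari settings need to be placed under spring.datasource.dynamic, so add the configuration there: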

spring:
  application:
    name: store
  # profiles:
  #   include: ca
  datasource:
### some parts omitted

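    # new multi data source configuration
    dynamic:
      hikari:  # global HikariCP parameters; all values match the defaults. (The currently supported parameters are listed below; do not set ones whose meaning you do not understand.)
        max-pool-size: 5
        min-idle: 5
      primary: dm # default data source or data source group; the default value is master
      strict: true # strict data source matching, default false. true: throw an exception when the requested data source is not found; false: fall back to the default data source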
      datasource:
        dm:
          url: jdbc:dm://localhost
          username: LALA
          password: LALALALA123
          driver-class-name: dm.jdbc.driver.DmDriver
### some parts omitted

Note: by default, both the maximum pool size and the minimum idle count are 10.
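To make the note concrete, here is a simplified model of how the pool-size defaults resolve when a setting never reaches the pool. It is not HikariCP code: the class and method names are hypothetical, and the logic only mirrors the behaviour described above, where an unset maximum pool size becomes 10 and an unset (or too large) minimum idle count follows the maximum pool size.

```java
// Simplified, hypothetical model of pool-size default resolution.
// Not HikariCP code; it only mirrors the rule that the maximum pool size
// defaults to 10 and the minimum idle count defaults to the maximum pool size.
public class PoolDefaults {

    static final int DEFAULT_POOL_SIZE = 10;

    // Returns {maximumPoolSize, minimumIdle}; null means "not configured".
    public static int[] effective(Integer maxPoolSize, Integer minIdle) {
        int max = (maxPoolSize == null || maxPoolSize < 1) ? DEFAULT_POOL_SIZE : maxPoolSize;
        int idle = (minIdle == null || minIdle < 0 || minIdle > max) ? max : minIdle;
        return new int[] {max, idle};
    }

    public static void main(String[] args) {
        int[] unset = effective(null, null); // nothing reached the pool
        int[] tuned = effective(5, 5);       // values under spring.datasource.dynamic.hikari
        System.out.println("unset -> max=" + unset[0] + ", minIdle=" + unset[1]);
        System.out.println("tuned -> max=" + tuned[0] + ", minIdle=" + tuned[1]);
    }
}
```

In this model the unset case yields 10 and 10, which matches the 10 sessions seen in v$sessions before the fix.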

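After changing the configuration and restarting, querying again showed that the configuration had taken effect: the number of idle connections was now 5.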

I briefly traced the source code.

The configuration class DynamicDataSourceAutoConfiguration:
@Slf4j
@Configuration
@EnableConfigurationProperties(DynamicDataSourceProperties.class)
@AutoConfigureBefore(value = DataSourceAutoConfiguration.class, name = "com.alibaba.druid.spring.boot.autoconfigure.DruidDataSourceAutoConfigure")
@Import(value = {DruidDynamicDataSourceConfiguration.class, DynamicDataSourceCreatorAutoConfiguration.class, DynamicDataSourceHealthCheckConfiguration.class})
@ConditionalOnProperty(prefix = DynamicDataSourceProperties.PREFIX, name = "enabled", havingValue = "true", matchIfMissing = true)
public class DynamicDataSourceAutoConfiguration implements InitializingBean {

    private final DynamicDataSourceProperties properties;

    private final List<DynamicDataSourcePropertiesCustomizer> dataSourcePropertiesCustomizers;

    public DynamicDataSourceAutoConfiguration(
            DynamicDataSourceProperties properties,
            ObjectProvider<List<DynamicDataSourcePropertiesCustomizer>> dataSourcePropertiesCustomizers) {
        this.properties = properties;
        this.dataSourcePropertiesCustomizers = dataSourcePropertiesCustomizers.getIfAvailable();
    }

    @Bean
    public DynamicDataSourceProvider ymlDynamicDataSourceProvider() {
        return new YmlDynamicDataSourceProvider(properties.getDatasource());
    }

    @Bean
    @ConditionalOnMissingBean
    public DataSource dataSource() {
        DynamicRoutingDataSource dataSource = new DynamicRoutingDataSource();
        dataSource.setPrimary(properties.getPrimary());
        dataSource.setStrict(properties.getStrict());
        dataSource.setStrategy(properties.getStrategy());
        dataSource.setP6spy(properties.getP6spy());
        dataSource.setSeata(properties.getSeata());
        return dataSource;
    }
}

Because of the @AutoConfigureBefore annotation, the DataSource @Bean defined here is created before the old DataSourceAutoConfiguration is processed.

The old DataSourceAutoConfiguration configuration class:
@Configuration
@ConditionalOnClass({ DataSource.class, EmbeddedDatabaseType.class })
@EnableConfigurationProperties(DataSourceProperties.class)
@Import({ DataSourcePoolMetadataProvidersConfiguration.class, DataSourceInitializationConfiguration.class })
public class DataSourceAutoConfiguration {

	@Configuration
	@Conditional(EmbeddedDatabaseCondition.class)
	@ConditionalOnMissingBean({ DataSource.class, XADataSource.class })
	@Import(EmbeddedDataSourceConfiguration.class)
	protected static class EmbeddedDatabaseConfiguration {

	}

	@Configuration
	@Conditional(PooledDataSourceCondition.class)
	@ConditionalOnMissingBean({ DataSource.class, XADataSource.class })
	@Import({ DataSourceConfiguration.Hikari.class, DataSourceConfiguration.Tomcat.class,
			DataSourceConfiguration.Dbcp2.class, DataSourceConfiguration.Generic.class,
			DataSourceJmxConfiguration.class })
	protected static class PooledDataSourceConfiguration {

	}
}

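Both nested configurations are guarded by @ConditionalOnMissingBean({ DataSource.class, XADataSource.class }), so once the dynamic DataSource bean exists they are skipped. As a result, the hikari settings configured earlier under spring.datasource.hikari do not take effect!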
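The back-off can be illustrated with a toy model in plain Java. This is not Spring code: the registry class and its methods are hypothetical stand-ins, and the only behaviour modelled is "register a bean only if none of that type exists yet", applied in the order that @AutoConfigureBefore imposes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of auto-configuration ordering; this is not Spring code.
public class BackOffDemo {

    // Minimal stand-in for a bean registry: bean type name -> configuration that created it.
    private final Map<String, String> beans = new LinkedHashMap<>();

    // Analogue of @ConditionalOnMissingBean: registers only when no bean of
    // that type exists yet. Returns true if this call created the bean.
    public boolean registerIfMissing(String type, String origin) {
        return beans.putIfAbsent(type, origin) == null;
    }

    public String originOf(String type) {
        return beans.get(type);
    }

    public static void main(String[] args) {
        BackOffDemo registry = new BackOffDemo();
        // Processed first because of @AutoConfigureBefore.
        registry.registerIfMissing("DataSource", "DynamicDataSourceAutoConfiguration");
        // Processed second: a DataSource already exists, so the pooled
        // configuration (and with it spring.datasource.hikari) is skipped.
        boolean pooledConfigApplied =
                registry.registerIfMissing("DataSource", "DataSourceAutoConfiguration");
        System.out.println("DataSource created by: " + registry.originOf("DataSource"));
        System.out.println("spring.datasource.hikari applied: " + pooledConfigApplied);
    }
}
```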

III. Resolution

1. Reduce the connection pool size in the application's data source configuration.

2. Increase the maximum number of connections in the database server configuration.
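Either option comes down to the same inequality: tenants * pool size must not exceed MAX_SESSIONS. A minimal sketch for sizing both sides follows. The class, the method names, and the `reserved` headroom parameter (sessions kept free for admin tools and similar) are assumptions for illustration; only the figures 15, 10 and 100 come from this article.

```java
// Hypothetical sizing helper for the two options above.
public class PoolSizing {

    // Option 1: largest per-tenant pool size that still fits under the server limit.
    public static int maxPoolSizePerTenant(int maxSessions, int reserved, int tenants) {
        if (tenants <= 0) {
            throw new IllegalArgumentException("tenants must be positive");
        }
        return Math.max(0, (maxSessions - reserved) / tenants);
    }

    // Option 2: smallest MAX_SESSIONS that lets every tenant keep its full pool.
    public static int minSessionsNeeded(int tenants, int poolSize, int reserved) {
        return Math.addExact(Math.multiplyExact(tenants, poolSize), reserved);
    }

    public static void main(String[] args) {
        // 15 tenants against MAX_SESSIONS = 100, no headroom reserved.
        System.out.println("max pool size per tenant: " + maxPoolSizePerTenant(100, 0, 15));
        // Keeping the default pool size of 10 instead.
        System.out.println("MAX_SESSIONS needed: " + minSessionsNeeded(15, 10, 0));
    }
}
```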

IV. Reflection

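I changed the configuration blindly and never verified that it had taken effect, which was careless. When adopting a new configuration, still verify whether the old settings remain in effect, and ideally run several rounds of evaluation testing so that everything is in place before releasing.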
