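To build a SkyWalking cluster environment, follow these steps:

Step 1: set up an Elasticsearch cluster.
Step 2: set up a registry cluster. SkyWalking currently supports Zookeeper, Kubernetes, Consul, and Nacos as the registry.
Step 3: set up a cluster of SkyWalking OAP servers and register each OAP server with the registry.
Step 4: start a Spring Boot application with the SkyWalking Agent attached. When setting the agent's SW_AGENT_COLLECTOR_BACKEND_SERVICES address, list the addresses of all SkyWalking OAP servers, separated by commas.
Step 5: set up the SkyWalking UI service. Its OAP address setting can likewise list several SkyWalking OAP servers. In the 8.7.0 webapp.yml shown later this is the oap-service instance list; older releases used collector.ribbon.listOfServers.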

Software versions

  • apache-skywalking-apm-8.7.0.tar.gz
  • apache-zookeeper-3.6.1-bin.tar.gz
  • elasticsearch-6.5.4.tar.gz
  • jdk-8u202-linux-x64.tar.gz

Download links

https://archive.apache.org/dist/skywalking/8.7.0/apache-skywalking-apm-8.7.0.tar.gz
https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.tar.gz
http://archive.apache.org/dist/zookeeper/zookeeper-3.6.1/apache-zookeeper-3.6.1-bin.tar.gz

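Purpose of each SkyWalking port (as set in the configuration files below):

  • 5000: SkyWalking web UI
  • 5006: SkyWalking OAP gRPC port, used by agents to report data
  • 12800: SkyWalking OAP REST port, used by the web UI for queries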

Deployment steps

JDK installation

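# Deploy jdk-8u202-linux-x64.tar.gz
$ tar zxf jdk-8u202-linux-x64.tar.gz -C /opt  # unpack the JDK archive
# Add environment variables ('EOF' is quoted so the variables are written literally instead of being expanded now)
$ cat << 'EOF' >> /etc/profile
export JAVA_HOME=/opt/jdk1.8.0_202
export JRE_HOME=/opt/jdk1.8.0_202/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
EOF
$ source /etc/profile
$ java -version   # output like the following means Java is installed (vendor and build strings vary)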
openjdk version "1.8.0_202"
OpenJDK Runtime Environment (build 1.8.0_262-b10)
OpenJDK 64-Bit Server VM (build 25.262-b10, mixed mode)

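ES installation

ZooKeeper cluster

Deploy ZooKeeper on each of the three machines. The steps are the same on every node except for the myid value, which must match that node's server.N entry (1, 2, or 3). Change the server.1/2/3 IP addresses to those of your own hosts.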

wget http://archive.apache.org/dist/zookeeper/zookeeper-3.6.1/apache-zookeeper-3.6.1-bin.tar.gz
tar zxf apache-zookeeper-3.6.1-bin.tar.gz -C /opt
mv /opt/apache-zookeeper-3.6.1-bin /opt/zookeeper

useradd -s /bin/bash -U zookeeper
mkdir -p /opt/data/zookeeper-data/
chown -Rf zookeeper:zookeeper /opt/data/zookeeper-data/
chown -Rf zookeeper:zookeeper /opt/zookeeper

# myid must match this node's server.N entry in zoo.cfg: 1, 2, or 3
echo '1' > /opt/data/zookeeper-data/myid
 
cat << EOF > /opt/zookeeper/conf/zoo.cfg
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/opt/data/zookeeper-data
dataLogDir=/opt/zookeeper/logs
clientPort=2181
maxClientCnxns=60
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
server.1=10.0.33.193:2888:3888
server.2=10.0.33.194:2888:3888
server.3=10.0.33.195:2888:3888
EOF
 
cat << EOF > /usr/lib/systemd/system/zookeeper.service
[Unit]
Description=zookeeper.service
After=network.target

[Service]
User=zookeeper
Group=zookeeper
Type=forking
# systemd does not read /etc/profile, so pass JAVA_HOME explicitly
Environment=JAVA_HOME=/opt/jdk1.8.0_202
ExecStart=/opt/zookeeper/bin/zkServer.sh start
ExecStop=/opt/zookeeper/bin/zkServer.sh stop
ExecReload=/opt/zookeeper/bin/zkServer.sh restart

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl start zookeeper.service
systemctl enable zookeeper.service
systemctl status zookeeper.service
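
The only per-node differences in the ZooKeeper steps are the myid value and the matching server.N line, so both can be derived from one host list instead of being typed by hand on each machine. The following is a minimal sketch under that assumption. The ZK_HOSTS variable and the two helper functions are hypothetical names introduced here for illustration, not part of ZooKeeper.

```shell
# Hypothetical helpers: derive the server.N lines and the myid from one host list.
ZK_HOSTS="10.0.33.193 10.0.33.194 10.0.33.195"

# Print the server.N lines for zoo.cfg, numbering hosts from 1.
zk_server_lines() {
  n=1
  for host in $ZK_HOSTS; do
    echo "server.$n=$host:2888:3888"
    n=$((n + 1))
  done
}

# Print the myid for the given host, or return 1 if the host is not in the list.
zk_myid_for() {
  n=1
  for host in $ZK_HOSTS; do
    if [ "$host" = "$1" ]; then
      echo "$n"
      return 0
    fi
    n=$((n + 1))
  done
  return 1
}
```

On the second node, for example, `zk_myid_for 10.0.33.194 > /opt/data/zookeeper-data/myid` would write 2, and `zk_server_lines` prints the three server entries to paste into zoo.cfg.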

Installing SkyWalking

(Before starting SkyWalking, make sure the Elasticsearch cluster and the ZooKeeper cluster are both up and running.)

  • Backend (also called OAP; covers the core, cluster, storage, and query configuration)
  • UI (also called webapp)

Deploy on all three machines:
wget https://archive.apache.org/dist/skywalking/8.7.0/apache-skywalking-apm-8.7.0.tar.gz
tar -zxvf apache-skywalking-apm-8.7.0.tar.gz -C /opt
vim /opt/apache-skywalking-apm-bin/config/application.yml
cluster:
  zookeeper:      # use ZooKeeper for cluster coordination; standalone would mean a single node
    nameSpace: ${SW_NAMESPACE:""}
    hostPort: ${SW_CLUSTER_ZK_HOST_PORT:10.0.33.193:2181,10.0.33.194:2181,10.0.33.195:2181}
    baseSleepTimeMs: ${SW_CLUSTER_ZK_SLEEP_TIME:1000} # initial amount of time to wait between retries
    maxRetries: ${SW_CLUSTER_ZK_MAX_RETRIES:3} # max number of times to retry
    enableACL: ${SW_ZK_ENABLE_ACL:false} # disable ACL in default
    schema: ${SW_ZK_SCHEMA:digest} # only support digest schema
    expression: ${SW_ZK_EXPRESSION:skywalking:skywalking}
core:
  selector: ${SW_CORE:default}
  default:
    role: ${SW_CORE_ROLE:Mixed} # Mixed/Receiver/Aggregator
    restHost: ${SW_CORE_REST_HOST:0.0.0.0}
    restPort: ${SW_CORE_REST_PORT:12800}
    restContextPath: ${SW_CORE_REST_CONTEXT_PATH:/}
    restMinThreads: ${SW_CORE_REST_JETTY_MIN_THREADS:1}
    restMaxThreads: ${SW_CORE_REST_JETTY_MAX_THREADS:200}
    restIdleTimeOut: ${SW_CORE_REST_JETTY_IDLE_TIMEOUT:30000}
    restAcceptorPriorityDelta: ${SW_CORE_REST_JETTY_DELTA:0}
    restAcceptQueueSize: ${SW_CORE_REST_JETTY_QUEUE_SIZE:0}
    httpMaxRequestHeaderSize: ${SW_CORE_HTTP_MAX_REQUEST_HEADER_SIZE:8192}
    gRPCHost: ${SW_CORE_GRPC_HOST:0.0.0.0}
    gRPCPort: ${SW_CORE_GRPC_PORT:5006}
    maxConcurrentCallsPerConnection: ${SW_CORE_GRPC_MAX_CONCURRENT_CALL:0}
    maxMessageSize: ${SW_CORE_GRPC_MAX_MESSAGE_SIZE:0}
    gRPCThreadPoolQueueSize: ${SW_CORE_GRPC_POOL_QUEUE_SIZE:-1}
    gRPCThreadPoolSize: ${SW_CORE_GRPC_THREAD_POOL_SIZE:-1}
    gRPCSslEnabled: ${SW_CORE_GRPC_SSL_ENABLED:false}
    gRPCSslKeyPath: ${SW_CORE_GRPC_SSL_KEY_PATH:""}
    gRPCSslCertChainPath: ${SW_CORE_GRPC_SSL_CERT_CHAIN_PATH:""}
    gRPCSslTrustedCAPath: ${SW_CORE_GRPC_SSL_TRUSTED_CA_PATH:""}
    downsampling:
      - Hour
      - Day
    enableDataKeeperExecutor: ${SW_CORE_ENABLE_DATA_KEEPER_EXECUTOR:true}
    dataKeeperExecutePeriod: ${SW_CORE_DATA_KEEPER_EXECUTE_PERIOD:5}
    recordDataTTL: ${SW_CORE_RECORD_DATA_TTL:3} # Unit is day
    metricsDataTTL: ${SW_CORE_METRICS_DATA_TTL:7} # Unit is day
    l1FlushPeriod: ${SW_CORE_L1_AGGREGATION_FLUSH_PERIOD:500}
    storageSessionTimeout: ${SW_CORE_STORAGE_SESSION_TIMEOUT:70000}
    enableDatabaseSession: ${SW_CORE_ENABLE_DATABASE_SESSION:true}
    topNReportPeriod: ${SW_CORE_TOPN_REPORT_PERIOD:10} # top_n record worker report cycle, unit is minute
    activeExtraModelColumns: ${SW_CORE_ACTIVE_EXTRA_MODEL_COLUMNS:false}
    serviceNameMaxLength: ${SW_SERVICE_NAME_MAX_LENGTH:70}
    instanceNameMaxLength: ${SW_INSTANCE_NAME_MAX_LENGTH:70}
    endpointNameMaxLength: ${SW_ENDPOINT_NAME_MAX_LENGTH:150}
    searchableTracesTags: ${SW_SEARCHABLE_TAG_KEYS:http.method,status_code,db.type,db.instance,mq.queue,mq.topic,mq.broker}
    searchableLogsTags: ${SW_SEARCHABLE_LOGS_TAG_KEYS:level}
    searchableAlarmTags: ${SW_SEARCHABLE_ALARM_TAG_KEYS:level}
    prepareThreads: ${SW_CORE_PREPARE_THREADS:2}
    enableEndpointNameGroupingByOpenapi: ${SW_CORE_ENABLE_ENDPOINT_NAME_GROUPING_BY_OPAENAPI:true}
storage:
  elasticsearch:
    nameSpace: ${SW_NAMESPACE:"sk-es-new"}
    clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:10.0.63.22:5010,10.0.63.23:5011,10.0.63.24:5012}
    protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:"http"}
    connectTimeout: ${SW_STORAGE_ES_CONNECT_TIMEOUT:500}
    socketTimeout: ${SW_STORAGE_ES_SOCKET_TIMEOUT:30000}
    user: ${SW_ES_USER:""}
    password: ${SW_ES_PASSWORD:""}
    trustStorePath: ${SW_STORAGE_ES_SSL_JKS_PATH:""}
    trustStorePass: ${SW_STORAGE_ES_SSL_JKS_PASS:""}
    secretsManagementFile: ${SW_ES_SECRETS_MANAGEMENT_FILE:""}
    dayStep: ${SW_STORAGE_DAY_STEP:1} # Represent the number of days in the one minute/hour/day index.
    indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:1} # Shard number of new indexes
    indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:1} # Replicas number of new indexes
    superDatasetDayStep: ${SW_SUPERDATASET_STORAGE_DAY_STEP:-1}
    superDatasetIndexShardsFactor: ${SW_STORAGE_ES_SUPER_DATASET_INDEX_SHARDS_FACTOR:5}
    superDatasetIndexReplicasNumber: ${SW_STORAGE_ES_SUPER_DATASET_INDEX_REPLICAS_NUMBER:0}
    indexTemplateOrder: ${SW_STORAGE_ES_INDEX_TEMPLATE_ORDER:0} # the order of index template
    bulkActions: ${SW_STORAGE_ES_BULK_ACTIONS:5000}
    flushInterval: ${SW_STORAGE_ES_FLUSH_INTERVAL:15}
    concurrentRequests: ${SW_STORAGE_ES_CONCURRENT_REQUESTS:2} # the number of concurrent requests
    resultWindowMaxSize: ${SW_STORAGE_ES_QUERY_MAX_WINDOW_SIZE:10000}
    metadataQueryMaxSize: ${SW_STORAGE_ES_QUERY_MAX_SIZE:5000}
    segmentQueryMaxSize: ${SW_STORAGE_ES_QUERY_SEGMENT_SIZE:200}
    profileTaskQueryMaxSize: ${SW_STORAGE_ES_QUERY_PROFILE_TASK_SIZE:200}
    oapAnalyzer: ${SW_STORAGE_ES_OAP_ANALYZER:"{\"analyzer\":{\"oap_analyzer\":{\"type\":\"stop\"}}}"} # the oap analyzer.
    oapLogAnalyzer: ${SW_STORAGE_ES_OAP_LOG_ANALYZER:"{\"analyzer\":{\"oap_log_analyzer\":{\"type\":\"standard\"}}}"} # the oap log analyzer. It could be customized by the ES analyzer configuration to support more language log formats, such as Chinese log, Japanese log and etc.
    advanced: ${SW_STORAGE_ES_ADVANCED:""}
agent-analyzer:
  selector: ${SW_AGENT_ANALYZER:default}
  default:
    sampleRate: ${SW_TRACE_SAMPLE_RATE:10000}
    slowDBAccessThreshold: ${SW_SLOW_DB_THRESHOLD:default:200,mongodb:100}
    forceSampleErrorSegment: ${SW_FORCE_SAMPLE_ERROR_SEGMENT:true}
    segmentStatusAnalysisStrategy: ${SW_SEGMENT_STATUS_ANALYSIS_STRATEGY:FROM_SPAN_STATUS}
    noUpstreamRealAddressAgents: ${SW_NO_UPSTREAM_REAL_ADDRESS:6000,9000}
    slowTraceSegmentThreshold: ${SW_SLOW_TRACE_SEGMENT_THRESHOLD:-1}
    meterAnalyzerActiveFiles: ${SW_METER_ANALYZER_ACTIVE_FILES:}
log-analyzer:
  selector: ${SW_LOG_ANALYZER:default}
  default:
    lalFiles: ${SW_LOG_LAL_FILES:default}
    malFiles: ${SW_LOG_MAL_FILES:""}
event-analyzer:
  selector: ${SW_EVENT_ANALYZER:default}
  default:
receiver-sharing-server:
  selector: ${SW_RECEIVER_SHARING_SERVER:default}
  default:
    # For Jetty server
    restHost: ${SW_RECEIVER_SHARING_REST_HOST:0.0.0.0}
    restPort: ${SW_RECEIVER_SHARING_REST_PORT:0}
    restContextPath: ${SW_RECEIVER_SHARING_REST_CONTEXT_PATH:/}
    restMinThreads: ${SW_RECEIVER_SHARING_JETTY_MIN_THREADS:1}
    restMaxThreads: ${SW_RECEIVER_SHARING_JETTY_MAX_THREADS:200}
    restIdleTimeOut: ${SW_RECEIVER_SHARING_JETTY_IDLE_TIMEOUT:30000}
    restAcceptorPriorityDelta: ${SW_RECEIVER_SHARING_JETTY_DELTA:0}
    restAcceptQueueSize: ${SW_RECEIVER_SHARING_JETTY_QUEUE_SIZE:0}
    httpMaxRequestHeaderSize: ${SW_RECEIVER_SHARING_HTTP_MAX_REQUEST_HEADER_SIZE:8192}
    # For gRPC server
    gRPCHost: ${SW_RECEIVER_GRPC_HOST:0.0.0.0}
    gRPCPort: ${SW_RECEIVER_GRPC_PORT:0}
    maxConcurrentCallsPerConnection: ${SW_RECEIVER_GRPC_MAX_CONCURRENT_CALL:0}
    maxMessageSize: ${SW_RECEIVER_GRPC_MAX_MESSAGE_SIZE:0}
    gRPCThreadPoolQueueSize: ${SW_RECEIVER_GRPC_POOL_QUEUE_SIZE:0}
    gRPCThreadPoolSize: ${SW_RECEIVER_GRPC_THREAD_POOL_SIZE:0}
    gRPCSslEnabled: ${SW_RECEIVER_GRPC_SSL_ENABLED:false}
    gRPCSslKeyPath: ${SW_RECEIVER_GRPC_SSL_KEY_PATH:""}
    gRPCSslCertChainPath: ${SW_RECEIVER_GRPC_SSL_CERT_CHAIN_PATH:""}
    authentication: ${SW_AUTHENTICATION:""}
receiver-register:
  selector: ${SW_RECEIVER_REGISTER:default}
  default:
receiver-trace:
  selector: ${SW_RECEIVER_TRACE:default}
  default:
receiver-jvm:
  selector: ${SW_RECEIVER_JVM:default}
  default:
receiver-clr:
  selector: ${SW_RECEIVER_CLR:default}
  default:
receiver-profile:
  selector: ${SW_RECEIVER_PROFILE:default}
  default:
receiver-zabbix:
  selector: ${SW_RECEIVER_ZABBIX:-}
  default:
    port: ${SW_RECEIVER_ZABBIX_PORT:10051}
    host: ${SW_RECEIVER_ZABBIX_HOST:0.0.0.0}
    activeFiles: ${SW_RECEIVER_ZABBIX_ACTIVE_FILES:agent}
service-mesh:
  selector: ${SW_SERVICE_MESH:default}
  default:
envoy-metric:
  selector: ${SW_ENVOY_METRIC:default}
  default:
    acceptMetricsService: ${SW_ENVOY_METRIC_SERVICE:true}
    alsHTTPAnalysis: ${SW_ENVOY_METRIC_ALS_HTTP_ANALYSIS:""}
    alsTCPAnalysis: ${SW_ENVOY_METRIC_ALS_TCP_ANALYSIS:""}
    k8sServiceNameRule: ${K8S_SERVICE_NAME_RULE:"${pod.metadata.labels.(service.istio.io/canonical-name)}"}
prometheus-fetcher:
  selector: ${SW_PROMETHEUS_FETCHER:-}
  default:
    enabledRules: ${SW_PROMETHEUS_FETCHER_ENABLED_RULES:"self"}
    maxConvertWorker: ${SW_PROMETHEUS_FETCHER_NUM_CONVERT_WORKER:-1}
kafka-fetcher:
  selector: ${SW_KAFKA_FETCHER:-}
  default:
    bootstrapServers: ${SW_KAFKA_FETCHER_SERVERS:localhost:9092}
    namespace: ${SW_NAMESPACE:""}
    partitions: ${SW_KAFKA_FETCHER_PARTITIONS:3}
    replicationFactor: ${SW_KAFKA_FETCHER_PARTITIONS_FACTOR:2}
    enableNativeProtoLog: ${SW_KAFKA_FETCHER_ENABLE_NATIVE_PROTO_LOG:false}
    enableNativeJsonLog: ${SW_KAFKA_FETCHER_ENABLE_NATIVE_JSON_LOG:false}
    isSharding: ${SW_KAFKA_FETCHER_IS_SHARDING:false}
    consumePartitions: ${SW_KAFKA_FETCHER_CONSUME_PARTITIONS:""}
    kafkaHandlerThreadPoolSize: ${SW_KAFKA_HANDLER_THREAD_POOL_SIZE:-1}
    kafkaHandlerThreadPoolQueueSize: ${SW_KAFKA_HANDLER_THREAD_POOL_QUEUE_SIZE:-1}
receiver-meter:
  selector: ${SW_RECEIVER_METER:default}
  default:
receiver-otel:
  selector: ${SW_OTEL_RECEIVER:-}
  default:
    enabledHandlers: ${SW_OTEL_RECEIVER_ENABLED_HANDLERS:"oc"}
    enabledOcRules: ${SW_OTEL_RECEIVER_ENABLED_OC_RULES:"istio-controlplane"}
receiver_zipkin:
  selector: ${SW_RECEIVER_ZIPKIN:-}
  default:
    host: ${SW_RECEIVER_ZIPKIN_HOST:0.0.0.0}
    port: ${SW_RECEIVER_ZIPKIN_PORT:9411}
    contextPath: ${SW_RECEIVER_ZIPKIN_CONTEXT_PATH:/}
    jettyMinThreads: ${SW_RECEIVER_ZIPKIN_JETTY_MIN_THREADS:1}
    jettyMaxThreads: ${SW_RECEIVER_ZIPKIN_JETTY_MAX_THREADS:200}
    jettyIdleTimeOut: ${SW_RECEIVER_ZIPKIN_JETTY_IDLE_TIMEOUT:30000}
    jettyAcceptorPriorityDelta: ${SW_RECEIVER_ZIPKIN_JETTY_DELTA:0}
    jettyAcceptQueueSize: ${SW_RECEIVER_ZIPKIN_QUEUE_SIZE:0}
    instanceNameRule: ${SW_RECEIVER_ZIPKIN_INSTANCE_NAME_RULE:[spring.instance_id,node_id]}
receiver_jaeger:
  selector: ${SW_RECEIVER_JAEGER:-}
  default:
    gRPCHost: ${SW_RECEIVER_JAEGER_HOST:0.0.0.0}
    gRPCPort: ${SW_RECEIVER_JAEGER_PORT:14250}
receiver-browser:
  selector: ${SW_RECEIVER_BROWSER:default}
  default:
    sampleRate: ${SW_RECEIVER_BROWSER_SAMPLE_RATE:10000}
receiver-log:
  selector: ${SW_RECEIVER_LOG:default}
  default:
query:
  selector: ${SW_QUERY:graphql}
  graphql:
    path: ${SW_QUERY_GRAPHQL_PATH:/graphql}
alarm:
  selector: ${SW_ALARM:default}
  default:
telemetry:
  selector: ${SW_TELEMETRY:none}
  none:
  prometheus:
    host: ${SW_TELEMETRY_PROMETHEUS_HOST:0.0.0.0}
    port: ${SW_TELEMETRY_PROMETHEUS_PORT:1234}
    sslEnabled: ${SW_TELEMETRY_PROMETHEUS_SSL_ENABLED:false}
    sslKeyPath: ${SW_TELEMETRY_PROMETHEUS_SSL_KEY_PATH:""}
    sslCertChainPath: ${SW_TELEMETRY_PROMETHEUS_SSL_CERT_CHAIN_PATH:""}
configuration:
  zookeeper:    # use ZooKeeper as the configuration center (for high availability)
    period: ${SW_CONFIG_ZK_PERIOD:60} # Unit seconds, sync period. Default fetch every 60 seconds.
    nameSpace: ${SW_CONFIG_ZK_NAMESPACE:/default}
    hostPort: ${SW_CONFIG_ZK_HOST_PORT:localhost:2181}
    # Retry Policy
    baseSleepTimeMs: ${SW_CONFIG_ZK_BASE_SLEEP_TIME_MS:1000} # initial amount of time to wait between retries
    maxRetries: ${SW_CONFIG_ZK_MAX_RETRIES:3} # max number of times to retry
exporter:
  selector: ${SW_EXPORTER:-}
  grpc:
    targetHost: ${SW_EXPORTER_GRPC_HOST:127.0.0.1}
    targetPort: ${SW_EXPORTER_GRPC_PORT:9870}
health-checker:
  selector: ${SW_HEALTH_CHECKER:-}
  default:
    checkIntervalSeconds: ${SW_HEALTH_CHECKER_INTERVAL_SECONDS:5}
configuration-discovery:
  selector: ${SW_CONFIGURATION_DISCOVERY:default}
  default:
    disableMessageDigest: ${SW_DISABLE_MESSAGE_DIGEST:false}
receiver-event:
  selector: ${SW_RECEIVER_EVENT:default}
  default:
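
Every value in application.yml is written as ${ENV_NAME:default}, so the same settings can be supplied through environment variables instead of editing the file on each node. A minimal sketch, assuming the variables are exported in the same shell that later runs startup.sh and that the ZooKeeper clients listen on port 2181 as set by clientPort in zoo.cfg:

```shell
# Assumption: these exports happen in the shell that later launches startup.sh,
# so the OAP process inherits them and uses them in place of the YAML defaults.
export SW_CLUSTER_ZK_HOST_PORT="10.0.33.193:2181,10.0.33.194:2181,10.0.33.195:2181"
export SW_CORE_GRPC_PORT=5006
export SW_CORE_REST_PORT=12800
export SW_STORAGE_ES_CLUSTER_NODES="10.0.63.22:5010,10.0.63.23:5011,10.0.63.24:5012"

# Show what the OAP process would see.
env | grep '^SW_' | sort
```

Keeping the stock application.yml and overriding through the environment makes it easier to keep the three nodes identical, at the cost of having to manage the variables wherever the service is started.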

Modifying the SkyWalking web UI configuration file

cat > /opt/apache-skywalking-apm-bin/webapp/webapp.yml << EOF
server:
  port: 5000

spring:
  cloud:
    gateway:
      routes:
        - id: oap-route
          uri: lb://oap-service
          predicates:
            - Path=/graphql/**
    discovery:
      client:
        simple:
          instances:
            oap-service:
              - uri: http://10.0.33.193:12800
              - uri: http://10.0.33.194:12800
              - uri: http://10.0.33.195:12800

  mvc:
    throw-exception-if-no-handler-found: true

  web:
    resources:
      add-mappings: true

management:
  server:
    base-path: /manage
EOF

Startup

/opt/apache-skywalking-apm-bin/bin/startup.sh
SkyWalking OAP started successfully!
SkyWalking Web Application started successfully!
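
The startup messages are printed once the processes have been launched, so it can be worth confirming that the configured ports actually accept connections. The following is a rough sketch that uses bash's /dev/tcp pseudo-device. port_open and check_node are hypothetical helper names, and the port list (5000 for the UI, 5006 for gRPC, 12800 for REST) is taken from the configuration files above.

```shell
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# bash is invoked explicitly because /dev/tcp is a bash feature.
port_open() {
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Print one status line per port for the given host.
check_node() {
  local host=$1
  shift
  local port
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "$host:$port open"
    else
      echo "$host:$port closed"
    fi
  done
}

# Example, run from a machine that can reach the cluster:
#   check_node 10.0.33.193 5000 5006 12800
```

A port that stays closed for more than a minute or so after startup usually means the corresponding log under /opt/apache-skywalking-apm-bin/logs is the next place to look.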

Testing

java \
  -javaagent:/opt/apache-skywalking-apm-bin/agent/skywalking-agent.jar \
  -Dskywalking.agent.namespace=prod \
  -Dskywalking.agent.service_name=spring-boot-docker \
  -Dskywalking.collector.backend_service=10.0.33.193:5006,10.0.33.194:5006,10.0.33.195:5006 \
  -jar /root/spring-boot-docker/target/spring-boot-docker-0.1.0.jar

Client configuration
Add the following Java startup parameter on each microservice:
-Dskywalking.collector.backend_service=10.0.33.193:5006,10.0.33.194:5006,10.0.33.195:5006
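
The same agent settings can also be passed through environment variables, which avoids repeating the long address list in every start script. A minimal sketch, assuming the stock agent.config still maps SW_AGENT_NAME and SW_AGENT_COLLECTOR_BACKEND_SERVICES to the service name and backend address. OAP_HOSTS, OAP_GRPC_PORT, and backend_services are hypothetical names used only in this example.

```shell
# OAP gRPC endpoints of the cluster built above.
OAP_HOSTS="10.0.33.193 10.0.33.194 10.0.33.195"
OAP_GRPC_PORT=5006

# Join the hosts into the comma-separated host:port list the agent expects.
backend_services() {
  list=""
  for host in $OAP_HOSTS; do
    list="${list:+$list,}$host:$OAP_GRPC_PORT"
  done
  echo "$list"
}

export SW_AGENT_NAME=spring-boot-docker
export SW_AGENT_COLLECTOR_BACKEND_SERVICES="$(backend_services)"

# The application is then started with only the -javaagent option, for example:
#   java -javaagent:/opt/apache-skywalking-apm-bin/agent/skywalking-agent.jar \
#        -jar /root/spring-boot-docker/target/spring-boot-docker-0.1.0.jar
```

Settings that are not mapped to an environment variable in agent.config, such as the namespace used in the test command, would still need to be passed with -D.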
