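Setting up a highly available Mesos cluster required fairly large changes to the original project, and I also fixed some leftover bugs along the way.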
Machine environment
[all]
192.168.50.4
192.168.50.5
192.168.50.6
192.168.50.7
[master]
192.168.50.4
192.168.50.5
192.168.50.6
[slave]
192.168.50.4
192.168.50.5
192.168.50.6
192.168.50.7
ZooKeeper, mesos-master and Marathon are all deployed on the master hosts; mesos-slave and Docker are deployed on the slave hosts.
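The detailed setup steps are documented separately. The rest of this post covers the problems I ran into during the setup.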
Problem: every ZooKeeper node needs its own myid file, and both the file and its content have to be created dynamically. The question is how to use Ansible to extract the current host's id from zoo.cfg.
# Create the myid file
- name: Make id file
  file: path={{remote_dir}}/zookeeper/data/myid state=touch
# Get this host's IP; it is used to look up the id in zoo.cfg
# Note 1: eth1 is my network interface; yours may be named differently
# Note 2: for cut, -d sets the delimiter and -f selects which field to keep
- name: get ip
  shell: ip addr|grep eth1|grep inet|awk '{print $2}'| cut -d / -f 1
  register: local_ip
# Look up the id by IP
# Note: the trick is again the cut command
- name: get id
  shell: "grep {{local_ip['stdout']}} {{remote_dir}}/zookeeper/conf/zoo.cfg|cut -d \\= -f 1|cut -d \\. -f 2"
  register: myid
# For debugging; can be commented out
- name: echo
  debug: msg={{myid}}
# Write the id into the file
- name: write id
  lineinfile: path={{remote_dir}}/zookeeper/data/myid line={{myid['stdout']}}
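The extraction step can be tried outside Ansible. The following is a minimal sketch, assuming zoo.cfg uses the usual `server.N=host:port:port` layout. The sample file contents, the 2888:3888 ports, and the `zk_id_for_ip` helper name are illustrative assumptions and not part of the original project. It also anchors the match on `=` and `:`, which is slightly stricter than the plain grep in the task above.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the "get id" lookup against a throwaway zoo.cfg.

# Hypothetical helper: print the N of the "server.N=<ip>:..." line for an IP.
zk_id_for_ip() {
    local ip="$1" cfg="$2"
    # -F treats the dots in the IP literally; the "=" prefix and ":" suffix
    # keep 192.168.50.4 from also matching 192.168.50.40.
    grep -F "=${ip}:" "$cfg" | cut -d = -f 1 | cut -d . -f 2
}

# Assumed sample configuration, written to a temporary file.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
tickTime=2000
dataDir=/tmp/zookeeper/data
clientPort=2181
server.1=192.168.50.4:2888:3888
server.2=192.168.50.5:2888:3888
server.3=192.168.50.6:2888:3888
EOF

myid="$(zk_id_for_ip 192.168.50.5 "$cfg")"
echo "myid for 192.168.50.5: ${myid}"
rm -f "$cfg"
```

The first `cut` keeps everything before `=` (for example `server.2`), and the second keeps the part after the dot, which is the id.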
Problem: Mesos-master: Shutdown failed on fd=xx: Transport endpoint is not connected [107]
Solution: enable the advertise_ip option of Mesos.
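As a sketch of what this looks like on the command line, the snippet below assembles a mesos-master invocation for one of the masters listed earlier. The ZooKeeper client port 2181, the `/mesos` znode path, and the quorum of 2 (a majority of three masters) are assumptions. The command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Sketch: build (but do not run) a mesos-master command line with --advertise_ip.

masters=(192.168.50.4 192.168.50.5 192.168.50.6)
self_ip="192.168.50.4"   # address the other nodes use to reach this master

# Join the masters into a ZooKeeper URL; port 2181 and /mesos are assumed.
zk_hosts="$(printf '%s:2181,' "${masters[@]}")"
zk_url="zk://${zk_hosts%,}/mesos"

# A majority of the masters: 3 / 2 + 1 = 2 with integer division.
quorum=$(( ${#masters[@]} / 2 + 1 ))

cmd=(mesos-master
     "--hostname=${self_ip}"
     "--advertise_ip=${self_ip}"
     "--quorum=${quorum}"
     "--work_dir=/data/mesos/master"
     "--zk=${zk_url}")

# Print the assembled command on one line.
printf '%s ' "${cmd[@]}"; echo
```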
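Problem: only the leader's Marathon service is reachable on port 8080; port 8080 on every other machine returns 503.
Solution: pass the hostname option when starting Marathon. Only then can the non-leader nodes redirect requests to the leader.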
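One way to verify the fix is to ask every Marathon instance for its HTTP status. The sketch below assumes Marathon's REST endpoint `/v2/leader` is available on port 8080; the helper names `marathon_status` and `check_all` are hypothetical. With the hostname option set on each node, the non-leader instances should no longer answer with 503.

```shell
#!/usr/bin/env bash
# Sketch: print the HTTP status each Marathon instance returns on port 8080.

# Hypothetical helper: echo the HTTP status code for one host.
marathon_status() {
    local host="$1"
    # -s silences progress, -o discards the body, -w prints only the status code.
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
        "http://${host}:8080/v2/leader"
}

# Hypothetical helper: check every host given as an argument.
check_all() {
    local host
    for host in "$@"; do
        echo "${host}: $(marathon_status "$host")"
    done
}

# Usage (needs a running cluster):
#   check_all 192.168.50.4 192.168.50.5 192.168.50.6
```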
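To make starting and stopping Mesos and Marathon easier, I wrote a control script for each of them. They live in the repository alongside the ZooKeeper start script.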
mesos.sh
#!/usr/bin/env bash
# Start, stop, or restart mesos-master and mesos-agent.

MESOSBINDIR="$( cd "$( dirname "$0" )" && pwd )"
MASTER_WORK_DIR="/data/mesos/master"
MASTER_LOG_DIR="/data/mesos/master/log"
SLAVE_WORK_DIR="/data/mesos/slave"
SLAVE_LOG_DIR="/data/mesos/slave/log"
USAGE="hostname, advertise_ip, quorum and zk are required\n--hostname\n--advertise_ip\n--quorum\n--zk"

hostname=""
advertise_ip=""
quorum=""
zk=""
master=""

case "$1" in
    start_master )
        # $1 stays the subcommand; each option is read from $2/$3 and shifted away.
        while [[ -n "$2" ]]; do
            case "$2" in
                --hostname ) hostname=$3; shift 2;;
                --advertise_ip ) advertise_ip=$3; shift 2;;
                --quorum ) quorum=$3; shift 2;;
                --zk ) zk=$3; shift 2;;
                * ) break;;
            esac
        done
        if [[ -z "$advertise_ip" || -z "$hostname" || -z "$quorum" || -z "$zk" ]]; then
            echo "error options"
            exit 1
        fi
        echo -n "Starting mesos-master ..."
        nohup "${MESOSBINDIR}/mesos-master" "--hostname=$hostname" "--advertise_ip=$advertise_ip" \
            "--quorum=$quorum" "--work_dir=$MASTER_WORK_DIR" "--zk=$zk" "--log_dir=$MASTER_LOG_DIR" &
        echo "started"
        ;;
    stop_master )
        pid=$(ps -ef | grep mesos-master | grep -v "grep" | awk '{print $2}')
        if [[ -z "$pid" ]]; then
            echo "No mesos master server started"
            exit 0
        fi
        # $pid is left unquoted on purpose: it may hold several PIDs.
        kill -9 $pid
        echo "Mesos master server stopped"
        ;;
    restart_master )
        # Drop the subcommand so "$@" holds only the options.
        shift
        "$0" stop_master "$@"
        sleep 5
        "$0" start_master "$@"
        ;;
    start_slave )
        while [[ -n "$2" ]]; do
            case "$2" in
                --hostname ) hostname=$3; shift 2;;
                --advertise_ip ) advertise_ip=$3; shift 2;;
                --master ) master=$3; shift 2;;
                * ) break;;
            esac
        done
        if [[ -z "$advertise_ip" || -z "$hostname" || -z "$master" ]]; then
            echo "error options"
            exit 1
        fi
        echo -n "Starting mesos slave server ..."
        # The original passed SLAVE_WORK_DIR as the log directory; SLAVE_LOG_DIR is the intended value.
        nohup "${MESOSBINDIR}/mesos-agent" "--hostname=$hostname" "--advertise_ip=$advertise_ip" \
            "--work_dir=$SLAVE_WORK_DIR" "--master=$master" "--log_dir=$SLAVE_LOG_DIR" &
        echo "started"
        ;;
    stop_slave )
        pid=$(ps -ef | grep mesos-agent | grep -v "grep" | awk '{print $2}')
        if [[ -z "$pid" ]]; then
            echo "No mesos slave server started"
            exit 0
        fi
        kill -9 $pid
        echo "Mesos slave server stopped"
        ;;
    restart_slave )
        shift
        "$0" stop_slave "$@"
        sleep 5
        "$0" start_slave "$@"
        ;;
    * )
        echo -e "$USAGE"
        ;;
esac
marathon.sh
#!/usr/bin/env bash
# Start, stop, or restart Marathon.

MARATHONBINDIR="$( cd "$( dirname "$0" )" && pwd )"
USAGE="master, zk and hostname are required\n--master\n--zk\n--hostname\n--libmesos_path (optional)"

master=""
zk=""
libmesos_path=""
hostname=""

case "$1" in
    start )
        # $1 stays the subcommand; each option is read from $2/$3 and shifted away.
        while [[ -n "$2" ]]; do
            case "$2" in
                --master ) master=$3; shift 2;;
                --zk ) zk=$3; shift 2;;
                --libmesos_path ) libmesos_path=$3; shift 2;;
                --hostname ) hostname=$3; shift 2;;
                * ) break;;
            esac
        done
        if [[ -z "$master" || -z "$zk" || -z "$hostname" ]]; then
            echo "error options"
            exit 1
        fi
        echo -n "Starting marathon ..."
        # Point Marathon at libmesos only when a path was given.
        if [[ -n "$libmesos_path" ]]; then
            export MESOS_NATIVE_JAVA_LIBRARY=${libmesos_path}
        fi
        nohup "${MARATHONBINDIR}/marathon" "--master" "$master" "--zk" "$zk" "--hostname" "$hostname" &
        echo "started"
        ;;
    stop )
        # Exclude this script itself: its own command line also contains "marathon".
        pid=$(ps -ef | grep marathon | grep -v "grep" | grep -v "$(basename "$0")" | awk '{print $2}')
        if [[ -z "$pid" ]]; then
            echo "No marathon server started"
            exit 0
        fi
        # $pid is left unquoted on purpose: it may hold several PIDs.
        kill -9 $pid
        echo "Marathon server stopped"
        ;;
    restart )
        # Drop the subcommand so "$@" holds only the options.
        shift
        "$0" stop "$@"
        sleep 5
        "$0" start "$@"
        ;;
    * )
        echo -e "$USAGE"
        ;;
esac
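The scripts still have a small bug: they do not check whether the process is already running before starting it. I will probably fix that in the next commit. The next post covers deploying applications in this HA environment.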
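One possible shape for that missing check is a pidfile guard. This is only a sketch under assumptions: the scripts above find processes with ps and grep rather than pidfiles, and the helper names `is_running` and `start_once` are hypothetical. A harmless `sleep` stands in for mesos-master or marathon.

```shell
#!/usr/bin/env bash
# Sketch: a pidfile guard so a second "start" does not launch a duplicate.

# Hypothetical helper: succeed if the pidfile names a live process.
is_running() {
    local pidfile="$1" pid
    [[ -f "$pidfile" ]] || return 1
    pid="$(cat "$pidfile")"
    # kill -0 sends no signal; it only tests whether the PID exists.
    [[ -n "$pid" ]] && kill -0 "$pid" 2> /dev/null
}

# Hypothetical helper: start_once <pidfile> <command> [args...]
start_once() {
    local pidfile="$1"; shift
    if is_running "$pidfile"; then
        echo "already running (pid $(cat "$pidfile"))"
        return 1
    fi
    nohup "$@" > /dev/null 2>&1 &
    echo "$!" > "$pidfile"
    echo "started (pid $!)"
}

# Demo with a stand-in command.
pidfile="$(mktemp)"
start_once "$pidfile" sleep 30      # starts the process
start_once "$pidfile" sleep 30      # refuses: the first one is still alive
kill "$(cat "$pidfile")"
rm -f "$pidfile"
```

A pidfile avoids the self-matching problem of grepping the process list, at the cost of going stale if the process dies and its PID is later reused.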
Reposted from: http://tigmi.baihongyu.com/