Skynet is a lightweight online game framework


Skynet is a lightweight online game framework which can be used in many other fields.

Build

For Linux, install autoconf first for jemalloc:

git clone https://github.com/cloudwu/skynet.git
cd skynet
make 'PLATFORM'  # PLATFORM can be linux, macosx, freebsd now

Or:

export PLAT=linux
make

For FreeBSD, use gmake instead of make.

Test

Run these in different consoles:

./skynet examples/config	# Launch the first skynet node (gate server) and a skynet-master (see config for the standalone option)
./3rd/lua/lua examples/client.lua 	# Launch a client, then try typing hello.

About Lua version

Skynet now uses a modified version of Lua 5.4.2 ( https://github.com/ejoy/lua/tree/skynet54 ) to support multiple Lua states.

Official Lua versions can also be used as long as the Makefile is edited.

How To Use

Owner
云风
coder ( c , lua , open source )
Comments
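  • SSM version in practice

    I have been testing and optimizing the ssm branch in my spare time. After yesterday's last commit, #1026, I ran a stress test for a full day: memory stayed stable, and the earlier memory errors and leaks are gone. This version is close to stable.

    During testing I found that the most important problem with the SSM version is reclamation efficiency (possibly affected by earlier bugs). I tried an optimization that reduces the number of short strings created in SSM, because many temporary strings never need to enter SSM, for example:

    • binary strings (skynet.packstring, sproto.encode, string.pack, ...)
    • strings that are never compared (strings passed to skynet.error)

    The approach is to provide a separate interface that creates strings with luaS_createlngstrobj (exposed as lua_pushtmpstring). If the string is longer than LUAI_MAXSHORTLEN, it behaves exactly like LUA_TLNGSTR. If its length is less than or equal to LUAI_MAXSHORTLEN, comparisons would misbehave, but these strings are never compared, so that causes no harm. The main point is that these strings do not enter SSM.

    • At first there were no statistics.

    • After replacing lua_pushlstring with lua_pushtmpstring in the pack, compress, and decompress code of lua-sproto, a stress test with 1000 robots added 800,000 SSM strings per minute.

    • After further optimizing packstring in lua-seri and adding a series of functions to the string library, tpack and tformat (replacing lua_pushlstring in the corresponding functions), a stress test with 1000 robots added 180,000 SSM strings per minute.

    This achieved the initial goal, but when I tidied up the code at noon today I felt the changes touched too much and there are bound to be more pitfalls. A misuse would be hard to notice. So could lua_pushtmpstring be made compatible with LUA_TLNGSTR?

    All I could think of was modifying luaV_equalobj. Following that idea, could I directly support LUA_TLNGSTR strings created by different VMs? Then sharedtable could be used directly and SSM would no longer be needed, so lua_pushtmpstring and all the optimizations above would become unnecessary.

    At noon I tried this idea and replaced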

    #define eqshrstr(a,b)	check_exp((a)->tt == LUA_TSHRSTR, (a) == (b))
    

    with

    #define eqshrstr(a,b)	check_exp((a)->tt == LUA_TSHRSTR, (a) == (b)?1:((isshared(a)||isshared(b))?luaS_eqshrstr(a, b):0))
    

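    luaS_eqshrstr directly compares short strings that have different pointers. LUA_TSHRSTR strings are now shared in order to support sharetable, and the SSM-related code has been removed.

    This is a first implementation of the idea: commit 31bc54c3af413f3ee673aab1062f703b5619ed7f, branch https://github.com/hongling0/skynet/tree/sharedshrstr

    @cloudwu Please consider whether to adopt this change and whether any problems remain. If you adopt it, I will submit a PR.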
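The content comparison described for luaS_eqshrstr could look roughly like the sketch below. This is not the code from the branch: struct short_str and short_str_equal are hypothetical names, and the struct is a simplified stand-in for Lua's TString. It only illustrates the rule in the modified macro: identical pointers are equal, two unshared strings with different pointers are unequal, and a comparison involving a shared string falls back to length and bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Lua's TString; the field names are illustrative
 * and are not the ones used in lstring.c or in the branch. */
struct short_str {
    size_t len;        /* length in bytes */
    int shared;        /* nonzero if the string lives in a shared table */
    const char *data;  /* string bytes, not necessarily NUL-terminated */
};

/* Equality for short strings when some of them may come from another VM. */
static int short_str_equal(const struct short_str *a, const struct short_str *b) {
    assert(a != NULL && b != NULL);
    if (a == b) {
        return 1;  /* same object: the usual interned fast path */
    }
    if (!a->shared && !b->shared) {
        return 0;  /* both interned in one VM: distinct pointers mean distinct strings */
    }
    if (a->len != b->len) {
        return 0;
    }
    return memcmp(a->data, b->data, a->len) == 0;  /* fall back to comparing bytes */
}
```

The sketch deliberately skips any hash comparison: a hash could only serve as a shortcut if every VM computed it with the same seed, which is an assumption this example does not make.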

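  • Socket stuck in CLOSE_WAIT and cannot be closed

    I ran into a problem after updating skynet to the latest code.

    With earlier versions I often ran debug commands with echo "help" | nc localhost port. Because nc exits as soon as it finishes, the socket is disconnected, and skynet could close the connection normally on its side. With the new skynet code, connections opened this way are no longer closed correctly. The port on the nc side stays in FIN_WAIT2 and the port on the skynet side stays in CLOSE_WAIT, which means nc sent a FIN and skynet received it, but nc never received the FIN that skynet should send back. I suspect the newly added half-close state is handled incorrectly somewhere. I tested with debug_console; ports opened by other kinds of services should have a similar problem.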
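For reference, the behaviour the report expects can be sketched as follows: once read returns 0 (the peer has sent its FIN), the receiver closes its own descriptor, which on a TCP socket sends the FIN back and lets both ends leave FIN_WAIT2 and CLOSE_WAIT. This is a minimal illustration on a blocking descriptor, not skynet's socket server code; close_on_eof is a hypothetical helper, and a real server would first flush any data still waiting to be sent.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Drain fd until the peer signals end-of-stream, then close our end.
 * Assumes a blocking descriptor. Returns 1 once EOF was seen and fd
 * was closed, or -1 on a read error (fd is left open in that case). */
static int close_on_eof(int fd) {
    char buf[256];
    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            continue;   /* this sketch simply discards the payload */
        }
        if (n == 0) {
            close(fd);  /* peer finished sending: closing answers with our own FIN */
            return 1;
        }
        if (errno == EINTR) {
            continue;   /* interrupted by a signal, retry */
        }
        return -1;      /* real error; the caller decides what to do */
    }
}
```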


  • Protect against spurious wakeups

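  • A question about skynet network send and receive

    When network traffic is heavy (an MMO broadcasting a large number of scene messages), the skynet socket thread appears to respond slowly: both accept and ordinary network I/O show considerable latency. Details follow.

    Server load (us = user, sy = system, id = idle, hi = hardware interrupts, si = software interrupts, wa = I/O wait): 16 cores, 16 GB RAM. Each core sits at roughly 5-25 us plus 5-30 sy, and each core generally stays at about 50 id. Occasionally one or two cores show 5-15 si. No hi or wa was observed.
    Server bandwidth usage: outbound 30 Mb/s, inbound 20 Mb/s.
    Memory usage: fairly stable, with swap disabled.

    When the stall occurs, I logged the point where the socket thread's epoll reports the ACCEPT event. Connecting with nc from the server itself took about 5 seconds before there was a response and the log line appeared, whereas nc to a port of another process behaved normally. This shows that something is delaying skynet's epoll event detection. I cannot yet tell whether the kernel is delivering the events late (a limit on some system or per-process buffer?) or whether the skynet network thread is too busy. But as described above, if the skynet thread were the problem, all 16 cores should not show steady idle time, which is puzzling.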

  • After several days in production, the network thread got stuck once, and it is hard to reproduce

    1. One thread was using a little over 90% CPU; its PID was 21042.

    2. pstack on the process showed the following stack for thread 21042:

    write()
    _IO_new_file_write()
    _GI__IO_file_write()
    buffered_vfprintf()
    vfprintf()
    fprintf()
    send_list_udp()
    send_list()
    send_buffer()
    send_buffer()
    socket_server_poll()
    skynet_socket_poll()
    thread_socket()
    start_thread()
    ...

    3. strace showed the network thread doing the following over and over:

    write(2, "socket-server":udp(3974) sendt"..., 58)
    select(6, [5], NULL, NULL, {0,0}) = 0 (TimeOut)
    sendto(14, '\0\4\1\0\0\1\0\0\0\0\0\0\3api\10music440\3com\0\0\1".... 34, 0, 0x7fbbfc3ff590, 0) = -1 EINVAL (Invalid argument)

    The output above keeps repeating.

    4. strace shows that the sendto call fails and is retried endlessly:

    static int
    send_list_udp(struct socket_server *ss, struct socket *s, struct wb_list *list, struct socket_message *result) {
    	while (list->head) {
    		struct write_buffer * tmp = list->head;
    		union sockaddr_all sa;
    		socklen_t sasz = udp_socket_address(s, tmp->udp_address, &sa);
    		int err = sendto(s->fd, tmp->ptr, tmp->sz, 0, &sa.s, sasz);
    		if (err < 0) {
    			switch(errno) {
    			case EINTR:
    			case AGAIN_WOULDBLOCK:
    				return -1;
    			}
    			fprintf(stderr, "socket-server : udp (%d) sendto error %s.\n",s->id, strerror(errno));
    			return -1;
    /*			// ignore udp sendto error

    			result->opaque = s->opaque;
    			result->id = s->id;
    			result->ud = 0;
    			result->data = NULL;

    			return SOCKET_ERR;
    */
    		}

    		s->wb_size -= tmp->sz;
    		list->head = tmp->next;
    		write_buffer_free(ss,tmp);
    	}
    	list->tail = NULL;

    	return -1;
    }

    5. In theory the last argument of sendto(14, '\0\4\1\0\0\1\0\0\0\0\0\0\3api\10music440\3com\0\0\1".... 34, 0, 0x7fbbfc3ff590, 0) should not be 0, yet it was 0 in the call, and the call kept failing with Invalid argument.

    6. The business layer uses DNS resolution, which is why UDP is involved here.

    7. There is no more information. Because this was running in production, once I found the endless loop I took a quick look and restarted the server, and it is hard to reproduce offline.
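One possible way to keep a permanently failing datagram from being retried forever is to drop it and move on, while still stopping on transient errors so the buffer can be retried later. The sketch below illustrates that idea on a simplified list. It is not skynet's code: struct wbuf, struct wlist, wlist_push, wlist_drain and demo_send are all hypothetical names, and an int payload stands in for the real buffer and address.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for skynet's write_buffer and wb_list. */
struct wbuf { struct wbuf *next; int payload; };
struct wlist { struct wbuf *head; struct wbuf *tail; };

/* A sender returns 0 on success, or -1 with errno set (like sendto). */
typedef int (*send_fn)(int payload);

/* Append one payload; returns 0 on success, -1 if allocation fails. */
static int wlist_push(struct wlist *list, int payload) {
    struct wbuf *b = malloc(sizeof *b);
    if (b == NULL) return -1;
    b->next = NULL;
    b->payload = payload;
    if (list->tail) list->tail->next = b; else list->head = b;
    list->tail = b;
    return 0;
}

/* Send queued buffers in order. A transient error stops the loop and keeps
 * the buffer for a later retry; any other error drops that one buffer so it
 * cannot be retried forever. Returns how many buffers are still queued. */
static size_t wlist_drain(struct wlist *list, send_fn send_one, size_t *dropped) {
    size_t remaining = 0;
    assert(list != NULL && send_one != NULL && dropped != NULL);
    *dropped = 0;
    while (list->head) {
        struct wbuf *tmp = list->head;
        if (send_one(tmp->payload) < 0) {
            if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK) {
                break;      /* transient: try again on the next poll */
            }
            ++*dropped;     /* permanent (e.g. EINVAL): give up on this buffer */
        }
        list->head = tmp->next;
        free(tmp);
    }
    if (list->head == NULL) list->tail = NULL;
    for (struct wbuf *p = list->head; p != NULL; p = p->next) ++remaining;
    return remaining;
}

/* Demo sender for trying the sketch: negative payloads fail permanently
 * (EINVAL), zero fails transiently (EAGAIN), anything else succeeds. */
static int demo_send(int payload) {
    if (payload < 0) { errno = EINVAL; return -1; }
    if (payload == 0) { errno = EAGAIN; return -1; }
    return 0;
}
```

Draining a list holding 1, -1, 2 with demo_send empties it and reports one dropped buffer; a list that starts with 0 is left untouched for a later retry.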

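  • This code is painful to look at

    Is it really a good idea to reinvent every wheel yourself? The code also lacks a rigorous engineering style. Couldn't the various ints with different meanings be given names with typedef? It is unusually painful to read. Lua code is even hard-coded inside the C sources. Something that could be done in under 2000 lines of C++ or under 500 lines of Go ends up as tens of thousands of lines.

    Is it a good idea for C to call into so much Lua VM code? The Lua VM and C call each other back and forth, as tangled as the cabling in a server room. There is not even a document describing the flow.

    To this day no IDE supports makefiles, so it is time to switch. Shall I submit a PR with CMake?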

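  • Reentrant socket reconnect may leave a coroutine suspended forever

    I have a db service that holds one MySQL connection and one Redis connection (which can be ignored). MySQL is configured with interactive_timeout=1800 and wait_timeout=1800. When the idle connection timed out and was dropped, the service printed socket closed by peer : 127.0.0.1 3306. While the connection was down, the db service received two messages in a row, and both executed skynet.db.mysql.query: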

    function _M.query(self, query)
        local querypacket = _compose_query(self, query)
        local sockchannel = self.sockchannel
        if not self.query_resp then
            self.query_resp = _query_resp(self)
        end
        return sockchannel:request(querypacket, self.query_resp)
    end
    

    Coroutine list of the service:

    task :0000001d
    thread: 0x7fa9700bc7e8 session: 51482	stack traceback:
    	[C]: in function 'coroutine.yield'
    	.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:390: in upvalue 'suspend_sleep'
    	.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:416: in function 'skynet.wait'
    	..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:127: in upvalue 'pop_response'
    	..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:177: in function <..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:175>
    	[C]: in function 'skynet.pcall'
    	..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:324: in upvalue 'f'
    	.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:280: in function <.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:252>
    thread: 0x7fa974106018 session: 24167	stack traceback:
    	[C]: in function 'coroutine.yield'
    	.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:390: in upvalue 'suspend_sleep'
    	.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:416: in function 'skynet.wait'
    	..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:422: in upvalue 'block_connect'
    	..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:480: in function <..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:479>
    	(...tail calls...)
    	[C]: in function 'skynet.pcall'
    	...p_server/framework_one_world/skynet_ext/lualib/mysql.lua:26: in method 'execute'
    	...p_server/framework_one_world/skynet_ext/lualib/mysql.lua:147: in method 'query'
    	..._server/framework_one_world/service/dbd/mysqlHandler.lua:26: in function 'mysqlHandler.query'
    	.../pp_server/framework_one_world/service/dbd/dbStorage.lua:41: in upvalue 'readFromMysql'
    	.../pp_server/framework_one_world/service/dbd/dbStorage.lua:118: in function 'dbStorage.get'
    	...er/framework_one_world/service/dbd/strings/stringsDB.lua:75: in method 'get'
    	...amework_one_world/service/dbd/strings/stringsHandler.lua:54: in function 'strings.stringsHandler.get'
    	...p/pp_server/framework_one_world/service/dbd/luaproto.lua:40: in function <...p/pp_server/framework_one_world/service/dbd/luaproto.lua:31>
    	(...tail calls...)
    	[C]: in function 'xpcall'
    	...d/skynet_ext/lualib/serviceFramework/serviceLauncher.lua:151: in upvalue 'f'
    	.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:253: in function <.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:252>
    thread: 0x7fa984151128 session: 51485	stack traceback:
    	[C]: in function 'coroutine.yield'
    	.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:390: in upvalue 'suspend_sleep'
    	.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:416: in function 'skynet.wait'
    	..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:127: in upvalue 'pop_response'
    	..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:177: in function <..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:175>
    	[C]: in function 'skynet.pcall'
    	..._world/skynet_ext/skynet/lualib/skynet/socketchannel.lua:324: in upvalue 'f'
    	.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:280: in function <.../framework_one_world/skynet_ext/skynet/lualib/skynet.lua:252>
    

    The mysql object:

    {
    	"mysqlObj" = {
    		"_server_status" = 2,
    		"packet_no" = 1,
    		"_server_capabilities" = 3254779903,
    		"_server_ver" = "5.7.30-33-log",
    		"protocol_ver" = 10,
    		"query_resp" = function: 0x7fa9841a70d0,
    		"_max_packet_size" = 1048576,
    		"sockchannel" = {
    			"__dispatch_thread" = thread: 0x7fa9700bc7e8,
    			"__authcoroutine" = false,
    			"__auth" = function: 0x7fa968001040,
    			"__thread" = {},
    			"__request" = {},
    			"__overload" = false,
    			"__sock" = false,
    			"__closed" = false,
    			"__result" = {},
    			"__connecting" = {
    				2 = thread: 0x7fa974106018,
    			},
    			"__host" = "127.0.0.1",
    			"__port" = 3306,
    			"__wait_response" = thread: 0x7fa9700bc7e8,
    			"__result_data" = {},
    		},
    		"_server_lang" = 45,
    	},
    }
    

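    The coroutine in __connecting was never woken up.

    Locally I set the MySQL connection timeout to 10s. After the connection dropped, I injected a test script into the db service that forks two coroutines in a row, each calling mysql.query. It does not happen every time, but it reproduces with some probability.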

  • Problem with is_rawarray, which lua-bson uses to decide whether a table is an array, under Lua 5.4

    The approach in this article does not seem to work with Lua 5.4: https://blog.codingnow.com/2016/06/seri_lua_object.html. is_rawarray treats the following table as an array:

    local t = {
        [1] = "a",
        [2] = "b",
        [100] = "c",
    }
    

    I found that luaH_next changed in 5.4: unsigned int asize = luaH_realasize(t);. When the example above runs, iteration starts directly from the hash part.

    Lua 5.4

    int luaH_next (lua_State *L, Table *t, StkId key) {
      unsigned int asize = luaH_realasize(t);
      unsigned int i = findindex(L, t, s2v(key), asize);  /* find original key */
      for (; i < asize; i++) {  /* try first array part */
        if (!isempty(&t->array[i])) {  /* a non-empty entry? */
          setivalue(s2v(key), i + 1);
          setobj2s(L, key + 1, &t->array[i]);
          return 1;
        }
      }
      for (i -= asize; cast_int(i) < sizenode(t); i++) {  /* hash part */
        if (!isempty(gval(gnode(t, i)))) {  /* a non-empty entry? */
          Node *n = gnode(t, i);
          getnodekey(L, s2v(key), n);
          setobj2s(L, key + 1, gval(n));
          return 1;
        }
      }
      return 0;  /* no more elements */
    }
    

    Lua 5.3

    int luaH_next (lua_State *L, Table *t, StkId key) {
      unsigned int i = findindex(L, t, key);  /* find original element */
      for (; i < t->sizearray; i++) {  /* try first array part */
        if (!ttisnil(&t->array[i])) {  /* a non-nil value? */
          setivalue(key, i + 1);
          setobj2s(L, key+1, &t->array[i]);
          return 1;
        }
      }
      for (i -= t->sizearray; cast_int(i) < sizenode(t); i++) {  /* hash part */
        if (!ttisnil(gval(gnode(t, i)))) {  /* a non-nil value? */
          setobj2s(L, key, gkey(gnode(t, i)));
          setobj2s(L, key+1, gval(gnode(t, i)));
          return 1;
        }
      }
      return 0;  /* no more elements */
    }
    
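  • Has the mongodb module never been tested, or am I using it wrong?

    I configured testmongodb.lua and launched it with skynet, but the connection keeps failing. I have confirmed that the network is fine and the latency is acceptable. Stepping through the code, I found this at lualib/skynet/db/mongo.lua line 162: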

    obj.__sock = socketchannel.channel {
    		host = obj.host,
    		port = obj.port,
    		response = dispatch_reply,
    		auth = mongo_auth(obj),
    		backup = backup,
    		nodelay = true,
    	}
    	setmetatable(obj, client_meta)
    	obj.__sock:connect(true)	-- try connect only	once
    	return obj
    

    Is mongo_auth(obj) being called before there is a connection? obj.__sock:connect(true) is only executed afterwards.

  • After upgrading to 1.4.0, MSG.close(fd) in gateserver.lua is sometimes not called when the client is force-killed

    Note: version 1.3 does not have this problem. Both setups use nginx to forward traffic.

    The MSG.close event is triggered by the lfilter function in lua-netpack.c, with type SKYNET_SOCKET_TYPE_CLOSE. That event in turn is triggered by skynet_socket_poll, with type SOCKET_CLOSE.

    Commit 56c70dc7ace1fb668debc254b30adddd711f7e48, which fixes socket half close (#1263), changed this code. See the if (n==0) case below:

    static int
    forward_message_tcp(struct socket_server *ss, struct socket *s, struct socket_lock *l, struct socket_message * result) {
    	int sz = s->p.size;
    	char * buffer = MALLOC(sz);
    	int n = (int)read(s->fd, buffer, sz);
    	if (n<0) {
    		FREE(buffer);
    		switch(errno) {
    		case EINTR:
    			break;
    		case AGAIN_WOULDBLOCK:
    			skynet_error(NULL, "socket-server: EAGAIN capture.");
    			break;
    		default:
    			// close when error
    			force_close(ss, s, l, result);
    			result->data = strerror(errno);
    			return SOCKET_ERR;
    		}
    		return -1;
    	}
    	if (n==0) {
    		FREE(buffer);
    		if (nomore_sending_data(s)) {
    			force_close(ss,s,l,result); 
    			return SOCKET_CLOSE;
    		} else { 
    			s->type = SOCKET_TYPE_HALFCLOSE;
    			shutdown(s->fd, SHUT_RD);
    			return -1;  // the return value here looks suspicious
    		}
    	}
    
    	if (s->type == SOCKET_TYPE_HALFCLOSE) {
    		// discard recv data
    		FREE(buffer);
    		return -1;
    	}
    
    	stat_read(ss,s,n);
    
    	if (n == sz) {
    		s->p.size *= 2;
    	} else if (sz > MIN_READ_BUFFER && n*2 < sz) {
    		s->p.size /= 2;
    	}
    
    	result->opaque = s->opaque;
    	result->id = s->id;
    	result->ud = n;
    	result->data = buffer;
    	return SOCKET_DATA;
    }
    
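  • Websocket

    #1053 implements the websocket ws and wss protocols using the ltls module added earlier. It provides the following interfaces:

    websocket.accept(socket_id, handle, protocol) accepts the websocket protocol sent by the client on socket_id. handle is the notification interface; protocol is ws or wss and defaults to ws.

    websocket.connect(url, header) opens a websocket connection and returns the socket id.

    websocket.read(id) reads from a socket id using the websocket protocol.

    websocket.write(id, data, fmt) writes to a socket id using the websocket protocol. data is the content to write; fmt is text or binary.

    websocket.ping(id) sends a ping frame using the websocket protocol.

    websocket.close(id, code, reason) closes the socket id using the websocket protocol, optionally sending the error code (code) and the error reason (reason).

    See simplewebsocket.lua for a usage example.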

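  • After upgrading skynet to 1.5.0, an occasional crash when a skynet service exits

    cloudwu, I would like to ask for your help with this problem. After upgrading skynet to 1.5.0, a skynet service occasionally crashes when it exits; it has happened once so far.

    1. Information about the crash: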
    gdb skynet/skynet core.11691
    GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-120.el7
    Copyright (C) 2013 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-redhat-linux-gnu".
    For bug reporting instructions, please see:
    http://www.gnu.org/software/gdb/bugs/...
    Reading symbols from /data/mybonline/skynet/skynet...done.
    ...
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib64/libthread_db.so.1".
    Core was generated by `/data/mybonline/sh/subgame/../../skynet/skynet /data/mybonline/sh/subgame/gameserver/'.
    Program terminated with signal 11, Segmentation fault.
    #0  0x000000000042faa3 in luaS_remove (L=0x7f727d1dff08, ts@entry=0x7f727d7d0640) at lstring.c:210
    210         p = &(*p)->u.hnext;
    Missing separate debuginfos, use: debuginfo-install glibc-2.17-324.el7_9.x86_64 libgcc-4.8.5-44.el7.x86_64
    (gdb) bt
    #0  0x000000000042faa3 in luaS_remove (L=0x7f727d1dff08, ts@entry=0x7f727d7d0640) at lstring.c:210
    #1  0x000000000042828a in freeobj (L@entry=0x7f727d1dff08, o=0x7f727d7d0640) at lgc.c:795
    #2  0x000000000042a43f in deletelist (limit=0x7f727d1dff08, p=<optimized out>, L=0x7f727d1dff08) at lgc.c:1499
    #3  luaC_freeallobjects (L@entry=0x7f727d1dff08) at lgc.c:1515
    #4  0x000000000042eb97 in close_state (L=0x7f727d1dff08) at lstate.c:275
    #5  0x000000000042f320 in lua_close (L=<optimized out>) at lstate.c:412
    #6  0x00007f7284ee64ec in snlua_release (l=0x7f727ce38680) at service-src/service_snlua.c:515
    #7  0x000000000041a2cf in delete_context (ctx=0x7f727c1e4480) at skynet-src/skynet_server.c:214
    #8  skynet_context_release (ctx@entry=0x7f727c1e4480) at skynet-src/skynet_server.c:224
    #9  0x000000000041a8c0 in skynet_context_message_dispatch (sm@entry=0x7f7283808100, q=0x7f727ac9edc0, weight@entry=-1) at skynet-src/skynet_server.c:350
    #10 0x000000000041b0ab in thread_worker (p=<optimized out>) at skynet-src/skynet_start.c:163
    #11 0x00007f7284ac1ea5 in start_thread () from /lib64/libpthread.so.0
    #12 0x00007f7283ec69fd in clone () from /lib64/libc.so.6
    
    2. Logs just before the crash:
    [:000084d4] [2022-02-24 02:41:07.988] [info] Nt204534_ai_next_1974_Don't Break My Heart_lv3, circle_finish,table_srvid: 34003 tableid: 204534 ai_serverid: 34004 uid: 1974
    [:000084d4] [2022-02-24 02:41:07.998] [warning] service/subgame/base/ai_service.lua:230:Wt204534_ai_next_1974_Don't Break My Heart_lv3, destroy_ai_other
    [:000084d4] [2022-02-24 02:41:07.998] KILL self
    [:000084d3] [2022-02-24 02:41:07.998] KILL self
    
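    3. The old skynet version we used before, with Lua 5.3, was always stable; this problem appeared only after the upgrade. I am not sure whether it is caused by a problem in our business logic. Any guidance would be appreciated, thanks!
  • Cluster networking

    cloudwu, in one of my scenarios I need to build a cluster network in which all skynet processes are equivalent slaves (not the master-slave mode). Currently the cluster configuration can only declare:

        nodename = "nodeaddress"
    

    The input format I would like: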

        slave = {
            "address",
            "address"
        }
    

    loadconfig already fixes the format of the cluster configuration:

        for name,address in pairs(tmp) do
            if name:sub(1,2) == "__" then
                name = name:sub(3)
                config[name] = address
                skynet.error(string.format("Config %s = %s", name, address))
            else
                assert(address == false or type(address) == "string")
                if node_address[name] ~= address then
                    -- address changed
                    if node_sender[name] then
                        -- reset connection if node_sender[name] exist
                        node_channel[name] = nil 
                        table.insert(reload, name)
                    end 
                    node_address[name] = address
                end 
                local ct = connecting[name]
                if ct and ct.namequery and not config.nowaiting then
                    skynet.error(string.format("Cluster node [%s] resloved : %s", name, address))
                    skynet.wakeup(ct.namequery)
                end 
            end 
        end 
    

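    #Question: what I want to ask is, when the business layer does not care where the target node is,

    can cluster be extended to support arrays?

    Or should I wrap cluster myself and dispatch to cluster from the wrapper (the wrapper can implement load balancing)? For example, wrapper.lua: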

    local cluster = require "skynet.cluster"

    local config = {
    	slave = {
    		"slave1",	-- cluster config key
    		"slave2"
    	}
    }
    
    local wrapper = {}
    
    function wrapper.call(name, ...)
    	local slave = assert(config[name], name)
    	local node = slave[math.random(1, #slave)]	-- load balancing: pick a node at random
    	return cluster.call(node, ...)
    end
    
    wrapper.call("slave", ...)
    

    This raises a problem: the cluster configuration becomes bloated, and name turns into a meaningless key. cluster configuration:

        slave1 = "address"
        slave2 = "address"
    
  • Calling mqtt.ioloop hangs; traced to client_mt:_io_iteration(), stuck at recv(), with the error "A message from [ :00000000 ] to [ :0000000e ] maybe in an endless loop (version = 57)"

    cloudwu, I would like to ask for your help with this problem. I am a beginner and would be very grateful.

    Environment:
    gcc: toolchain-mipsel_24kec+dsp_gcc-4.8-linaro_uClibc-0.9.33.2
    lua: 5.1.5
    skynet: 1.5.0

    local skynet = require("skynet")
    local js = require("cjson.safe")
    local mqtt = require("mqtt")
    local blocking = require("blocking")
    local ioloop = require("mqtt.ioloop") -- skynet/src/lualib/mqtt/ioloop.lua

    local function start_mqtt_client_loop()

    local mq_loop = ioloop.create {
        sleep = 1,
        timeout = 1,
        sleep_function = skynet.sleep
    }
    
    local s, err = blocking.readfile(mq_cfg_path)
    if err then
        logger:error("readfile %s fail. %s", mq_cfg_path, err)
        return false
    end
    
    local m = js.decode(s) -- m is a table
    if not (m and m.mwd_mqtt) then
        logger:error("bad json in %s", mq_cfg_path)
        return false
    end
    
    local cfg = m.mwd_mqtt
    local host_str = cfg.product_id.."."..cfg.host
    
    local conn_params = {
        uri = string.format("%s:%s", host_str, mq_cfg.port),
        id = mq_cfg.client_id,
        username = mq_cfg.username,
        password = mq_cfg.password,
        keep_alive = mq_cfg.keeplive,
        reconnect = 10,
        clean = true
    }
    
    local topics = {}
    for k, v in pairs(topic_key_map) do
        --k: "sub_topic", v[1]: "config/ota/remote", v[2]: "pub_topic"
        table.insert(topics, {topic = k, module = v[1], pub_topic = v[2]})
    end
    
    logger:info("MQTT client connect params: %s", js.encode(conn_params))
    
    local client = mqtt.client(conn_params)
    
    client:on{
        connect = function(connack)
            assert(not mqtt_client)
    
            if not (connack and connack.rc == 0) then
                logger:error("connection to mqtt fail: %s %s", connack and connack:reason_string() or "", connack)
                return
            end
    
            logger:info("MQTT client connected server success!!!") -- successful connection
            state_table.connect_ok = true
            local response_count = 0
            for _, item in ipairs(topics) do
                local module, topic = item.module, item.topic
                client:subscribe{
                    topic = topic,
                    qos = 0,
                    callback = function(suback, args)
                        response_count = response_count + 1
                        if suback.rc[1] ~= 0 then
                            client:disconnect(0)
                            logger:error("subscribe [%s %s] fail %s", module, topic, suback.rc[1])
                            return
                        end
    
                        logger:info("subscribe from cloud [%s, %s] ok", module, topic)
                        if response_count >= #topics then
                            mqtt_client = client
                            logger:info("All topic subscribes OK, MQTT client is ready!!!")
                            --Not support, upload software version with 5mins timer.
                            --[[
                            local mqtt_dispatcher = skynet.queryservice("mqtt_dispatcher")
                            skynet.send(mqtt_dispatcher, "lua", "post")
                            ]]
                        end
                    end
                }
            end
        end,
        message = function(msg)
            local err = client:acknowledge(msg)
            if not err then
                logger:error("received a error msg!")
                return
            end
            table.insert(work_queue, {
                msg.topic,
                msg.payload
            }) -- if there is a message, insert it into the table
    
            logger:info("Recv from cloud, topic: %s, payload: %s", msg.topic, msg.payload)
            skynet.wakeup(work_loop_co) -- wake up the message worker coroutine
            skynet.yield()
        end,
        error = function(err, args, connack)
            logger:error("mqtt client err: %s, args: %s, connack: %s", err, args, connack)
            if not connack then
                logger:error("MQTT client error maybe unable to connect to network")
                return
            end
    
            if connack.rc == 4 then
                state_table.connect_ok = false
                client.args.reconnect = client.args.reconnect + 5
            end
        end,
        close = function()
            state_table.connect_ok = false
            mqtt_client = nil -- set to nil
            client = nil
            logger:error("mqtt client is close, set to not ready!!!")
        end
    }
    
    mq_loop:add(client)
    client:start_connecting()
    while true do
       if not client then
            break
        end
        mq_loop:iteration() -- stuck here; traced to client_mt:_io_iteration(), blocked at recv()
        skynet.yield()
    end
    logger:debug("connect failed, ready to reconnect")
    init_pri_key()
    
    mq_loop = nil
    client = nil
    collectgarbage("collect")
    return start_mqtt_client_loop()
    

    end

    What could be causing this?

  • A redis connection may block if the connection drops during auth or selectdb

    stack traceback:
    	[C]: in function 'coroutine.yield'
    	.././skynet/lualib/skynet.lua:390: in upvalue 'suspend_sleep'
    	.././skynet/lualib/skynet.lua:416: in function 'skynet.manager.wait'
    	.././skynet/lualib/skynet/socketchannel.lua:424: in upvalue 'block_connect'
    	.././skynet/lualib/skynet/socketchannel.lua:482: in method 'request'
    	.././skynet/lualib/skynet/db/redis.lua:155: in function <.././skynet/lualib/skynet/db/redis.lua:153>
    	[C]: in function 'pcall'
    	.././skynet/lualib/skynet/socketchannel.lua:333: in function <.././skynet/lualib/skynet/socketchannel.lua:269>
    	(...tail calls...)
    	.././skynet/lualib/skynet/socketchannel.lua:369: in upvalue 'try_connect'
    	.././skynet/lualib/skynet/socketchannel.lua:427: in upvalue 'block_connect'
    	.././skynet/lualib/skynet/socketchannel.lua:482: in function <.././skynet/lualib/skynet/socketchannel.lua:481>

    As a temporary fix, I save the current coroutine when connect starts and compare against it when block_connect is triggered, so that it does not block.

  • After a redis error, db operations never return and the coroutine stays blocked

    skynet version 1.3.0 (with a few small changes to logging and similar code, such as printing milliseconds; nothing else in the core was modified)

    Error message: lualib/skynet/socketchannel.lua:411: LOADING Redis is loading the dataset in memory

    Code A:

    function CMD.keepalive()

    if DATA.connection then
    	local ok, ret = skynet.pcall(DATA.connection.get, DATA.connection, "foo")
    	if not ok then
    		skynet.error("ping failed reconnect it", ret)
    
    		local ok, ret = skynet.pcall(DATA.connection.disconnect, DATA.connection)
    
    		if not ok then
    			skynet.error("disconnect failed", ret)
    		end
    
    		DATA.connection = nil
    
    		CMD.connect()
    	end
    else
    	CMD.connect()
    end
    

    end

    Explanation: whenever redisdb returns an error, my code disconnects from redis and then reconnects.

    Code B:

    1 skynet.info("roomRunn", cid, room.runningState)
    2 local ok, ret = skynet.pcall(StateRun.run, cid)
    3 if not ok then
    4     skynet.error(" runRoomStat failed ", cid, ret)
    5 end
    6 skynet.info("roomRunn", cid, room.runningState)

    Explanation: StateRun.run performs many operations internally, including redis operations.

    In the log for code B, only the "roomRunn" from line 1 was printed; the one from line 6 never appeared.

    Later, other requests performed redisdb operations, and redisdb returned results again.

    My impression is that after the redisdb call inside StateRun.run failed, the coroutine stayed blocked forever.
