
Debug console implemented on top of cfadmin.


cfadmin-cn/debug

Debug Console

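A terminal debugging library built on cfadmin.

Installation

  1. Clone this repository into the 3rd directory.

  2. Import it with: local console = require "debug.console"

Starting the console

Two connection methods are supported:

  1. Listen on a port - console.start("127.0.0.1", 6666)

  2. Listen on a file - console.startx("local.sock")

The first method supports only single-process mode. The second also supports multi-process mode, provided each process is configured with its own file name.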
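
For the multi-process case, one possible way to give every process its own socket file is to derive the name from a per-process identifier. The sketch below is plain Lua; socket_path and the worker index are hypothetical names that are not part of this library, and how a process learns its own index is left to the application.

```lua
-- Hypothetical helper: build a distinct socket file name for each worker.
-- `prefix` and `index` are illustrative; they are not part of the debug library.
local function socket_path(prefix, index)
  return string.format("%s.%d.sock", prefix, index)
end

-- Load the console only when it is available, so the sketch also runs
-- outside cfadmin (where the require call simply fails).
local ok, console = pcall(require, "debug.console")
if ok then
  console.startx(socket_path("local", 1)) -- worker 1 listens on local.1.sock
end
```

Each worker would then be reachable separately, for example with nc -U local.1.sock.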

Usage

Write the following into script/main.lua:

local console  = require "debug.console"
console.startx("local.sock")

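Then start the framework by running ./cfadmin -e script/main.lua. (In a real application, just place this code before the final startup call.)

Finally, run nc -U local.sock on the command line. If you see the following output, the connection succeeded.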

[candy@MacBookPro:~/Documents/cfadmin] $ nc -U local.sock

Welcome! This is cfadmin Debug Console:

  gc     -  Can run/stop/modify/count garbage collectors.

  run    -  Execute the lua script like `main` coroutine.

  dump   -  Prints more information about the specified data structure.

  stat   -  Process usage data analysis report.

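>>>

  • stat - print the resource usage of the process

  • run - run the script with the given file name

  • dump - pretty-print the specified data structures

  • gc - let the user operate the GC manually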

1. Checking process status

Run the stat command without arguments to see its usage help:

>>> stat

stat [command] :

  [cpu]    -   CPU kernel space and user space usage of the current process.

  [mem]    -   Memory usage report of the current process.

  [page]   -   `hard_page_fault` and `soft_page_fault` of the current process.

  [all]    -   Return all of the above information.

>>>

Following the help text, stat all prints every report, as shown below:

>>> stat all

CPU(User): 0.40%

CPU(Kernel): 0.33%

Lua Memory: 239.7256/KB

Swap Memory: 0.0000/KB

Total Memory: 2.1720/MB

Hard Page Faults: 0

Soft Page Faults: 739

>>>
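
Two of these figures can be approximated from standard Lua alone. The sketch below assumes that Lua Memory corresponds to collectgarbage("count") and uses os.clock() as a stand-in for CPU time. How the console actually gathers its numbers (including the user/kernel split and the page fault counts) is not shown in this document, so lua_memory and cpu_seconds are illustrative names only.

```lua
-- Memory currently used by Lua, in KB, formatted like the console output.
local function lua_memory()
  return string.format("Lua Memory: %.4f/KB", collectgarbage("count"))
end

-- Approximate CPU time (in seconds) consumed by the program so far.
local function cpu_seconds()
  return os.clock()
end

print(lua_memory())
print(string.format("CPU time: %.3f s", cpu_seconds()))
```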

2. Inspecting internal data

Sometimes we need to look at data inside the Lua state. The dump command does this:

>>> dump

dump [command] [key1] [key1] [keyN] :

  [global] - dump global table (`_G`).

  [registery] - dump lua debug registery table.

  [filename] - dump already loaded package and its return table .

  --

  `keyX` means we can get `deep value` like `table[key1][key2]..[keyN]`

  e.g :
   1. dump cf wait
   2. dump global string

>>>

For example, to print the global table _G and see which keys it contains, run:

>>> dump g

global{
  ['tonumber'] = function: 0x107b22ec0
  ['error'] = function: 0x107b22550
  ['setmetatable'] = function: 0x107b22e20
  ['string'] = table: 0x7ffd8b508120
  ['pcall'] = function: 0x107b229b0
  ['rawset'] = function: 0x107b22d10
  ['rawget'] = function: 0x107b22cc0
  ['print'] = function: 0x107b22a40
  ['os'] = table: 0x7ffd8b5070f0
  ['io'] = table: 0x7ffd8b507620
  ['loadfile'] = function: 0x107b22670
  ['require'] = function: 0x7ffd8b506bb0
  ['coroutine'] = table: 0x7ffd8b5071b0
  ['utf8'] = table: 0x7ffd8b506870
  ['assert'] = function: 0x107b22280
  ['pairs'] = function: 0x107b22920
  ['rawequal'] = function: 0x107b22c10
  ['collectgarbage'] = function: 0x107b22300
  ['warn'] = function: 0x107b22b50
  ['table'] = table: 0x7ffd8b507420
  ['NULL'] = userdata: 0x0
  ['null'] = userdata: 0x0
  ['debug'] = table: 0x7ffd8b5073c0
  ['tostring'] = function: 0x107b23110
  ['math'] = table: 0x7ffd8b508850
  ['load'] = function: 0x107b22750
  ['ipairs'] = function: 0x107b22620
  ['_G'] = table: 0x7ffd8b505c30
  ['rawlen'] = function: 0x107b22c60
  ['type'] = function: 0x107b23140
  ['next'] = function: 0x107b228c0
  ['_VERSION'] = 'Lua 5.4'
  ['dofile'] = function: 0x107b224e0
  ['select'] = function: 0x107b22d70
  ['package'] = table: 0x7ffd8b506510
  ['getmetatable'] = function: 0x107b225d0
  ['xpcall'] = function: 0x107b231a0
}

counter:
  total keys count: 37
  string value count: 1
  function value count: 24
  usedata value count: 2
  table value count: 10

Done.
>>>

Yes, you read that right: when the dumped value is a table, dump also counts its entries by value type and reports the totals.
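
A counter like this can be sketched in a few lines of plain Lua. This is an illustration of the idea, not the library's own code; count_by_type is a hypothetical name.

```lua
-- Count the entries of a table by value type, similar to the counter section.
local function count_by_type(t)
  local counts, total = {}, 0
  for _, value in pairs(t) do
    local kind = type(value)
    counts[kind] = (counts[kind] or 0) + 1
    total = total + 1
  end
  return total, counts
end

local total, counts = count_by_type({ name = "cf", version = 1, run = print, opts = {} })
print("total keys count: " .. total)
for kind, n in pairs(counts) do
  print(kind .. " value count: " .. n)
end
```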

What about functions? If a function is written in Lua, dump also shows the file and line where it is defined:

>>> dump g package loaded debug.console

debug.console{
  ['startx'] = function: 0x7ffd8b4118a0(3rd/debug/console.lua:86)
  ['start'] = function: 0x7ffd8b415760(3rd/debug/console.lua:76)
}

counter:
  total keys count: 2
  function value count: 2

Done.
>>>
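
Standard Lua exposes this location through debug.getinfo. The sketch below assumes that is roughly how such output can be produced; describe is a hypothetical helper, and the exact formatting used by the library may differ.

```lua
-- Describe a function the way the dump output does: Lua functions get their
-- source file and the line where the definition starts; C functions do not.
local function describe(fn)
  local info = debug.getinfo(fn, "S")
  if info.what == "Lua" then
    return string.format("%s(%s:%d)", tostring(fn), info.short_src, info.linedefined)
  end
  return tostring(fn)
end

local function sample() end
print(describe(sample)) -- Lua function: address followed by (file:line)
print(describe(print))  -- C function: address only
```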

What if you want to look at the registry? Replace g with r to see its contents:

>>> dump r

registery{
  [1] = thread: 0x7ffd8c009a08
  [2] = table: 0x7ffd8b505c30
  ['__Task__'] = table: 0x7ffd8b406620
  ['FILE*'] = table: 0x7ffd8b507920
  ['_IO_input'] = file (0x7fff975c5d90)
  ['__G_UDP__'] = table: 0x7ffd8b510360
  ['_LOADED'] = table: 0x7ffd8b5062f0
  ['_UBOX*'] = table: 0x7ffd8b708c30
  ['_PRELOAD'] = table: 0x7ffd8b507090
  ['_IO_output'] = file (0x7fff975c5e28)
  ['__G_TCP__'] = table: 0x7ffd8b413ca0
  ['__TCP__'] = table: 0x7ffd8b5145f0
  ['__TIMER__'] = table: 0x7ffd8b409880
  ['_CLIBS'] = table: 0x7ffd8b506b70
  ['__G_TIMER__'] = table: 0x7ffd8b40a0a0
  ['__UDP__'] = table: 0x7ffd8b5102e0
}

counter:
  total keys count: 16
  usedata value count: 2
  thread value count: 1
  table value count: 13

Done.
>>>
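
The same table is reachable from a script through the standard debug.getregistry function. A minimal sketch, assuming a stock Lua 5.4 state (cfadmin adds its own keys such as __TCP__, as seen above):

```lua
-- debug.getregistry() returns the registry table that `dump r` prints.
local registry = debug.getregistry()

-- `_LOADED` is the table behind package.loaded, where require caches modules.
print(registry._LOADED == package.loaded)

-- List every registry key together with the type of its value.
for key, value in pairs(registry) do
  print(tostring(key), type(value))
end
```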

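As the examples show, the syntax is simply key names separated by spaces. Once you are familiar with it, you can drill down to a value quickly.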
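
The key path lookup described by the help text, table[key1][key2]..[keyN], can be sketched as follows. resolve and split_keys are hypothetical names; the library's real parser is not shown in this document.

```lua
-- Follow a list of keys into nested tables: root[key1][key2]..[keyN].
-- Returns nil as soon as a step is missing or lands on a non-table value.
local function resolve(root, keys)
  local value = root
  for _, key in ipairs(keys) do
    if type(value) ~= "table" then
      return nil
    end
    value = value[key]
  end
  return value
end

-- Split a command line such as "package loaded string" on whitespace.
local function split_keys(line)
  local keys = {}
  for word in line:gmatch("%S+") do
    keys[#keys + 1] = word
  end
  return keys
end

print(resolve(_G, split_keys("package loaded string")) == string)
```

Because the line is split on whitespace only, a key that contains dots, such as debug.console, stays in one piece. In this sketch every key is a string, so numeric keys would need an extra conversion step.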

And what if you only want to look up a package that has already been loaded with require?

>>> dump cf

cf{
  ['yield'] = function: 0x7fbc0f609360(lualib/cf/init.lua:35)
  ['at'] = function: 0x7fbc0f6095d0(lualib/cf/init.lua:30)
  ['sleep'] = function: 0x7fbc0f609780(lualib/cf/init.lua:46)
  ['wakeup'] = function: 0x7fbc0f609970(lualib/cf/init.lua:75)
  ['timeout'] = function: 0x7fbc0f609570(lualib/cf/init.lua:22)
  ['fork'] = function: 0x7fbc0f609940(lualib/cf/init.lua:68)
  ['wait'] = function: 0x7fbc0f609910(lualib/cf/init.lua:61)
  ['self'] = function: 0x7fbc0f609820(lualib/cf/init.lua:56)
  ['join'] = function: 0x7fbc0f609a10(lualib/cf/init.lua:93)
}

counter:
  total keys count: 9
  function value count: 9

Done.
>>>

This lets users locate problems quickly and lowers the learning curve for developers.
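
Taken together, the first word of a dump command selects the table to start from. The sketch below is an assumption about that dispatch, based only on the examples above: g and r are the short forms used in the transcripts, global and registery are the names from the help text (spelled as the console prints them), and anything else is looked up in package.loaded. root_for is a hypothetical name.

```lua
-- Pick the table a dump command starts from, based on its first word.
local function root_for(name)
  if name == "g" or name == "global" then
    return _G
  elseif name == "r" or name == "registery" then
    return debug.getregistry()
  end
  return package.loaded[name] -- nil unless the package was already required
end

print(root_for("g") == _G)
print(root_for("string") == string)
print(root_for("not.loaded"))
```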

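3. Running debug code

Suppose our code has a hidden bug that can no longer be tracked down once the process is restarted.

Everything works for a while after each start, and the bug appears only once some special condition is met at some point in time.

In this situation we need more runtime debugging capability, but we do not want to attach a debugger and interfere with the running process.

So the framework must provide a way to execute code safely at any time!

Now create a file named script/demo.lua with the following code: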

local function f1()
  print("f1")
end

local function f2()
  print("f2")
end


local function f()
  f1()
  f2()
end

f()

Once the file is written, run the script inside the running framework:

>>> run script/demo.lua

Total Running Time: 0.000
Done.
>>>

You will then see two lines printed by the framework process we started earlier.

[candy@MacBookPro:~/Documents/cfadmin] $ ./cfadmin
f1
f2

This shows that our code ran successfully!
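
The core of such a command can be sketched with loadfile and pcall from standard Lua. This is a simplified illustration under stated assumptions: the console's help text says the script is executed like the main coroutine, which this sketch does not reproduce, and whether the reported time is CPU time or wall-clock time is not stated, so os.clock() is a guess. run_script is a hypothetical name.

```lua
-- Load and run a script file in protected mode and measure its CPU time.
-- Errors are returned as values, so a bad script cannot take the caller down.
local function run_script(path)
  local chunk, load_err = loadfile(path)
  if not chunk then
    return false, load_err
  end
  local started = os.clock()
  local ok, run_err = pcall(chunk)
  local elapsed = os.clock() - started
  if not ok then
    return false, run_err
  end
  return true, string.format("Total Running Time: %.3f", elapsed)
end

print(run_script("script/demo.lua"))
```

If the file is missing or contains a syntax error, the first result is false and the second is the error message.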

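But that is not enough! Sometimes we also need to know exactly how the script executed.

In that case, append one more argument and the call trace of the script is printed as well.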

>>> run script/demo.lua true
callstack traceback:
 └----> [OK] [NEXT LINE] [script/demo.lua:3]
 └----> [OK] [NEXT LINE] [script/demo.lua:7]
 └----> [OK] [NEXT LINE] [script/demo.lua:13]
 └----> [OK] [NEXT LINE] [script/demo.lua:15]
 └--------> [OK] [NEXT LINE] [script/demo.lua:11]
 └------------> [OK] [NEXT LINE] [script/demo.lua:2]
 └------------> [OK] [NEXT LINE] [script/demo.lua:3]
 └------------> [OK] [GOTO BACK] [script/demo.lua:3]
 └--------> [OK] [NEXT LINE] [script/demo.lua:12]
 └------------> [OK] [NEXT LINE] [script/demo.lua:6]
 └------------> [OK] [NEXT LINE] [script/demo.lua:7]
 └------------> [OK] [GOTO BACK] [script/demo.lua:7]
 └--------> [OK] [NEXT LINE] [script/demo.lua:13]
 └--------> [OK] [GOTO BACK] [script/demo.lua:13]
 └----> [OK] [GOTO BACK] [script/demo.lua:15]
 └----> [OK] [NEXT LINE] [3rd/debug/run.lua:83]
 └----> [OK] [NEXT LINE] [3rd/debug/run.lua:84]

Total Running Time: 0.000
Done.
>>>
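
A trace of this kind can be built on the standard debug.sethook function. The sketch below is not the library's implementation: it runs the function in its own coroutine, installs a hook on that coroutine only, and records each executed line indented by call depth. trace and the output format are assumptions made for illustration.

```lua
-- Run `fn` in a coroutine with a debug hook that records every executed line,
-- indented by call depth. Only code running inside the coroutine is traced.
local function trace(fn)
  local lines, depth = {}, 0
  local function hook(event, line)
    if event == "call" then
      depth = depth + 1
    elseif event == "return" then
      depth = depth - 1
    elseif event == "line" then
      -- Level 2 is the function that was running when the hook fired.
      local src = debug.getinfo(2, "S").short_src
      lines[#lines + 1] = string.rep("    ", depth - 1) .. src .. ":" .. line
    end
    -- A "tail call" reuses the caller's frame, so the depth stays unchanged.
  end
  local co = coroutine.create(fn)
  debug.sethook(co, hook, "crl")
  local ok, err = coroutine.resume(co)
  return ok, err, lines
end

local function inner()
  return 1
end

local function outer()
  local a = inner()
  return a
end

local ok, err, lines = trace(outer)
for _, entry in ipairs(lines) do
  print(entry)
end
```

Installing the hook on the coroutine rather than on the main thread keeps the rest of the process untraced, which matters when the code runs inside a live service.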

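4. Debugging the GC

Sometimes we want to perform special operations on the GC and observe how those changes affect the overall behavior of the service.

The following debug commands serve that purpose: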

>>> gc

gc [command] [args]:

  [count]   -  Let the garbage collector report memory usage.

  [step]    -  Let the garbage collector do a step garbage collection.

  [collect] -  Let the garbage collector do a full garbage collection.

  [start]   -  Let the garbage collector (re)start.

  [stop]    -  Let the garbage collector stop working.

  [mode]    -  Let the garbage change work mode(`incremental` or `generational`).

>>>

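These commands cover stopping, restarting, full collection, changing the working mode, and more. Together they let us debug the GC at runtime.

Note: use them with caution in performance-sensitive scenarios. Some operations may leave the process unable to serve external requests.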
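
Each of these sub-commands has a counterpart in the standard collectgarbage function. The table below is a sketch of that mapping under Lua 5.4, not the library's own code; the gc table is a hypothetical name. Note that the standard option behind start is called "restart".

```lua
-- Map the console's gc sub-commands onto collectgarbage options (Lua 5.4).
local gc = {
  count   = function() return collectgarbage("count") end,   -- memory in KB
  step    = function() return collectgarbage("step") end,    -- one GC step
  collect = function() return collectgarbage("collect") end, -- full cycle
  start   = function() return collectgarbage("restart") end,
  stop    = function() return collectgarbage("stop") end,
  mode    = function(name) -- "incremental" or "generational"
    assert(name == "incremental" or name == "generational", "unknown mode")
    return collectgarbage(name) -- returns the previous mode
  end,
}

gc.stop()
print(collectgarbage("isrunning"))
gc.start()
print(collectgarbage("isrunning"))
```

While the collector is stopped, memory grows until it is restarted or a collection is requested by hand, which is why the warning above applies.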

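Additional notes

The features demonstrated above are only the tip of the iceberg. For reasons of space, among others, we cannot show every feature here.

By combining these commands sensibly with their own scripts, developers can implement canary testing, hot fixes, hot updates, online debugging, and similar features.

Getting help

If you have any other questions, please ask in our discussion group.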
