CLOS make-instance非常慢,导致SBCL中的堆耗尽

 曾经沧海难为水文杰59552066 发布于 2022-12-09 20:47

我正在使用Common Lisp(64位Debian GNU/Linux中的SBCL 1.1.5)编写一个多体系结构汇编程序/反汇编程序,目前汇编程序为x86-64的子集生成正确的代码.为了组装x86-64汇编代码,我使用一个哈希表,其中汇编指令助记符(字符串)如"jc-rel8"和,"stosb"是返回一个或多个编码函数列表的键,如下所示:

(defparameter *emit-function-hash-table-x64* (make-hash-table :test 'equalp))
(setf (gethash "jc-rel8" *emit-function-hash-table-x64*) (list #'jc-rel8-x86))
(setf (gethash "stosb"   *emit-function-hash-table-x64*) (list #'stosb-x86))

编码函数就像这些(有些更复杂):

(defun jc-rel8-x86 (arg1 &rest args)
  (jcc-x64 #x72 arg1))

(defun stosb-x86 (&rest args)
  (list #xaa))

现在我试图通过使用转换为Common Lisp CLOS语法的NASM(NASM 2.11.06)指令编码数据(文件insns.dat)来合并完整的x86-64指令集.这意味着代替用于发射的二进制码(如上面的函数)与自定义的情况下正则函数x86-asm-instruction类(到目前为止一个非常基本的类,某些20个时隙:initarg,:reader,:initform等等),其中emit将用于带有参数的方法用于发出给定指令(助记符)和参数的二进制代码.转换后的指令数据看起来像这样(但它超过40'000行,正好是7193 make-instance和7193 setf).


;; first mnemonic + operand combination instances (:is-variant t).
;; there are 4928 such instances for x86-64 generated from NASM's insns.dat.

(eval-when (:compile-toplevel :load-toplevel :execute)

(setf Jcc-imm-near (make-instance 'x86-asm-instruction
:name "Jcc"
:operands "imm|near"
:code-string "[i: odf 0f 80+c rel]"
:arch-flags (list "386" "BND")
:is-variant t))

(setf STOSB-void (make-instance 'x86-asm-instruction
:name "STOSB"
:operands "void"
:code-string "[ aa]"
:arch-flags (list "8086")
:is-variant t))

;; then, container instances which contain (or could be refer to instead)
;; the possible variants of each instruction.
;; there are 2265 such instances for x86-64 generated from NASM's insns.dat.

(setf Jcc (make-instance 'x86-asm-instruction
                         :name "Jcc"
                         :is-container t
                         :variants (list Jcc-imm-near
                                         Jcc-imm64-near
                                         Jcc-imm-short
                                         Jcc-imm
                                         Jcc-imm
                                         Jcc-imm
                                         Jcc-imm)))

(setf STOSB (make-instance 'x86-asm-instruction
                           :name "STOSB"
                           :is-container t
                           :variants (list STOSB-void)))

;; thousands of objects more here...

) ; this bracket closes (eval-when (:compile-toplevel :load-toplevel :execute)

我已经insns.dat使用一个简单的Perl脚本将NASM转换为Common Lisp语法(如上所述)(下面进一步说明,但脚本本身并没有任何兴趣),原则上它可以工作.所以它可以工作,但编译这些7193对象真的很慢,通常会导致堆耗尽.在具有16G内存的Linux Core i7-2760QM笔记本电脑上,(eval-when (:compile-toplevel :load-toplevel :execute)使用上述7193个对象编译代码块需要7分钟以上,有时会导致堆耗尽,如下所示:

;; Swank started at port: 4005.
* Heap exhausted during garbage collection: 0 bytes available, 32 requested.
 Gen StaPg UbSta LaSta LUbSt Boxed Unboxed LB   LUB  !move  Alloc  Waste   Trig    WP  GCs Mem-age
   0:     0     0     0     0     0     0     0     0     0        0     0 41943040    0   0  0.0000
   1:     0     0     0     0     0     0     0     0     0        0     0 41943040    0   0  0.0000
   2:     0     0     0     0     0     0     0     0     0        0     0 41943040    0   0  0.0000
   3: 38805 38652     0     0 49474 15433   389   416     0 2144219760 9031056 1442579856    0   1  1.5255
   4: 127998 127996     0     0 45870 14828   106   143   199 1971682720 25428576  2000000    0   0  0.0000
   5:     0     0     0     0     0     0     0     0     0        0     0  2000000    0   0  0.0000
   6:     0     0     0     0  1178   163     0     0     0 43941888     0  2000000  985   0  0.0000
   Total bytes allocated    = 4159844368
   Dynamic-space-size bytes = 4194304000
GC control variables:
   *GC-INHIBIT* = true
   *GC-PENDING* = in progress
   *STOP-FOR-GC-PENDING* = false
fatal error encountered in SBCL pid 9994(tid 46912556431104):
Heap exhausted, game over.

Welcome to LDB, a low-level debugger for the Lisp runtime environment.
ldb>

我不得不--dynamic-space-size 4000为SBCL 添加参数以使其完全编译,但仍然在分配4千兆字节的动态空间后,堆有时会耗尽.即使堆耗尽了,只需在类中添加一个槽('x86-asm-instruction用于这些实例的类)后编译7193个实例超过7分钟对于REPL中的交互式开发来说太多了(我使用slimv,如果这很重要) .

这是(time (compile-file输出:

;   caught 18636 WARNING conditions


; insns.fasl written
; compilation finished in 0:07:11.329
Evaluation took:
  431.329 seconds of real time
  238.317000 seconds of total run time (234.972000 user, 3.345000 system)
  [ Run times consist of 6.073 seconds GC time, and 232.244 seconds non-GC time. ]
  55.25% CPU
  50,367 forms interpreted
  784,044 lambdas converted
  1,031,842,900,608 processor cycles
  19,402,921,376 bytes consed

使用OOP(CLOS)将允许合并指令助记符(例如jcstosb以上),指令的:name允许操作数(:operands),指令的二进制编码(例如#xaafor stosb,:code-string)和:arch-flags一个对象中指令的可能的体系结构限制().但似乎至少我3岁的计算机效率不足以快速编译大约7000个CLOS对象实例.

我的问题是:有没有办法使SBCL make-instance更快,或者我应该在常规函数中保持汇编代码生成,如上面的例子?我也很高兴知道任何其他可能的解决方案.

这是Perl脚本,以防万一:

#!/usr/bin/env perl
use strict;
use warnings;

# this program converts NASM's `insns.dat` to Common Lisp Object System (CLOS) syntax.

my $firstchar;
my $line_length;
my $are_there_square_brackets;
my $mnemonic_and_operands;
my $mnemonic;
my $operands;
my $code_string;
my $flags;
my $mnemonic_of_current_mnemonic_array;

my $clos_object_name;
my $clos_mnemonic;
my $clos_operands;
my $clos_code_string;
my $clos_flags;

my @object_name_array = ();
my @mnemonic_array = ();
my @operands_array = ();
my @code_string_array = ();
my @flags_array = ();

my @each_mnemonic_only_once_array = ();

my @instruction_variants_array = ();
my @instruction_variants_for_current_instruction_array = ();

open(FILE, 'insns.dat');

$mnemonic_of_current_mnemonic_array = "";

# read one line at once.
while ()
{
    $firstchar = substr($_, 0, 1);
    $line_length = length($_);
    $are_there_square_brackets = ($_ =~ /\[.*\]/);
    chomp;
    if (($line_length > 1) && ($firstchar =~ /[^\t ;]/))
    {
        if ($are_there_square_brackets)
        {
            ($mnemonic_and_operands, $code_string, $flags) = split /[\[\]]+/, $_;
            $code_string = "[" . $code_string . "]";
            ($mnemonic, $operands) = split /[\t ]+/, $mnemonic_and_operands;
        }
        else
        {
            ($mnemonic, $operands, $code_string, $flags) = split /[\t ]+/, $_;
        }
        $mnemonic =~ s/[\t ]+/ /g;
        $operands =~ s/[\t ]+/ /g;
        $code_string =~ s/[\t ]+/ /g;
        $flags =~ s/[\t ]+//g;

        # we don't want non-x86-64 instructions here.
        unless ($flags =~ "NOLONG")
        {
            # ok, the content of each field is now filtered,
            # let's convert them to a suitable Common Lisp format.
            $clos_object_name = $mnemonic . "-" . $operands;

            # in Common Lisp object names `|`, `,`, and `:` must be escaped with a backslash `\`,
            # but that would get too complicated.
            # so we'll simply replace them:
            # `|` -> `-`.
            # `,` -> `.`.
            # `:` -> `.`.
            $clos_object_name =~ s/\|/-/g;              
            $clos_object_name =~ s/,/./g;              
            $clos_object_name =~ s/:/./g;              

            $clos_mnemonic    = "\"" . $mnemonic . "\"";
            $clos_operands    = "\"" . $operands . "\"";
            $clos_code_string = "\"" . $code_string . "\"";

            $clos_flags = "\"" . $flags . "\"";        # add first and last double quotes.
            $clos_flags =~ s/,/" "/g;                  # make each flag its own Common Lisp string.
            $clos_flags = "(list " . $clos_flags. ")"; # convert to `list` syntax.

            push @object_name_array, $clos_object_name;
            push @mnemonic_array, $clos_mnemonic;
            push @operands_array, $clos_operands;
            push @code_string_array, $clos_code_string;
            push @flags_array, $clos_flags;

            if ($mnemonic eq $mnemonic_of_current_mnemonic_array)
            {
                # ok, same mnemonic as the previous one,
                # so the current object name goes to the list.
                push @instruction_variants_for_current_instruction_array, $clos_object_name;
            }
            else
            {
                # ok, this is a new mnemonic.
                # so we'll mark this as current mnemonic.
                $mnemonic_of_current_mnemonic_array = $mnemonic;
                push @each_mnemonic_only_once_array, $mnemonic;

                # we first push the old array (unless it's empty), then clear it,
                # and then push the current object name to the cleared array.

                if (@instruction_variants_for_current_instruction_array)
                {
                    # push the variants array, unless it's empty.
                    push @instruction_variants_array, [ @instruction_variants_for_current_instruction_array ];
                }
                @instruction_variants_for_current_instruction_array = ();
                push @instruction_variants_for_current_instruction_array, $clos_object_name;
            }
        }
    }
}

# the last instruction's instruction variants must be pushed too.
if (@instruction_variants_for_current_instruction_array)
{
    # push the variants array, unless it's empty.
    push @instruction_variants_array, [ @instruction_variants_for_current_instruction_array ];
}

close(FILE);

# these objects need be created already during compilation.
printf("(eval-when (:compile-toplevel :load-toplevel :execute)\n");

# print the code to create each instruction + operands combination object.

for (my $i=0; $i <= $#mnemonic_array; $i++)
{
    $clos_object_name = $object_name_array[$i];
    $mnemonic         = $mnemonic_array[$i];
    $operands         = $operands_array[$i];
    $code_string      = $code_string_array[$i];
    $flags            = $flags_array[$i];

    # print the code to create a variant object.
    # each object here is a variant of a single instruction (or a single mnemonic).
    # actually printed as 6 lines to make it easier to read (for us humans, I mean), with an empty line in the end.
    printf("(setf %s (make-instance 'x86-asm-instruction\n:name %s\n:operands %s\n:code-string %s\n:arch-flags %s\n:is-variant t))",
        $clos_object_name,
        $mnemonic,
        $operands,
        $code_string,
        $flags);
    printf("\n\n");
}

# print the code to create each instruction + operands combination object.

# for (my $i=0; $i <= $#each_mnemonic_only_once_array; $i++)
for my $i (0 .. $#instruction_variants_array)
{
    $mnemonic = $each_mnemonic_only_once_array[$i];

    # print the code to create a container object.
    printf("(setf %s (make-instance 'x86-asm-instruction :name \"%s\" :is-container t :variants (list \n", $mnemonic, $mnemonic);
    @instruction_variants_for_current_instruction_array = $instruction_variants_array[$i];

    # for (my $j=0; $j <= $#instruction_variants_for_current_instruction_array; $j++)
    for my $j (0 .. $#{$instruction_variants_array[$i]} )
    {
        printf("%s", $instruction_variants_array[$i][$j]);

        # print 3 closing brackets if this is the last variant.
        if ($j == $#{$instruction_variants_array[$i]})
        {
            printf(")))");
        }
        else
        {
            printf(" ");
        }
    }

    # if this is not the last instruction, print two newlines.
    if ($i < $#instruction_variants_array)
    {
        printf("\n\n");
    }
}

# print the closing bracket to close `eval-when`.
print(")");

exit;

Rainer Joswi.. 10

18636警告看起来非常糟糕,首先要删除所有警告.

我会从摆脱EVAL-WHEN所有这一切开始.对我来说没有多大意义.直接加载文件,或编译并加载文件.

另请注意,(setf STOSB-void ...)当变量未定义时,SBCL不喜欢.使用DEFVAR或引入了新的顶级变量DEFPARAMETER.SETF只是设置它们,但不定义它们.这应该有助于摆脱警告.

此外:is-container t,:is-variant t这些属性的气味应该转换为继承的类(例如作为mixin).容器有变种.变体没有变体.

1 个回答
  • 18636警告看起来非常糟糕,首先要删除所有警告.

    我会从摆脱EVAL-WHEN所有这一切开始.对我来说没有多大意义.直接加载文件,或编译并加载文件.

    另请注意,(setf STOSB-void ...)当变量未定义时,SBCL不喜欢.使用DEFVAR或引入了新的顶级变量DEFPARAMETER.SETF只是设置它们,但不定义它们.这应该有助于摆脱警告.

    此外:is-container t,:is-variant t这些属性的气味应该转换为继承的类(例如作为mixin).容器有变种.变体没有变体.

    2022-12-11 03:12 回答
撰写答案
今天,你开发时遇到什么问题呢?
立即提问
热门标签
PHP1.CN | 中国最专业的PHP中文社区 | PNG素材下载 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有