java - Java 优化器是否记忆计算值？

Question

哪个fizzbuzz实现更有效？

   public static void fizzBuzz1(int n)
    {
        boolean fizzed, buzzed;
        for(int i = 1; i <= n; i++)
        {
            fizzed = buzzed = false;
            if(i % 3 == 0)
                fizzed = true;
            if(i % 5 == 0)
                buzzed = true;
            if(fizzed  && buzzed)
                System.out.println("FizzBuzz");
            else if(fizzed)
                System.out.println("Fizz");
            else if(buzzed)
                System.out.println("Buzz");
            else
                System.out.println(i);
        }
    }

    public static void fizzBuzz2(int n)
    {
        for(int i = 1; i <= n; i++)
        {
            if(i % 3 == 0  && i % 5 == 0)
                System.out.println("FizzBuzz");
            else if(i % 3 == 0)
                System.out.println("Fizz");
            else if(i % 5 == 0)
                System.out.println("Buzz");
            else
                System.out.println(i);
        }
    }

score 12 · Accepted Answer

This one took some interesting turns.

First I tried to have a look at the assembly that is generated with the original method. However, the JIT does quite some inlining and optimizations, and this includes the System.out.println calls, so the resulting assembly output was too huge (for me) to sensibly analyze it in reasonable time.

So I simplified the whole thing, in order to be able to focus on the actual question. Finally, I ran the following program:

class Test04
{
    public static void main(String args[])
    {
        long sum = 0;
        for (int i=1000; i<12000; i++)
        {
            sum += fizzBuzz1(i);
            sum += fizzBuzz2(i);
        }
        System.out.println(sum);
    }

    public static long fizzBuzz1(int n)
    {
        long sum = 0;
        for(int i = 1; i <= n; i++)
        {
            sum += fizzBuzzCore1(i);
        }
        return sum;
    }

    public static long fizzBuzzCore1(int i)
    {
        boolean fizzed = false;
        boolean buzzed = false;
        if(i % 3 == 0)
            fizzed = true;
        if(i % 5 == 0)
            buzzed = true;
        if(fizzed  && buzzed)
            return 4;
        else if(fizzed)
            return 3;
        else if(buzzed)
            return 2;
        else
            return 1;
    }


    public static long fizzBuzz2(int n)
    {
        long sum = 0;
        for(int i = 1; i <= n; i++)
        {
            sum += fizzBuzzCore2(i);
        }
        return sum;
    }

    public static long fizzBuzzCore2(int i)
    {
        if(i % 3 == 0  && i % 5 == 0)
            return 4;
        else if(i % 3 == 0)
            return 3;
        else if(i % 5 == 0)
            return 2;
        else
            return 1;
    }

}

The return values are intended to prevent him from optimizing away the calls completely, and extracting the "core" methods aimed at keeping the size of the assembly output that has to be compared as small as possible.

(Note: Of course, these modifications may affect the optimization. For example, the JIT has a limit of the number of bytecode instructions that a method may have before it is considered to be too large to be inlined, -XX:MaxInlineSize=35. But it the effects should roughly be the same on both methods, so it should still be possible to derive the desired information about the actual question).

And, not such a great surprise: After the last optimization, the assembly code for both methods contains equal instructions - here the assembly for fizzBuzzCore1 as a reference:

Decoding compiled method 0x00000000026c0090:
Code:
[Entry Point]
[Verified Entry Point]
[Constants]
  # {method} {0x0000000057260528} &apos;fizzBuzzCore1&apos; &apos;(I)J&apos; in &apos;Test04&apos;
  # parm0:    rdx       = int
  #           [sp+0x20]  (sp of caller)
  0x00000000026c01c0: sub    $0x18,%rsp
  0x00000000026c01c7: mov    %rbp,0x10(%rsp)    ;*synchronization entry
                                                ; - Test04::fizzBuzzCore1@-1 (line 27)

  0x00000000026c01cc: movslq %edx,%r10
  0x00000000026c01cf: mov    %edx,%r11d
  0x00000000026c01d2: sar    $0x1f,%r11d        ;*irem
                                                ; - Test04::fizzBuzzCore1@6 (line 29)

  0x00000000026c01d6: imul   $0x66666667,%r10,%r8
  0x00000000026c01dd: imul   $0x55555556,%r10,%r10
  0x00000000026c01e4: sar    $0x21,%r8
  0x00000000026c01e8: sar    $0x20,%r10
  0x00000000026c01ec: mov    %r8d,%r8d
  0x00000000026c01ef: sub    %r11d,%r8d         ;*irem
                                                ; - Test04::fizzBuzzCore1@14 (line 31)

  0x00000000026c01f2: mov    %r10d,%r10d
  0x00000000026c01f5: sub    %r11d,%r10d        ;*irem
                                                ; - Test04::fizzBuzzCore1@6 (line 29)

  0x00000000026c01f8: mov    %r8d,%r11d
  0x00000000026c01fb: shl    $0x2,%r11d
  0x00000000026c01ff: add    %r8d,%r11d         ;*irem
                                                ; - Test04::fizzBuzzCore1@14 (line 31)

  0x00000000026c0202: mov    %r10d,%r9d
  0x00000000026c0205: shl    %r9d
  0x00000000026c0208: add    %r10d,%r9d         ;*irem
                                                ; - Test04::fizzBuzzCore1@6 (line 29)

  0x00000000026c020b: cmp    %r9d,%edx
  0x00000000026c020e: jne    0x00000000026c021c  ;*ifeq
                                                ; - Test04::fizzBuzzCore1@21 (line 33)

  0x00000000026c0210: cmp    %r11d,%edx
  0x00000000026c0213: jne    0x00000000026c021c  ;*ifeq
                                                ; - Test04::fizzBuzzCore1@25 (line 33)

  0x00000000026c0215: mov    $0x4,%eax
  0x00000000026c021a: jmp    0x00000000026c0239  ;*iload_1
                                                ; - Test04::fizzBuzzCore1@32 (line 35)

  0x00000000026c021c: cmp    %r9d,%edx
  0x00000000026c021f: jne    0x00000000026c0228  ;*ifeq
                                                ; - Test04::fizzBuzzCore1@33 (line 35)

  0x00000000026c0221: mov    $0x3,%eax
  0x00000000026c0226: jmp    0x00000000026c0239
  0x00000000026c0228: cmp    %r11d,%edx
  0x00000000026c022b: jne    0x00000000026c0234  ;*ifeq
                                                ; - Test04::fizzBuzzCore1@41 (line 37)

  0x00000000026c022d: mov    $0x2,%eax
  0x00000000026c0232: jmp    0x00000000026c0239
  0x00000000026c0234: mov    $0x1,%eax          ;*irem
                                                ; - Test04::fizzBuzzCore1@6 (line 29)

  0x00000000026c0239: add    $0x10,%rsp
  0x00000000026c023d: pop    %rbp
  0x00000000026c023e: test   %eax,-0x2470244(%rip)        # 0x0000000000250000
                                                ;   {poll_return}
  0x00000000026c0244: retq   
  0x00000000026c0245: hlt    
  0x00000000026c0246: hlt    
  0x00000000026c0247: hlt    
  0x00000000026c0248: hlt    
  0x00000000026c0249: hlt    
  0x00000000026c024a: hlt    
  0x00000000026c024b: hlt    
  0x00000000026c024c: hlt    
  0x00000000026c024d: hlt    
  0x00000000026c024e: hlt    
  0x00000000026c024f: hlt    
  0x00000000026c0250: hlt    
  0x00000000026c0251: hlt    
  0x00000000026c0252: hlt    
  0x00000000026c0253: hlt    
  0x00000000026c0254: hlt    
  0x00000000026c0255: hlt    
  0x00000000026c0256: hlt    
  0x00000000026c0257: hlt    
  0x00000000026c0258: hlt    
  0x00000000026c0259: hlt    
  0x00000000026c025a: hlt    
  0x00000000026c025b: hlt    
  0x00000000026c025c: hlt    
  0x00000000026c025d: hlt    
  0x00000000026c025e: hlt    
  0x00000000026c025f: hlt    
[Exception Handler]
[Stub Code]
  0x00000000026c0260: jmpq   0x000000000261c560  ;   {no_reloc}
[Deopt Handler Code]
  0x00000000026c0265: callq  0x00000000026c026a
  0x00000000026c026a: subq   $0x5,(%rsp)
  0x00000000026c026f: jmpq   0x00000000025f6f40  ;   {runtime_call}
  0x00000000026c0274: hlt    
  0x00000000026c0275: hlt    
  0x00000000026c0276: hlt    
  0x00000000026c0277: hlt

But...

... what may be surprising, however, is: It does not compute a modulo operation at all!

At least, not explicitly: There is no idiv instruction appearing in this code! So the JIT really tries hard to avoid the costly divisions, by doing some nasty, nasty bit twiddling hacks: The instructions

  0x00000000026c01d6: imul   $0x66666667,%r10,%r8
  0x00000000026c01dd: imul   $0x55555556,%r10,%r10
  0x00000000026c01e4: sar    $0x21,%r8
  0x00000000026c01e8: sar    $0x20,%r10
  (and following...)

are a "division-free" implementation of a division. For example, the method

private static int divideBy3(int n)
{
    long r10 = n;
    r10 *= 0x55555556L;
    r10 >>>= 0x20;
    long r10d = r10 & 0xFFFFFFFFL;
    return (int)r10d;
}

uses these magic constants and shifts to compute a division by 3 (similarly, for 5 with the other constant). I did not do the math myself, but an explaination of how the modulo operation is derived from that can be found at Page 32 of the "INTEGER DIVISION BY CONSTANTS" document from Hacker's Delight.

score 2 · Accepted Answer

如果您忽略成本，您可以使用一个开关println

public static void fizzBuzz(int n) {
    for(int i = 1; i <= n; i++) {
         switch(i % 15) {
             case 0: 
                 System.out.println("FizzBuzz"); 
                 break;
             case 3: case 6: case 9: case 12: 
                 System.out.println("Fizz"); 
                 break;
             case 5: case 10: 
                 System.out.println("Buzz"); 
                 break;
             default:
                 System.out.println(i);
                 break;
         }
    }
}

java - Java 优化器是否记忆计算值？

2 回答 2

But...

Related

Reference