| # | Model | Date | Type | Source | Params | OCRGenScore ↑ | T2I (VIEScore) ↑ | T2I (AR) ↑ | Edit (1-LPIPS) ↑ | Edit (AR) ↑ | Dewarping (DD) ↑ | Deshadow (MSSSIM) ↑ | Deblur (MSSSIM) ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Closed-Source Models | |||||||||||||
| 1 | Nano Banana Pro | 2025.11 | Unified U&G | Closed | – | 77.19 | 92.24 | 76.96 | 85.22 | 71.46 | 42.52 | 85.93 | 61.37 |
| 3 | Seedream 4.5 | 2025.12 | Specialized | Closed | – | 63.35 | 90.45 | 74.09 | 65.75 | 45.19 | 23.85 | 57.99 | 55.09 |
| 9 | GPT Image 1.5 | 2025.03 | Specialized | Closed | – | 54.00 | 93.41 | 68.72 | 57.36 | 42.73 | 29.55 | 36.10 | 32.09 |
| Open-Source · Unified Understanding & Generation | |||||||||||||
| 5 | BAGEL | 2025.05 | Unified U&G | Open | 14B | 59.11 | 57.08 | 14.95 | 87.03 | 15.07 | 20.80 | 76.18 | 76.27 |
| 8 | OmniGen2 | 2025.06 | Unified U&G | Open | 7B | 54.24 | 59.96 | 26.37 | 65.83 | 13.82 | 21.21 | 64.26 | 65.17 |
| 11 | InternVL-U | 2026.03 | Unified U&G | Open | 4B | 43.64 | 66.25 | 44.17 | 53.87 | 17.21 | 28.69 | 49.45 | 30.80 |
| 13 | Janus-4o | 2025.06 | Unified U&G | Open | 7B | 29.58 | 53.59 | 13.63 | 45.53 | 5.80 | 24.88 | 34.28 | 39.33 |
| 15 | ILLUME+ | 2025.04 | Unified U&G | Open | 7B | 28.39 | 43.15 | 5.46 | 42.71 | 2.68 | 27.21 | 42.78 | 38.28 |
| Open-Source · Specialized Generation | |||||||||||||
| 2 | FLUX.2-dev | 2025.11 | Specialized | Open | 32B | 70.19 | 88.88 | 66.37 | 83.92 | 41.56 | 41.41 | 67.97 | 71.87 |
| 4 | LongCat-Image | 2025.12 | Specialized | Open | 6B | 66.39 | 84.02 | 67.51 | 85.56 | 51.53 | 28.04 | 72.12 | 56.62 |
| 6 | FLUX.2-Klein-9B | 2026.01 | Specialized | Open | 9B | 59.28 | 82.84 | 39.63 | 81.18 | 31.10 | 24.44 | 57.98 | 56.74 |
| 7 | Qwen-Image | 2025.12 | Specialized | Open | 20B | 56.29 | 84.41 | 65.21 | 65.65 | 41.20 | 25.14 | 50.05 | 38.81 |
| 10 | GLM-Image | 2026.01 | Specialized | Open | 9B | 50.12 | 83.53 | 69.36 | 70.97 | 21.54 | 24.99 | 43.15 | 41.13 |
| 12 | FLUX.1-Kontext-dev | 2025.06 | Specialized | Open | 12B | 36.51 | 39.58 | 21.69 | 53.76 | 15.13 | 24.30 | 30.36 | 30.80 |
| 14 | SD-3.5-Large | 2024.10 | Specialized | Open | 8B | 29.53 | 50.94 | 27.51 | 47.43 | 5.77 | 30.99 | 29.07 | 32.64 |
| # | Model | OCRGenScore ↑ | Text Removal: Handwriting (MSSSIM) ↑ | Text Removal: Scene Text (MSSSIM) ↑ | Artistic Text Style Transfer (VIEScore) ↑ | Artistic Text Style Transfer (AR) ↑ | Hist. Doc. Style Transfer (VIEScore) ↑ | Hist. Doc. Rest. (1-LPIPS) ↑ | Scene Text SR (MSSSIM) ↑ | Layout-Aware Text Gen. (VIEScore) ↑ | Layout-Aware Text Gen. (AR) ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Closed-Source Models | |||||||||||
| 1 | Nano Banana Pro | 77.19 | 84.16 | 91.66 | 78.27 | 89.00 | 77.66 | 74.15 | 64.67 | 88.87 | 100.00 |
| 9 | GPT Image 1.5 | 54.00 | 47.59 | 51.38 | 87.16 | 94.22 | 80.57 | 46.18 | 23.23 | 85.69 | 97.77 |
| 3 | Seedream 4.5 | 64.69 | 69.97 | 83.93 | 81.52 | 99.34 | 78.07 | 59.88 | 42.02 | 54.50 | 95.00 |
| Open-Source · Unified Understanding & Generation | |||||||||||
| 5 | BAGEL | 59.11 | 80.23 | 79.78 | 33.19 | 12.80 | 44.49 | 73.71 | 87.67 | 75.31 | 32.99 |
| 8 | OmniGen2 | 54.24 | 47.28 | 82.11 | 35.17 | 47.59 | 64.66 | 50.64 | 83.75 | 72.58 | 43.33 |
| 11 | InternVL-U | 43.64 | 57.19 | 61.13 | 53.41 | 59.99 | 20.75 | 37.45 | 17.74 | 53.94 | 85.83 |
| 15 | ILLUME+ | 28.39 | 48.93 | 68.42 | 0.50 | 2.26 | 1.00 | 25.84 | 21.88 | 7.67 | 3.29 |
| 13 | Janus-4o | 29.58 | 43.69 | 35.77 | 25.08 | 8.17 | 16.53 | 28.60 | 19.05 | 41.55 | 17.32 |
| Open-Source · Specialized Generation | |||||||||||
| 7 | Qwen-Image | 56.29 | 49.22 | 44.56 | 70.63 | 95.47 | 72.53 | 59.48 | 27.65 | 84.44 | 98.93 |
| 4 | LongCat-Image | 66.39 | 78.11 | 89.56 | 69.22 | 94.27 | 69.74 | 66.66 | 24.37 | 83.57 | 96.14 |
| 14 | SD-3.5-Large | 29.53 | 44.64 | 26.11 | 56.64 | 44.01 | 0.00 | 42.79 | 17.37 | 18.52 | 4.84 |
| 12 | FLUX.1-Kontext-dev | 36.51 | 58.34 | 31.87 | 58.66 | 52.59 | 52.80 | 43.24 | 22.49 | 49.00 | 28.51 |
| 2 | FLUX.2-dev | 70.19 | 78.75 | 93.96 | 79.86 | 91.32 | 68.48 | 74.62 | 52.78 | 76.53 | 44.61 |
| 6 | FLUX.2-Klein-9B | 59.28 | 65.71 | 80.59 | 78.48 | 93.39 | 67.48 | 63.81 | 46.46 | 74.05 | 41.61 |
| 10 | GLM-Image | 50.12 | 64.61 | 16.21 | 59.77 | 51.90 | 77.81 | 67.98 | 11.80 | 80.81 | 66.67 |
| # | Model | Document: Modern ↑ | Document: Historical ↑ | Handwriting ↑ | Scene Text ↑ | Artistic Text ↑ | Layout-Rich: Slide ↑ | Layout-Rich: Poster ↑ | Layout-Rich: Layout-Aware ↑ |
|---|---|---|---|---|---|---|---|---|---|
| Closed-Source Models | |||||||||
| 1 | Nano Banana Pro | 70.87 | 71.37 | 78.28 | 84.67 | 89.11 | 92.86 | 80.85 | 94.43 |
| 3 | Seedream 4.5 | 48.74 | 61.82 | 60.30 | 70.34 | 89.07 | 77.20 | 74.23 | 74.75 |
| 9 | GPT Image 1.5 | 36.31 | 56.47 | 57.16 | 60.37 | 86.26 | 80.67 | 63.21 | 91.73 |
| Open-Source · Unified Understanding & Generation | |||||||||
| 5 | BAGEL | 56.33 | 47.63 | 55.47 | 63.83 | 40.25 | 47.79 | 44.91 | 54.15 |
| 8 | OmniGen2 | 45.97 | 46.84 | 40.66 | 63.81 | 47.45 | 42.84 | 42.59 | 57.96 |
| 11 | InternVL-U | 36.83 | 29.96 | 47.58 | 49.94 | 57.94 | 52.56 | 41.60 | 69.88 |
| 13 | Janus-4o | 31.75 | 25.88 | 33.28 | 33.51 | 29.09 | 45.85 | 26.94 | 29.43 |
| 15 | ILLUME+ | 33.87 | 19.48 | 32.55 | 36.05 | 13.40 | 34.90 | 21.57 | 5.48 |
| Open-Source · Specialized Generation | |||||||||
| 2 | FLUX.2-dev | 60.17 | 64.36 | 69.75 | 75.50 | 87.04 | 80.17 | 70.68 | 80.52 |
| 4 | LongCat-Image | 54.79 | 63.47 | 70.92 | 72.21 | 85.95 | 79.04 | 82.48 | 89.86 |
| 6 | FLUX.2-Klein-9B | 47.85 | 56.55 | 60.73 | 66.89 | 72.56 | 68.01 | 57.04 | 57.83 |
| 7 | Qwen-Image | 45.22 | 56.37 | 52.12 | 58.59 | 87.26 | 75.63 | 70.40 | 91.68 |
| 10 | GLM-Image | 41.42 | 62.62 | 57.83 | 49.74 | 68.66 | 66.84 | 64.12 | 73.74 |
| 12 | FLUX.1-Kontext-dev | 31.10 | 35.72 | 33.39 | 32.38 | 52.95 | 38.09 | 27.40 | 38.76 |
| 14 | SD-3.5-Large | 30.11 | 22.86 | 33.23 | 30.56 | 47.90 | 43.93 | 36.96 | 11.68 |
| # | Model | OCRGenScore (T2I) ↑ | Hist. Doc. (VIEScore) ↑ | Hist. Doc. (AR) ↑ | Handwriting (VIEScore) ↑ | Handwriting (AR) ↑ | Scene Text (VIEScore) ↑ | Scene Text (AR) ↑ | Artistic Text (VIEScore) ↑ | Artistic Text (AR) ↑ | Slide (VIEScore) ↑ | Poster (VIEScore) ↑ | Poster (AR) ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Closed-Source Models | |||||||||||||
| 1 | Nano Banana Pro | 85.53 | 92.02 | 61.76 | 91.36 | 74.35 | 94.06 | 90.89 | 90.91 | 91.32 | 94.82 | 91.87 | 58.22 |
| 2 | Seedream 4.5 | 83.83 | 89.60 | 53.38 | 89.48 | 53.92 | 90.81 | 84.58 | 90.98 | 95.42 | 90.72 | 92.50 | 83.87 |
| 3 | GPT Image 1.5 | 82.63 | 92.35 | 49.63 | 94.10 | 67.74 | 93.20 | 83.19 | 92.37 | 77.29 | 94.25 | 93.62 | 59.43 |
| Open-Source · Unified Understanding & Generation | |||||||||||||
| 11 | InternVL-U | 58.44 | 58.26 | 10.27 | 56.66 | 40.59 | 69.78 | 80.09 | 83.89 | 46.83 | 76.11 | — | — |
| 12 | OmniGen2 | 45.58 | 73.25 | 18.98 | 59.34 | 24.48 | 63.12 | 32.71 | 58.37 | 24.57 | 48.42 | 72.77 | 12.08 |
| 14 | Janus-4o | 38.79 | 72.78 | 10.01 | 49.22 | 14.59 | 53.66 | 15.70 | 42.48 | 19.20 | 62.31 | 62.30 | 0.07 |
| 15 | BAGEL | 37.34 | 54.34 | 8.59 | 54.11 | 16.05 | 65.46 | 9.53 | 55.08 | 25.76 | 43.69 | 70.40 | 1.40 |
| 17 | ILLUME+ | 27.26 | 59.42 | 8.37 | 43.87 | 7.81 | 51.07 | 2.72 | 24.62 | 4.12 | 36.13 | 52.72 | 0.14 |
| 18 | Show-o2* | 19.44 | 28.71 | 5.01 | 25.83 | 11.26 | 24.79 | 5.47 | 35.14 | 23.67 | 20.07 | 33.22 | 0.07 |
| Open-Source · Specialized Generation | |||||||||||||
| 4 | FLUX.2-dev | 79.28 | 85.59 | 31.41 | 81.89 | 29.94 | 79.47 | 99.99 | 85.70 | 92.47 | 81.14 | 89.75 | 67.62 |
| 5 | LongCat-Image | 78.92 | 82.91 | 42.03 | 78.30 | 51.74 | 87.47 | 95.99 | 88.55 | 88.05 | 83.82 | 88.41 | 75.96 |
| 6 | GLM-Image | 77.95 | 79.85 | 60.99 | 77.95 | 64.16 | 88.51 | 82.17 | 88.64 | 75.77 | 78.70 | 90.49 | 69.48 |
| 7 | Qwen-Image | 76.03 | 87.40 | 26.36 | 60.28 | 88.00 | 77.76 | 99.53 | 86.49 | 95.47 | 85.91 | 84.32 | 58.30 |
| 8 | Z-Image* | 74.46 | 66.22 | 57.31 | 68.52 | 72.78 | 75.87 | 93.08 | 77.29 | 89.80 | 66.44 | 79.48 | 77.30 |
| 9 | Ovis-Image* | 69.02 | 52.45 | 52.75 | 55.30 | 54.00 | 76.67 | 85.82 | 77.38 | 92.70 | 57.14 | 75.97 | 90.87 |
| 10 | FLUX.2-Klein-9B | 59.28 | 82.84 | 39.63 | 88.91 | 31.10 | 80.47 | 84.11 | 83.86 | 45.04 | 51.34 | 51.37 | 37.23 |
| 13 | SD-3.5-Large | 43.36 | 52.39 | 7.40 | 41.20 | 19.86 | 47.48 | 42.44 | 59.53 | 43.76 | 58.06 | 64.30 | 25.70 |
| 16 | FLUX.1-Kontext-dev | 31.78 | 37.21 | 9.55 | 36.20 | 21.25 | 40.15 | 29.01 | 39.80 | 33.69 | 39.25 | 51.98 | 3.96 |
| # | Model | OCRGenScore (Edit) ↑ | Modern Doc. (1-LPIPS) ↑ | Modern Doc. (AR) ↑ | Historical Doc. (1-LPIPS) ↑ | Historical Doc. (AR) ↑ | Handwriting (1-LPIPS) ↑ | Handwriting (AR) ↑ | Scene Text (1-LPIPS) ↑ | Scene Text (AR) ↑ | Artistic Text (1-LPIPS) ↑ | Artistic Text (AR) ↑ | Slide (1-LPIPS) ↑ | Slide (AR) ↑ | Poster (1-LPIPS) ↑ | Poster (AR) ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Closed-Source Models | ||||||||||||||||
| 2 | Nano Banana Pro | 80.30 | 88.15 | 80.28 | 74.66 | 38.87 | 77.42 | 58.21 | 92.03 | 73.36 | 92.15 | 94.00 | 85.92 | 96.90 | 93.50 | 79.80 |
| 6 | Seedream 4.5 | 56.74 | 65.14 | 41.20 | 60.95 | 13.77 | 53.77 | 24.48 | 68.67 | 45.31 | 67.15 | 100.00 | 64.06 | 64.31 | 59.84 | 69.71 |
| 7 | GPT Image 1.5 | 52.32 | 55.66 | 28.30 | 48.73 | 8.80 | 54.08 | 31.89 | 59.92 | 45.78 | 67.45 | 65.31 | 60.28 | 67.98 | 55.36 | 44.42 |
| Open-Source · Unified Understanding & Generation | ||||||||||||||||
| 8 | BAGEL | 50.57 | 87.58 | 9.00 | 79.45 | 2.50 | 86.27 | 15.95 | 90.55 | 20.66 | 87.63 | 17.63 | 83.69 | 20.09 | 92.57 | 15.28 |
| 11 | OmniGen2 | 39.86 | 61.76 | 9.32 | 68.48 | 3.66 | 62.11 | 11.90 | 73.55 | 10.21 | 74.24 | 24.66 | 49.88 | 24.11 | 51.50 | 6.13 |
| 13 | InternVL-U | 35.03 | 51.20 | 5.59 | 43.98 | 10.75 | 58.54 | 15.32 | 58.50 | 21.29 | 58.39 | 45.15 | 44.29 | 13.72 | 49.59 | 14.15 |
| 14 | Janus-4o | 28.09 | 43.82 | 2.42 | 32.06 | 1.92 | 44.59 | 3.89 | 47.29 | 6.24 | 52.61 | 27.68 | 56.12 | 2.68 | 43.76 | 0.82 |
| 16 | ILLUME+ | 22.68 | 39.25 | 1.25 | 41.83 | 1.19 | 41.46 | 3.89 | 61.53 | 3.46 | 48.29 | 0.62 | 63.53 | 3.78 | 42.74 | 0.69 |
| Open-Source · Specialized Generation | ||||||||||||||||
| 1 | FireRed-Image-Edit-v1.1* | 81.27 | 89.75 | 60.08 | 80.46 | 44.70 | 90.68 | 76.18 | 91.97 | 66.97 | 21.18 | 93.26 | 88.02 | 98.66 | 96.19 | 79.47 |
| 3 | LongCat-Image | 70.97 | 84.35 | 29.83 | 76.75 | 32.06 | 86.55 | 52.73 | 87.56 | 54.12 | 83.62 | 91.95 | 83.81 | 64.72 | 92.98 | 72.66 |
| 4 | FLUX.2-dev | 64.14 | 83.53 | 34.86 | 74.61 | 13.08 | 80.93 | 31.32 | 87.57 | 48.38 | 87.57 | 89.77 | 84.51 | 56.67 | 91.28 | 33.86 |
| 5 | Qwen-Image | 57.68 | 67.63 | 33.60 | 63.37 | 21.50 | 51.40 | 25.06 | 72.08 | 58.58 | 85.95 | 72.33 | 59.48 | 27.65 | 84.44 | 98.93 |
| 9 | FLUX.2-Klein-9B | 48.32 | 76.15 | 16.35 | 67.51 | 5.09 | 59.92 | 15.77 | 82.84 | 24.61 | 67.98 | 47.48 | 75.98 | 33.96 | 79.19 | 16.77 |
| 10 | GLM-Image | 46.72 | 41.42 | 14.42 | 54.20 | 14.42 | 65.15 | 44.39 | 59.40 | 77.81 | 67.98 | 47.48 | 78.31 | 84.08 | 58.31 | 74.23 |
| 12 | FLUX.1-Kontext-dev | 35.48 | 51.49 | 6.82 | 45.34 | 10.75 | 54.66 | 15.52 | 62.04 | 11.27 | 62.81 | 20.97 | 57.29 | 16.53 | 52.86 | 5.39 |
| 15 | SD-3.5-Large | 26.74 | 35.32 | 4.45 | 37.15 | 0.38 | 43.19 | 5.87 | 42.86 | 5.56 | 64.87 | 18.46 | 59.60 | 0.00 | 51.50 | 6.13 |
| # | Model | OCRGenScore ↑ | T2I (VIEScore) ↑ | T2I (AR) ↑ | Edit (1-LPIPS) ↑ | Edit (AR) ↑ | Dewarping (DD) ↑ | Deshadow (MSSSIM) ↑ | Deblur (MSSSIM) ↑ |
|---|---|---|---|---|---|---|---|---|---|
| Closed-Source Models | |||||||||
| 1 | Nano Banana Pro | 76.02 | 91.61 | 82.27 | 83.43 | 67.96 | 43.99 | 93.82 | 60.26 |
| 4 | Seedream 4.5 | 63.55 | 91.09 | 66.89 | 61.87 | 40.88 | 21.45 | 83.87 | 49.51 |
| 8 | GPT Image 1.5 | 53.04 | 93.50 | 69.24 | 55.35 | 34.08 | 31.69 | 34.34 | 33.60 |
| Open-Source · Unified Understanding & Generation | |||||||||
| 7 | BAGEL | 55.87 | 53.09 | 2.32 | 86.09 | 8.97 | 19.25 | 74.19 | 59.08 |
| 9 | OmniGen2 | 47.82 | 51.27 | 0.87 | 63.12 | 4.99 | 19.56 | 68.38 | 46.71 |
| 11 | InternVL-U | 43.88 | 70.18 | 44.30 | 50.93 | 9.34 | 23.52 | 64.68 | 33.54 |
| 12 | ILLUME+ | 29.17 | 53.17 | 0.56 | 42.34 | 1.69 | 22.80 | 44.82 | 42.50 |
| 15 | Janus-4o | 25.50 | 44.85 | 0.19 | 44.02 | 1.17 | 25.94 | 31.45 | 39.39 |
| Open-Source · Specialized Generation | |||||||||
| 2 | FLUX.2-dev | 69.30 | 89.60 | 58.42 | 81.81 | 28.48 | 37.52 | 81.94 | 61.19 |
| 3 | LongCat-Image | 65.97 | 87.39 | 68.92 | 85.83 | 52.70 | 28.00 | 79.45 | 59.62 |
| 5 | FLUX.2-Klein-9B | 56.53 | 80.47 | 11.39 | 78.95 | 12.37 | 23.16 | 83.86 | 51.34 |
| 6 | Qwen-Image | 56.44 | 88.11 | 68.43 | 60.69 | 33.06 | 19.27 | 60.62 | 42.47 |
| 10 | GLM-Image | 46.72 | 87.98 | 63.47 | 70.68 | 13.82 | 21.44 | 25.73 | 44.02 |
| 13 | FLUX.1-Kontext-dev | 28.72 | 3.85 | 0.22 | 53.46 | 2.90 | 19.78 | 26.00 | 34.36 |
| 14 | SD-3.5-Large | 26.74 | 22.55 | 0.60 | 45.29 | 1.01 | 31.62 | 22.71 | 37.63 |
| # | Model | OCRGenScore ↑ | Text Removal: Handwriting (MSSSIM) ↑ | Text Removal: Scene Text (MSSSIM) ↑ | Artistic Text Style Transfer (VIEScore) ↑ | Artistic Text Style Transfer (AR) ↑ | Hist. Doc. Style Transfer (VIEScore) ↑ | Hist. Doc. Rest. (1-LPIPS) ↑ | Scene Text SR (MSSSIM) ↑ | Layout-Aware Text Gen. (VIEScore) ↑ | Layout-Aware Text Gen. (AR) ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Closed-Source Models | |||||||||||
| 1 | Nano Banana Pro | 76.02 | 84.16 | 93.53 | 64.21 | 79.23 | 77.66 | 74.15 | 52.98 | 88.02 | 100.00 |
| 8 | GPT Image 1.5 | 53.04 | 47.59 | 50.75 | 81.63 | 90.11 | 80.57 | 46.18 | 19.68 | 86.05 | 95.30 |
| 4 | Seedream 4.5 | 63.55 | 69.97 | 88.99 | 76.60 | 99.23 | 78.07 | 59.88 | 29.30 | 55.26 | 89.47 |
| Open-Source · Unified Understanding & Generation | |||||||||||
| 7 | BAGEL | 55.87 | 80.23 | 87.20 | 12.74 | 1.30 | 44.49 | 73.71 | 88.05 | 82.51 | 1.74 |
| 9 | OmniGen2 | 47.82 | 47.28 | 83.19 | 5.66 | 6.52 | 64.66 | 50.64 | 82.67 | 67.37 | 0.00 |
| 11 | InternVL-U | 43.88 | 57.19 | 53.47 | 44.41 | 57.21 | 20.75 | 37.45 | 19.48 | 62.28 | 83.30 |
| 12 | ILLUME+ | 29.17 | 48.93 | 71.65 | 0.00 | 0.00 | 1.00 | 25.84 | 16.19 | 3.08 | 0.00 |
| 15 | Janus-4o | 28.50 | 43.69 | 37.33 | 0.00 | 0.00 | 16.53 | 28.60 | 16.21 | 32.96 | 0.00 |
| Open-Source · Specialized Generation | |||||||||||
| 6 | Qwen-Image | 56.44 | 49.22 | 38.52 | 68.27 | 96.01 | 72.53 | 59.48 | 27.24 | 82.44 | 99.00 |
| 3 | LongCat-Image | 65.97 | 78.11 | 90.32 | 58.02 | 91.30 | 69.74 | 66.66 | 18.05 | 82.47 | 92.90 |
| 14 | SD-3.5-Large | 26.74 | 44.64 | 25.62 | 48.05 | 3.90 | 0.00 | 42.79 | 18.16 | 17.93 | 0.00 |
| 13 | FLUX.1-Kontext-dev | 28.52 | 58.34 | 27.83 | 37.85 | 3.96 | 52.80 | 43.24 | 21.34 | 22.61 | 0.00 |
| 2 | FLUX.2-dev | 69.30 | 78.75 | 93.96 | 73.96 | 82.64 | 68.48 | 74.62 | 33.10 | 75.85 | 36.57 |
| 5 | FLUX.2-Klein-9B | 56.53 | 65.71 | 92.63 | 67.54 | 86.79 | 67.48 | 63.81 | 44.49 | 75.16 | 10.47 |
| 10 | GLM-Image | 46.72 | 64.61 | 18.00 | 54.20 | 44.39 | 77.81 | 67.98 | 11.07 | 79.31 | 58.31 |
| # | Model | OCRGenScore ↑ | T2I (VIEScore) ↑ | T2I (AR) ↑ | Edit (1-LPIPS) ↑ | Edit (AR) ↑ | Dewarping (DD) ↑ | Deshadow (MSSSIM) ↑ | Deblur (MSSSIM) ↑ |
|---|---|---|---|---|---|---|---|---|---|
| Closed-Source Models | |||||||||
| 1 | Nano Banana Pro | 78.75 | 92.87 | 71.71 | 87.00 | 74.96 | 41.53 | 81.99 | 62.92 |
| 4 | Seedream 4.5 | 62.73 | 89.82 | 75.22 | 61.62 | 49.51 | 25.45 | 45.05 | 62.90 |
| 9 | GPT Image 1.5 | 53.95 | 93.31 | 68.21 | 59.36 | 51.37 | 28.13 | 36.99 | 29.98 |
| Open-Source · Unified Understanding & Generation | |||||||||
| 5 | OmniGen2 | 62.26 | 68.55 | 51.47 | 68.54 | 22.66 | 22.11 | 62.20 | 39.44 |
| 7 | BAGEL | 60.96 | 61.02 | 27.37 | 87.96 | 21.16 | 21.84 | 77.17 | 61.57 |
| 11 | InternVL-U | 45.45 | 62.36 | 44.04 | 56.82 | 25.08 | 32.13 | 41.83 | 26.97 |
| 14 | Janus-4o | 33.54 | 63.19 | 26.86 | 47.04 | 10.43 | 24.18 | 35.70 | 33.95 |
| 15 | ILLUME+ | 31.02 | 33.17 | 10.34 | 43.09 | 3.67 | 30.16 | 41.75 | 38.54 |
| Open-Source · Specialized Generation | |||||||||
| 2 | FLUX.2-dev | 70.74 | 88.17 | 74.19 | 86.04 | 54.64 | 43.81 | 60.98 | 59.35 |
| 3 | LongCat-Image | 65.76 | 80.70 | 66.12 | 85.29 | 56.37 | 28.07 | 68.45 | 60.66 |
| 6 | FLUX.2-Klein-9B | 61.95 | 85.18 | 67.43 | 83.40 | 49.82 | 25.28 | 45.04 | 51.37 |
| 8 | Qwen-Image | 55.14 | 80.76 | 62.04 | 70.61 | 49.34 | 29.06 | 44.77 | 33.67 |
| 10 | GLM-Image | 47.17 | 81.11 | 75.16 | 71.25 | 29.26 | 27.36 | 51.96 | 37.08 |
| 12 | FLUX.1-Kontext-dev | 43.59 | 72.99 | 42.82 | 54.67 | 27.37 | 37.31 | 33.54 | 25.53 |
| 13 | SD-3.5-Large | 35.32 | 75.96 | 54.00 | 49.65 | 10.30 | 30.37 | 32.24 | 25.66 |
| # | Model | OCRGenScore ↑ | Text Removal: Handwriting (MSSSIM) ↑ | Text Removal: Scene Text (MSSSIM) ↑ | Artistic Text Style Transfer (VIEScore) ↑ | Artistic Text Style Transfer (AR) ↑ | Hist. Doc. Style Transfer (VIEScore) ↑ | Hist. Doc. Rest. (1-LPIPS) ↑ | Scene Text SR (MSSSIM) ↑ | Layout-Aware Text Gen. (VIEScore) ↑ | Layout-Aware Text Gen. (AR) ↑ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Closed-Source Models | |||||||||||
| 1 | Nano Banana Pro | 78.75 | - | 89.79 | 92.33 | 98.77 | - | - | 76.36 | 89.72 | 100.00 |
| 9 | GPT Image 1.5 | 53.95 | - | 52.01 | 92.69 | 98.33 | - | - | 26.78 | 85.33 | 100.00 |
| 4 | Seedream 4.5 | 62.73 | - | 78.87 | 86.44 | 99.45 | - | - | 54.74 | 53.74 | 100.00 |
| Open-Source · Unified Understanding & Generation | |||||||||||
| 5 | BAGEL | 60.96 | - | 72.36 | 53.64 | 24.30 | - | - | 87.29 | 68.11 | 64.24 |
| 7 | OmniGen2 | 62.26 | - | 81.03 | 64.68 | 88.66 | - | - | 84.83 | 77.79 | 86.66 |
| 11 | InternVL-U | 45.45 | - | 68.79 | 62.41 | 62.77 | - | - | 16.00 | 45.60 | 88.36 |
| 15 | ILLUME+ | 31.02 | - | 65.19 | 1.00 | 4.52 | - | - | 27.57 | 12.26 | 6.58 |
| 14 | Janus-4o | 33.54 | - | 34.21 | 50.16 | 16.34 | - | - | 21.89 | 50.14 | 34.64 |
| Open-Source · Specialized Generation | |||||||||||
| 8 | Qwen-Image | 55.14 | - | 50.60 | 72.99 | 94.93 | - | - | 28.06 | 86.44 | 98.86 |
| 3 | LongCat-Image | 65.76 | - | 88.80 | 80.42 | 97.24 | - | - | 30.69 | 84.67 | 99.38 |
| 13 | SD-3.5-Large | 35.32 | - | 26.60 | 65.23 | 84.12 | - | - | 16.58 | 19.11 | 9.68 |
| 12 | FLUX.1-Kontext-dev | 43.59 | - | 35.91 | 79.47 | 100.00 | - | - | 23.64 | 75.39 | 57.02 |
| 2 | FLUX.2-dev | 70.74 | - | 99.71 | 85.76 | 100.00 | - | - | 72.46 | 77.21 | 52.65 |
| 6 | FLUX.2-Klein-9B | 61.95 | - | 68.55 | 89.42 | 100.00 | - | - | 48.43 | 72.94 | 72.75 |
| 10 | GLM-Image | 47.17 | - | 14.42 | 65.34 | 59.41 | - | - | 12.53 | 82.31 | 75.03 |
OCRGenBench is the most comprehensive benchmark to date for evaluating the OCR generative capabilities of generative models. It is the first to unify T2I generation, text editing, and OCR-related image-to-image translation, providing a holistic view of a model's visual text synthesis abilities, i.e., its OCR generative capabilities. The benchmark covers 5 common text categories and 33 OCR generative tasks, comprising 1,060 challenging, human-annotated samples with dense text, varied layouts, multiple aspect ratios, and bilingual content. We also design OCRGenScore, a unified metric that assesses text accuracy, instruction following, visual quality, and structural consistency in visual text synthesis.
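The exact OCRGenScore formula is defined in the paper and is not reproduced here; as an illustration only, the sketch below assumes a simple mean over the four evaluation dimensions named above, each normalized to [0, 100]. The function name and equal weighting are assumptions for illustration.

```python
# Illustrative sketch only: OCRGenBench's real OCRGenScore aggregation may
# weight dimensions differently. This assumes an unweighted mean over the
# four dimensions named in the text, each on a 0-100 scale.

def ocr_gen_score(text_accuracy: float,
                  instruction_following: float,
                  visual_quality: float,
                  structural_consistency: float) -> float:
    """Aggregate the four dimensions (hypothetical equal weighting)."""
    dims = [text_accuracy, instruction_following,
            visual_quality, structural_consistency]
    if not all(0.0 <= d <= 100.0 for d in dims):
        raise ValueError("each dimension must lie in [0, 100]")
    return sum(dims) / len(dims)

print(ocr_gen_score(80.0, 70.0, 90.0, 60.0))  # 75.0
```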
Have you evaluated your model on OCRGenBench? Upload your results spreadsheet to request a leaderboard entry.
Download the template, fill in your model's scores, then upload it to preview the formatting. The submission portal will open soon; follow the GitHub repo for updates.
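Before submitting, it can help to sanity-check the filled-in template. The sketch below is a hypothetical pre-submission check on a CSV export of the sheet; the column names (`Model`, `OCRGenScore`) are assumptions, since the real template's layout may differ.

```python
# Hypothetical validation of a CSV export of the results template.
# Assumed column names: "Model", "OCRGenScore"; scores must be numeric
# and lie in [0, 100]. Adjust REQUIRED to match the official template.
import csv
import io

REQUIRED = ["Model", "OCRGenScore"]  # assumed column names

def validate_results(csv_text: str) -> list:
    """Return a list of problems found; an empty list means the sheet passed."""
    errors = []
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return ["no data rows found"]
    for col in REQUIRED:
        if col not in rows[0]:
            return [f"missing required column: {col}"]
    for i, row in enumerate(rows, start=2):  # line 1 is the header
        try:
            score = float(row["OCRGenScore"])
        except ValueError:
            errors.append(f"line {i}: OCRGenScore is not a number")
            continue
        if not 0.0 <= score <= 100.0:
            errors.append(f"line {i}: score {score} outside [0, 100]")
    return errors

sample = "Model,OCRGenScore\nMyModel,61.37\n"
print(validate_results(sample))  # []
```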
    @article{zhang2025ocrgenbench,
      title   = {{OCRGenBench: A Comprehensive Benchmark for Evaluating OCR Generative Capabilities}},
      author  = {Zhang, Peirong and Xu, Haowei and Zhang, Jiaxin and Zheng, Xuhan and Xu, Guitao and Zhang, Yuyi and Liu, Junle and Yang, Zhenhua and Zhou, Wei and Jin, Lianwen},
      journal = {arXiv preprint arXiv:2507.15085},
      year    = {2025},
    }