VISTA

Visual Language Understanding

Last updated: April 4, 2025

Performance Comparison

Rank  Model                              Score
----  ---------------------------------  ----------
1     (name missing)                     54.65±1.46
2     (name missing)                     51.79±0.63
2     (name missing)                     51.66±1.08
2     (name missing)                     50.78±0.57
2     (name missing)                     50.07±1.14
4     (name missing)                     49.59±0.66
5     (name missing)                     47.32±1.78
6     (name missing)                     48.23±0.70
7     (name missing)                     46.97±1.29
8     (name missing)                     45.50±1.20
8     (name missing)                     45.34±0.91
9     (name missing)                     45.25±0.40
11    (name missing)                     43.25±1.26
13    (name missing)                     43.02±1.14
13    (name missing)                     42.11±1.39
15    (name missing)                     41.14±0.58
15    (name missing)                     39.95±0.80
16    (name missing)                     39.85±0.71
17    Claude 3.5 Sonnet (October 2024)   38.72±0.51
19    Claude 3.5 Sonnet (June 2024)      38.37±0.70
19    (name missing)                     38.33±0.55
19    ChatGPT-4o-latest (November 2024)  37.99±0.48
19    Gemini 1.5 Pro                     37.07±1.34
24    GPT-4o (August 2024)               34.94±0.23
24    (name missing)                     34.59±1.12
24    Gemini 1.5 Flash 002               34.03±1.41
25    Pixtral Large (November 2024)      33.89±0.69
25    (name missing)                     32.69±1.40
29    Qwen2-VL-72B-Instruct              28.56±1.37
29    Claude 3 Opus                      27.82±0.55
31    (name missing)                     26.55±0.35
31    Nova Pro                           26.27±0.61
31    Pixtral 12B (September 2024)       25.97±0.74
31    Nova Lite                          25.50±0.77
33    Llama 3.2 90B Vision Instruct      24.61±0.80
36    Llama 3.2 11B Vision-Instruct      20.47±0.15
37    Phi 3.5 Vision-Instruct            15.18±0.81