Raspbian Package Auto-Building

Build log for gemmlowp (0.0~git20211220.e844ffd-1) on armhf

gemmlowp 0.0~git20211220.e844ffd-1 armhf → 2022-06-29 13:09:14

sbuild (Debian sbuild) 0.72.0 (25 Oct 2016) on mb-lxc-02

+==============================================================================+
| gemmlowp 0.0~git20211220.e844ffd-1 (armhf)   Wed, 29 Jun 2022 12:51:12 +0000 |
+==============================================================================+

Package: gemmlowp
Version: 0.0~git20211220.e844ffd-1
Source Version: 0.0~git20211220.e844ffd-1
Distribution: bookworm-staging
Machine Architecture: armhf
Host Architecture: armhf
Build Architecture: armhf

I: NOTICE: Log filtering will replace 'var/lib/schroot/mount/bookworm-staging-armhf-sbuild-29d97485-a6f0-4284-a196-890a5ae625fb' with '<<CHROOT>>'

+------------------------------------------------------------------------------+
| Update chroot                                                                |
+------------------------------------------------------------------------------+

Get:1 http://172.17.4.1/private bookworm-staging InRelease [11.3 kB]
Get:2 http://172.17.4.1/private bookworm-staging/main Sources [13.1 MB]
Get:3 http://172.17.4.1/private bookworm-staging/main armhf Packages [14.0 MB]
Fetched 27.1 MB in 10s (2793 kB/s)
Reading package lists...
W: No sandbox user '_apt' on the system, can not drop privileges
W: http://172.17.4.1/private/dists/bookworm-staging/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.

+------------------------------------------------------------------------------+
| Fetch source files                                                           |
+------------------------------------------------------------------------------+


Check APT
---------

Checking available source versions...

Download source files with APT
------------------------------

Reading package lists...
NOTICE: 'gemmlowp' packaging is maintained in the 'Git' version control system at:
https://salsa.debian.org/science-team/gemmlowp.git
Please use:
git clone https://salsa.debian.org/science-team/gemmlowp.git
to retrieve the latest (possibly unreleased) updates to the package.
Need to get 550 kB of source archives.
Get:1 http://172.17.4.1/private bookworm-staging/main gemmlowp 0.0~git20211220.e844ffd-1 (dsc) [2093 B]
Get:2 http://172.17.4.1/private bookworm-staging/main gemmlowp 0.0~git20211220.e844ffd-1 (tar) [544 kB]
Get:3 http://172.17.4.1/private bookworm-staging/main gemmlowp 0.0~git20211220.e844ffd-1 (diff) [3372 B]
Fetched 550 kB in 0s (7062 kB/s)
Download complete and in download only mode
I: NOTICE: Log filtering will replace 'build/gemmlowp-hekJWY/gemmlowp-0.0~git20211220.e844ffd' with '<<PKGBUILDDIR>>'
I: NOTICE: Log filtering will replace 'build/gemmlowp-hekJWY' with '<<BUILDDIR>>'

+------------------------------------------------------------------------------+
| Install build-essential                                                      |
+------------------------------------------------------------------------------+


Setup apt archive
-----------------

Merged Build-Depends: build-essential, fakeroot
Filtered Build-Depends: build-essential, fakeroot
dpkg-deb: building package 'sbuild-build-depends-core-dummy' in '/<<BUILDDIR>>/resolver-EnGU3r/apt_archive/sbuild-build-depends-core-dummy.deb'.
dpkg-scanpackages: warning: Packages in archive but missing from override file:
dpkg-scanpackages: warning:   sbuild-build-depends-core-dummy
dpkg-scanpackages: info: Wrote 1 entries to output Packages file.
gpg: keybox '/<<BUILDDIR>>/resolver-EnGU3r/gpg/pubring.kbx' created
gpg: /<<BUILDDIR>>/resolver-EnGU3r/gpg/trustdb.gpg: trustdb created
gpg: key 37145E60F90AF620: public key "Sbuild Signer (Sbuild Build Dependency Archive Key) <buildd-tools-devel@lists.alioth.debian.org>" imported
gpg: Total number processed: 1
gpg:               imported: 1
gpg: key 37145E60F90AF620: "Sbuild Signer (Sbuild Build Dependency Archive Key) <buildd-tools-devel@lists.alioth.debian.org>" not changed
gpg: key 37145E60F90AF620: secret key imported
gpg: Total number processed: 1
gpg:              unchanged: 1
gpg:       secret keys read: 1
gpg:   secret keys imported: 1
gpg: using "Sbuild Signer" as default secret key for signing
Ign:1 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ InRelease
Get:2 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ Release [957 B]
Get:3 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ Release.gpg [370 B]
Get:4 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ Sources [349 B]
Get:5 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ Packages [431 B]
Fetched 2107 B in 0s (7981 B/s)
Reading package lists...
W: No sandbox user '_apt' on the system, can not drop privileges
Reading package lists...

Install core build dependencies (apt-based resolver)
----------------------------------------------------

Installing build dependencies
Reading package lists...
Building dependency tree...
Reading state information...
The following packages were automatically installed and are no longer required:
  krb5-locales libpam-cap netbase sensible-utils
Use 'apt autoremove' to remove them.
The following NEW packages will be installed:
  sbuild-build-depends-core-dummy
0 upgraded, 1 newly installed, 0 to remove and 6 not upgraded.
Need to get 852 B of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ sbuild-build-depends-core-dummy 0.invalid.0 [852 B]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 852 B in 0s (74.9 kB/s)
Selecting previously unselected package sbuild-build-depends-core-dummy.
(Reading database ... 12794 files and directories currently installed.)
Preparing to unpack .../sbuild-build-depends-core-dummy_0.invalid.0_armhf.deb ...
Unpacking sbuild-build-depends-core-dummy (0.invalid.0) ...
Setting up sbuild-build-depends-core-dummy (0.invalid.0) ...
W: No sandbox user '_apt' on the system, can not drop privileges

+------------------------------------------------------------------------------+
| Check architectures                                                          |
+------------------------------------------------------------------------------+

Arch check ok (armhf included in any)

+------------------------------------------------------------------------------+
| Install package build dependencies                                           |
+------------------------------------------------------------------------------+


Setup apt archive
-----------------

Merged Build-Depends: debhelper-compat (= 13), cmake
Filtered Build-Depends: debhelper-compat (= 13), cmake
dpkg-deb: building package 'sbuild-build-depends-gemmlowp-dummy' in '/<<BUILDDIR>>/resolver-EnGU3r/apt_archive/sbuild-build-depends-gemmlowp-dummy.deb'.
dpkg-scanpackages: warning: Packages in archive but missing from override file:
dpkg-scanpackages: warning:   sbuild-build-depends-core-dummy sbuild-build-depends-gemmlowp-dummy
dpkg-scanpackages: info: Wrote 2 entries to output Packages file.
gpg: using "Sbuild Signer" as default secret key for signing
Ign:1 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ InRelease
Get:2 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ Release [963 B]
Get:3 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ Release.gpg [370 B]
Get:4 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ Sources [496 B]
Get:5 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ Packages [577 B]
Fetched 2406 B in 0s (12.1 kB/s)
Reading package lists...
W: No sandbox user '_apt' on the system, can not drop privileges
Reading package lists...

Install gemmlowp build dependencies (apt-based resolver)
--------------------------------------------------------

Installing build dependencies
Reading package lists...
Building dependency tree...
Reading state information...
The following packages were automatically installed and are no longer required:
  krb5-locales libpam-cap netbase
Use 'apt autoremove' to remove them.
The following additional packages will be installed:
  autoconf automake autopoint autotools-dev bsdextrautils cmake cmake-data
  debhelper dh-autoreconf dh-elpa-helper dh-strip-nondeterminism dwz
  emacsen-common file gettext gettext-base groff-base intltool-debian
  libarchive-zip-perl libarchive13 libbrotli1 libcurl4 libdebhelper-perl
  libelf1 libexpat1 libfile-stripnondeterminism-perl libicu71 libjsoncpp25
  libmagic-mgc libmagic1 libncurses6 libnghttp2-14 libpipeline1 libprocps8
  libpsl5 librhash0 librtmp1 libsigsegv2 libssh2-1 libsub-override-perl
  libtool libuchardet0 libuv1 libxml2 m4 man-db po-debconf procps
Suggested packages:
  autoconf-archive gnu-standards autoconf-doc cmake-doc ninja-build
  cmake-format dh-make gettext-doc libasprintf-dev libgettextpo-dev groff
  lrzip libtool-doc gfortran | fortran95-compiler gcj-jdk m4-doc apparmor less
  www-browser libmail-box-perl
Recommended packages:
  curl | wget | lynx ca-certificates libarchive-cpio-perl libgpm2 publicsuffix
  libltdl-dev libmail-sendmail-perl psmisc
The following NEW packages will be installed:
  autoconf automake autopoint autotools-dev bsdextrautils cmake cmake-data
  debhelper dh-autoreconf dh-elpa-helper dh-strip-nondeterminism dwz
  emacsen-common file gettext gettext-base groff-base intltool-debian
  libarchive-zip-perl libarchive13 libbrotli1 libcurl4 libdebhelper-perl
  libelf1 libexpat1 libfile-stripnondeterminism-perl libicu71 libjsoncpp25
  libmagic-mgc libmagic1 libncurses6 libnghttp2-14 libpipeline1 libprocps8
  libpsl5 librhash0 librtmp1 libsigsegv2 libssh2-1 libsub-override-perl
  libtool libuchardet0 libuv1 libxml2 m4 man-db po-debconf procps
  sbuild-build-depends-gemmlowp-dummy
0 upgraded, 49 newly installed, 0 to remove and 6 not upgraded.
Need to get 25.9 MB of archives.
After this operation, 106 MB of additional disk space will be used.
Get:1 copy:/<<BUILDDIR>>/resolver-EnGU3r/apt_archive ./ sbuild-build-depends-gemmlowp-dummy 0.invalid.0 [864 B]
Get:2 http://172.17.4.1/private bookworm-staging/main armhf libuchardet0 armhf 0.0.7-1 [65.0 kB]
Get:3 http://172.17.4.1/private bookworm-staging/main armhf groff-base armhf 1.22.4-8 [793 kB]
Get:4 http://172.17.4.1/private bookworm-staging/main armhf bsdextrautils armhf 2.38-4 [137 kB]
Get:5 http://172.17.4.1/private bookworm-staging/main armhf libpipeline1 armhf 1.5.6-1 [33.7 kB]
Get:6 http://172.17.4.1/private bookworm-staging/main armhf man-db armhf 2.10.2-1 [1362 kB]
Get:7 http://172.17.4.1/private bookworm-staging/main armhf libncurses6 armhf 6.3+20220423-2 [79.6 kB]
Get:8 http://172.17.4.1/private bookworm-staging/main armhf libprocps8 armhf 2:3.3.17-7 [60.7 kB]
Get:9 http://172.17.4.1/private bookworm-staging/main armhf procps armhf 2:3.3.17-7 [475 kB]
Get:10 http://172.17.4.1/private bookworm-staging/main armhf libmagic-mgc armhf 1:5.41-4 [295 kB]
Get:11 http://172.17.4.1/private bookworm-staging/main armhf libmagic1 armhf 1:5.41-4 [120 kB]
Get:12 http://172.17.4.1/private bookworm-staging/main armhf file armhf 1:5.41-4 [65.8 kB]
Get:13 http://172.17.4.1/private bookworm-staging/main armhf gettext-base armhf 0.21-6 [171 kB]
Get:14 http://172.17.4.1/private bookworm-staging/main armhf libsigsegv2 armhf 2.14-1 [36.6 kB]
Get:15 http://172.17.4.1/private bookworm-staging/main armhf m4 armhf 1.4.18-5 [186 kB]
Get:16 http://172.17.4.1/private bookworm-staging/main armhf autoconf all 2.71-2 [343 kB]
Get:17 http://172.17.4.1/private bookworm-staging/main armhf autotools-dev all 20220109.1 [51.6 kB]
Get:18 http://172.17.4.1/private bookworm-staging/main armhf automake all 1:1.16.5-1.3 [823 kB]
Get:19 http://172.17.4.1/private bookworm-staging/main armhf autopoint all 0.21-6 [510 kB]
Get:20 http://172.17.4.1/private bookworm-staging/main armhf libicu71 armhf 71.1-3 [8855 kB]
Get:21 http://172.17.4.1/private bookworm-staging/main armhf libxml2 armhf 2.9.14+dfsg-1 [591 kB]
Get:22 http://172.17.4.1/private bookworm-staging/main armhf libarchive13 armhf 3.6.0-1 [306 kB]
Get:23 http://172.17.4.1/private bookworm-staging/main armhf libbrotli1 armhf 1.0.9-2+b2 [260 kB]
Get:24 http://172.17.4.1/private bookworm-staging/main armhf libnghttp2-14 armhf 1.47.0-1+b1 [65.3 kB]
Get:25 http://172.17.4.1/private bookworm-staging/main armhf libpsl5 armhf 0.21.0-1.2 [56.2 kB]
Get:26 http://172.17.4.1/private bookworm-staging/main armhf librtmp1 armhf 2.4+20151223.gitfa8646d.1-2+b2 [54.2 kB]
Get:27 http://172.17.4.1/private bookworm-staging/main armhf libssh2-1 armhf 1.10.0-3+b1 [161 kB]
Get:28 http://172.17.4.1/private bookworm-staging/main armhf libcurl4 armhf 7.83.1-2 [317 kB]
Get:29 http://172.17.4.1/private bookworm-staging/main armhf libexpat1 armhf 2.4.8-1 [84.1 kB]
Get:30 http://172.17.4.1/private bookworm-staging/main armhf libjsoncpp25 armhf 1.9.5-4 [66.7 kB]
Get:31 http://172.17.4.1/private bookworm-staging/main armhf librhash0 armhf 1.4.2-1 [141 kB]
Get:32 http://172.17.4.1/private bookworm-staging/main armhf libuv1 armhf 1.44.1-2+rpi1 [124 kB]
Get:33 http://172.17.4.1/private bookworm-staging/main armhf dh-elpa-helper all 2.0.10 [11.3 kB]
Get:34 http://172.17.4.1/private bookworm-staging/main armhf emacsen-common all 3.0.4 [19.3 kB]
Get:35 http://172.17.4.1/private bookworm-staging/main armhf cmake-data all 3.23.2-1 [1939 kB]
Get:36 http://172.17.4.1/private bookworm-staging/main armhf cmake armhf 3.23.2-1 [3551 kB]
Get:37 http://172.17.4.1/private bookworm-staging/main armhf libdebhelper-perl all 13.7.1 [195 kB]
Get:38 http://172.17.4.1/private bookworm-staging/main armhf libtool all 2.4.7-4 [526 kB]
Get:39 http://172.17.4.1/private bookworm-staging/main armhf dh-autoreconf all 20 [17.1 kB]
Get:40 http://172.17.4.1/private bookworm-staging/main armhf libarchive-zip-perl all 1.68-1 [104 kB]
Get:41 http://172.17.4.1/private bookworm-staging/main armhf libsub-override-perl all 0.09-3 [10.4 kB]
Get:42 http://172.17.4.1/private bookworm-staging/main armhf libfile-stripnondeterminism-perl all 1.13.0-1 [26.6 kB]
Get:43 http://172.17.4.1/private bookworm-staging/main armhf dh-strip-nondeterminism all 1.13.0-1 [15.8 kB]
Get:44 http://172.17.4.1/private bookworm-staging/main armhf libelf1 armhf 0.187-1 [175 kB]
Get:45 http://172.17.4.1/private bookworm-staging/main armhf dwz armhf 0.14-1 [83.0 kB]
Get:46 http://172.17.4.1/private bookworm-staging/main armhf gettext armhf 0.21-6 [1214 kB]
Get:47 http://172.17.4.1/private bookworm-staging/main armhf intltool-debian all 0.35.0+20060710.5 [26.8 kB]
Get:48 http://172.17.4.1/private bookworm-staging/main armhf po-debconf all 1.0.21+nmu1 [248 kB]
Get:49 http://172.17.4.1/private bookworm-staging/main armhf debhelper all 13.7.1 [1071 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 25.9 MB in 2s (10.7 MB/s)
Selecting previously unselected package libuchardet0:armhf.
(Reading database ... 12794 files and directories currently installed.)
Preparing to unpack .../00-libuchardet0_0.0.7-1_armhf.deb ...
Unpacking libuchardet0:armhf (0.0.7-1) ...
Selecting previously unselected package groff-base.
Preparing to unpack .../01-groff-base_1.22.4-8_armhf.deb ...
Unpacking groff-base (1.22.4-8) ...
Selecting previously unselected package bsdextrautils.
Preparing to unpack .../02-bsdextrautils_2.38-4_armhf.deb ...
Unpacking bsdextrautils (2.38-4) ...
Selecting previously unselected package libpipeline1:armhf.
Preparing to unpack .../03-libpipeline1_1.5.6-1_armhf.deb ...
Unpacking libpipeline1:armhf (1.5.6-1) ...
Selecting previously unselected package man-db.
Preparing to unpack .../04-man-db_2.10.2-1_armhf.deb ...
Unpacking man-db (2.10.2-1) ...
Selecting previously unselected package libncurses6:armhf.
Preparing to unpack .../05-libncurses6_6.3+20220423-2_armhf.deb ...
Unpacking libncurses6:armhf (6.3+20220423-2) ...
Selecting previously unselected package libprocps8:armhf.
Preparing to unpack .../06-libprocps8_2%3a3.3.17-7_armhf.deb ...
Unpacking libprocps8:armhf (2:3.3.17-7) ...
Selecting previously unselected package procps.
Preparing to unpack .../07-procps_2%3a3.3.17-7_armhf.deb ...
Unpacking procps (2:3.3.17-7) ...
Selecting previously unselected package libmagic-mgc.
Preparing to unpack .../08-libmagic-mgc_1%3a5.41-4_armhf.deb ...
Unpacking libmagic-mgc (1:5.41-4) ...
Selecting previously unselected package libmagic1:armhf.
Preparing to unpack .../09-libmagic1_1%3a5.41-4_armhf.deb ...
Unpacking libmagic1:armhf (1:5.41-4) ...
Selecting previously unselected package file.
Preparing to unpack .../10-file_1%3a5.41-4_armhf.deb ...
Unpacking file (1:5.41-4) ...
Selecting previously unselected package gettext-base.
Preparing to unpack .../11-gettext-base_0.21-6_armhf.deb ...
Unpacking gettext-base (0.21-6) ...
Selecting previously unselected package libsigsegv2:armhf.
Preparing to unpack .../12-libsigsegv2_2.14-1_armhf.deb ...
Unpacking libsigsegv2:armhf (2.14-1) ...
Selecting previously unselected package m4.
Preparing to unpack .../13-m4_1.4.18-5_armhf.deb ...
Unpacking m4 (1.4.18-5) ...
Selecting previously unselected package autoconf.
Preparing to unpack .../14-autoconf_2.71-2_all.deb ...
Unpacking autoconf (2.71-2) ...
Selecting previously unselected package autotools-dev.
Preparing to unpack .../15-autotools-dev_20220109.1_all.deb ...
Unpacking autotools-dev (20220109.1) ...
Selecting previously unselected package automake.
Preparing to unpack .../16-automake_1%3a1.16.5-1.3_all.deb ...
Unpacking automake (1:1.16.5-1.3) ...
Selecting previously unselected package autopoint.
Preparing to unpack .../17-autopoint_0.21-6_all.deb ...
Unpacking autopoint (0.21-6) ...
Selecting previously unselected package libicu71:armhf.
Preparing to unpack .../18-libicu71_71.1-3_armhf.deb ...
Unpacking libicu71:armhf (71.1-3) ...
Selecting previously unselected package libxml2:armhf.
Preparing to unpack .../19-libxml2_2.9.14+dfsg-1_armhf.deb ...
Unpacking libxml2:armhf (2.9.14+dfsg-1) ...
Selecting previously unselected package libarchive13:armhf.
Preparing to unpack .../20-libarchive13_3.6.0-1_armhf.deb ...
Unpacking libarchive13:armhf (3.6.0-1) ...
Selecting previously unselected package libbrotli1:armhf.
Preparing to unpack .../21-libbrotli1_1.0.9-2+b2_armhf.deb ...
Unpacking libbrotli1:armhf (1.0.9-2+b2) ...
Selecting previously unselected package libnghttp2-14:armhf.
Preparing to unpack .../22-libnghttp2-14_1.47.0-1+b1_armhf.deb ...
Unpacking libnghttp2-14:armhf (1.47.0-1+b1) ...
Selecting previously unselected package libpsl5:armhf.
Preparing to unpack .../23-libpsl5_0.21.0-1.2_armhf.deb ...
Unpacking libpsl5:armhf (0.21.0-1.2) ...
Selecting previously unselected package librtmp1:armhf.
Preparing to unpack .../24-librtmp1_2.4+20151223.gitfa8646d.1-2+b2_armhf.deb ...
Unpacking librtmp1:armhf (2.4+20151223.gitfa8646d.1-2+b2) ...
Selecting previously unselected package libssh2-1:armhf.
Preparing to unpack .../25-libssh2-1_1.10.0-3+b1_armhf.deb ...
Unpacking libssh2-1:armhf (1.10.0-3+b1) ...
Selecting previously unselected package libcurl4:armhf.
Preparing to unpack .../26-libcurl4_7.83.1-2_armhf.deb ...
Unpacking libcurl4:armhf (7.83.1-2) ...
Selecting previously unselected package libexpat1:armhf.
Preparing to unpack .../27-libexpat1_2.4.8-1_armhf.deb ...
Unpacking libexpat1:armhf (2.4.8-1) ...
Selecting previously unselected package libjsoncpp25:armhf.
Preparing to unpack .../28-libjsoncpp25_1.9.5-4_armhf.deb ...
Unpacking libjsoncpp25:armhf (1.9.5-4) ...
Selecting previously unselected package librhash0:armhf.
Preparing to unpack .../29-librhash0_1.4.2-1_armhf.deb ...
Unpacking librhash0:armhf (1.4.2-1) ...
Selecting previously unselected package libuv1:armhf.
Preparing to unpack .../30-libuv1_1.44.1-2+rpi1_armhf.deb ...
Unpacking libuv1:armhf (1.44.1-2+rpi1) ...
Selecting previously unselected package dh-elpa-helper.
Preparing to unpack .../31-dh-elpa-helper_2.0.10_all.deb ...
Unpacking dh-elpa-helper (2.0.10) ...
Selecting previously unselected package emacsen-common.
Preparing to unpack .../32-emacsen-common_3.0.4_all.deb ...
Unpacking emacsen-common (3.0.4) ...
Selecting previously unselected package cmake-data.
Preparing to unpack .../33-cmake-data_3.23.2-1_all.deb ...
Unpacking cmake-data (3.23.2-1) ...
Selecting previously unselected package cmake.
Preparing to unpack .../34-cmake_3.23.2-1_armhf.deb ...
Unpacking cmake (3.23.2-1) ...
Selecting previously unselected package libdebhelper-perl.
Preparing to unpack .../35-libdebhelper-perl_13.7.1_all.deb ...
Unpacking libdebhelper-perl (13.7.1) ...
Selecting previously unselected package libtool.
Preparing to unpack .../36-libtool_2.4.7-4_all.deb ...
Unpacking libtool (2.4.7-4) ...
Selecting previously unselected package dh-autoreconf.
Preparing to unpack .../37-dh-autoreconf_20_all.deb ...
Unpacking dh-autoreconf (20) ...
Selecting previously unselected package libarchive-zip-perl.
Preparing to unpack .../38-libarchive-zip-perl_1.68-1_all.deb ...
Unpacking libarchive-zip-perl (1.68-1) ...
Selecting previously unselected package libsub-override-perl.
Preparing to unpack .../39-libsub-override-perl_0.09-3_all.deb ...
Unpacking libsub-override-perl (0.09-3) ...
Selecting previously unselected package libfile-stripnondeterminism-perl.
Preparing to unpack .../40-libfile-stripnondeterminism-perl_1.13.0-1_all.deb ...
Unpacking libfile-stripnondeterminism-perl (1.13.0-1) ...
Selecting previously unselected package dh-strip-nondeterminism.
Preparing to unpack .../41-dh-strip-nondeterminism_1.13.0-1_all.deb ...
Unpacking dh-strip-nondeterminism (1.13.0-1) ...
Selecting previously unselected package libelf1:armhf.
Preparing to unpack .../42-libelf1_0.187-1_armhf.deb ...
Unpacking libelf1:armhf (0.187-1) ...
Selecting previously unselected package dwz.
Preparing to unpack .../43-dwz_0.14-1_armhf.deb ...
Unpacking dwz (0.14-1) ...
Selecting previously unselected package gettext.
Preparing to unpack .../44-gettext_0.21-6_armhf.deb ...
Unpacking gettext (0.21-6) ...
Selecting previously unselected package intltool-debian.
Preparing to unpack .../45-intltool-debian_0.35.0+20060710.5_all.deb ...
Unpacking intltool-debian (0.35.0+20060710.5) ...
Selecting previously unselected package po-debconf.
Preparing to unpack .../46-po-debconf_1.0.21+nmu1_all.deb ...
Unpacking po-debconf (1.0.21+nmu1) ...
Selecting previously unselected package debhelper.
Preparing to unpack .../47-debhelper_13.7.1_all.deb ...
Unpacking debhelper (13.7.1) ...
Selecting previously unselected package sbuild-build-depends-gemmlowp-dummy.
Preparing to unpack .../48-sbuild-build-depends-gemmlowp-dummy_0.invalid.0_armhf.deb ...
Unpacking sbuild-build-depends-gemmlowp-dummy (0.invalid.0) ...
Setting up libexpat1:armhf (2.4.8-1) ...
Setting up libpipeline1:armhf (1.5.6-1) ...
Setting up libicu71:armhf (71.1-3) ...
Setting up libpsl5:armhf (0.21.0-1.2) ...
Setting up bsdextrautils (2.38-4) ...
Setting up libmagic-mgc (1:5.41-4) ...
Setting up libarchive-zip-perl (1.68-1) ...
Setting up libdebhelper-perl (13.7.1) ...
Setting up libbrotli1:armhf (1.0.9-2+b2) ...
Setting up libnghttp2-14:armhf (1.47.0-1+b1) ...
Setting up libmagic1:armhf (1:5.41-4) ...
Setting up gettext-base (0.21-6) ...
Setting up file (1:5.41-4) ...
Setting up autotools-dev (20220109.1) ...
Setting up libuv1:armhf (1.44.1-2+rpi1) ...
Setting up emacsen-common (3.0.4) ...
Setting up librtmp1:armhf (2.4+20151223.gitfa8646d.1-2+b2) ...
Setting up dh-elpa-helper (2.0.10) ...
Setting up libncurses6:armhf (6.3+20220423-2) ...
Setting up libsigsegv2:armhf (2.14-1) ...
Setting up autopoint (0.21-6) ...
Setting up libjsoncpp25:armhf (1.9.5-4) ...
Setting up librhash0:armhf (1.4.2-1) ...
Setting up libuchardet0:armhf (0.0.7-1) ...
Setting up libsub-override-perl (0.09-3) ...
Setting up libssh2-1:armhf (1.10.0-3+b1) ...
Setting up cmake-data (3.23.2-1) ...
Setting up libelf1:armhf (0.187-1) ...
Setting up libxml2:armhf (2.9.14+dfsg-1) ...
Setting up libprocps8:armhf (2:3.3.17-7) ...
Setting up libfile-stripnondeterminism-perl (1.13.0-1) ...
Setting up gettext (0.21-6) ...
Setting up libtool (2.4.7-4) ...
Setting up libarchive13:armhf (3.6.0-1) ...
Setting up m4 (1.4.18-5) ...
Setting up intltool-debian (0.35.0+20060710.5) ...
Setting up autoconf (2.71-2) ...
Setting up dh-strip-nondeterminism (1.13.0-1) ...
Setting up dwz (0.14-1) ...
Setting up groff-base (1.22.4-8) ...
Setting up procps (2:3.3.17-7) ...
Setting up libcurl4:armhf (7.83.1-2) ...
Setting up automake (1:1.16.5-1.3) ...
update-alternatives: using /usr/bin/automake-1.16 to provide /usr/bin/automake (automake) in auto mode
Setting up po-debconf (1.0.21+nmu1) ...
Setting up man-db (2.10.2-1) ...
Not building database; man-db/auto-update is not 'true'.
Setting up dh-autoreconf (20) ...
Setting up cmake (3.23.2-1) ...
Setting up debhelper (13.7.1) ...
Setting up sbuild-build-depends-gemmlowp-dummy (0.invalid.0) ...
Processing triggers for libc-bin (2.33-7+rpi1) ...
W: No sandbox user '_apt' on the system, can not drop privileges

+------------------------------------------------------------------------------+
| Build environment                                                            |
+------------------------------------------------------------------------------+

Kernel: Linux 4.15.0-187-generic armhf (armv8l)
Toolchain package versions: binutils_2.38-4+rpi1 dpkg-dev_1.21.8+rpi1 g++-11_11.3.0-1+rpi1 gcc-11_11.3.0-1+rpi1 libc6-dev_2.33-7+rpi1 libstdc++-11-dev_11.3.0-1+rpi1 libstdc++6_12.1.0-2+rpi1 linux-libc-dev_5.18.2-1+rpi1
Package versions: adduser_3.121 apt_2.5.0 autoconf_2.71-2 automake_1:1.16.5-1.3 autopoint_0.21-6 autotools-dev_20220109.1 base-files_12.2+rpi1 base-passwd_3.5.52 bash_5.1-6.1 binutils_2.38-4+rpi1 binutils-arm-linux-gnueabihf_2.38-4+rpi1 binutils-common_2.38-4+rpi1 bsdextrautils_2.38-4 bsdutils_1:2.38-4 build-essential_12.9 bzip2_1.0.8-5+b2 cmake_3.23.2-1 cmake-data_3.23.2-1 coreutils_8.32-4.1 cpp_4:11.2.0-2+rpi1 cpp-11_11.3.0-1+rpi1 dash_0.5.11+git20210903+057cd650a4ed-8 debconf_1.5.79 debhelper_13.7.1 debianutils_5.7-0.2 dh-autoreconf_20 dh-elpa-helper_2.0.10 dh-strip-nondeterminism_1.13.0-1 diffutils_1:3.7-5 dirmngr_2.2.35-2 dpkg_1.21.8+rpi1 dpkg-dev_1.21.8+rpi1 dwz_0.14-1 e2fsprogs_1.46.5-2 emacsen-common_3.0.4 fakeroot_1.29-1 file_1:5.41-4 findutils_4.9.0-3 g++_4:11.2.0-2+rpi1 g++-11_11.3.0-1+rpi1 gcc_4:11.2.0-2+rpi1 gcc-11_11.3.0-1+rpi1 gcc-11-base_11.3.0-1+rpi1 gcc-12-base_12.1.0-2+rpi1 gcc-7-base_7.5.0-6+rpi1+b2 gcc-8-base_8.4.0-7+rpi1 gcc-9-base_9.4.0-2+rpi1 gettext_0.21-6 gettext-base_0.21-6 gnupg_2.2.35-2 gnupg-l10n_2.2.35-2 gnupg-utils_2.2.35-2 gpg_2.2.35-2 gpg-agent_2.2.35-2 gpg-wks-client_2.2.35-2 gpg-wks-server_2.2.35-2 gpgconf_2.2.35-2 gpgsm_2.2.35-2 gpgv_2.2.35-2 grep_3.7-1 groff-base_1.22.4-8 gzip_1.12-1 hostname_3.23 init-system-helpers_1.63 intltool-debian_0.35.0+20060710.5 iputils-ping_3:20211215-1 krb5-locales_1.19.2-2 libacl1_2.3.1-1 libapt-pkg6.0_2.5.0 libarchive-zip-perl_1.68-1 libarchive13_3.6.0-1 libasan6_11.3.0-1+rpi1 libassuan0_2.5.5-3 libatomic1_12.1.0-2+rpi1 libattr1_1:2.5.1-1 libaudit-common_1:3.0.7-1 libaudit1_1:3.0.7-1+b1 libbinutils_2.38-4+rpi1 libblkid1_2.38-4 libbrotli1_1.0.9-2+b2 libbz2-1.0_1.0.8-5+b2 libc-bin_2.33-7+rpi1 libc-dev-bin_2.33-7+rpi1 libc6_2.33-7+rpi1 libc6-dev_2.33-7+rpi1 libcap-ng0_0.7.9-2.2+b2 libcap2_1:2.44-1 libcap2-bin_1:2.44-1 libcc1-0_12.1.0-2+rpi1 libcom-err2_1.46.5-2 libcrypt-dev_1:4.4.27-1.1 libcrypt1_1:4.4.27-1.1 libctf-nobfd0_2.38-4+rpi1 libctf0_2.38-4+rpi1 libcurl4_7.83.1-2 libdb5.3_5.3.28+dfsg1-0.9 
libdebconfclient0_0.263 libdebhelper-perl_13.7.1 libdpkg-perl_1.21.8+rpi1 libelf1_0.187-1 libexpat1_2.4.8-1 libext2fs2_1.46.5-2 libfakeroot_1.29-1 libffi8_3.4.2-4 libfile-stripnondeterminism-perl_1.13.0-1 libgcc-11-dev_11.3.0-1+rpi1 libgcc-s1_12.1.0-2+rpi1 libgcrypt20_1.10.1-2+b2 libgdbm-compat4_1.23-1 libgdbm6_1.23-1 libgmp10_2:6.2.1+dfsg1-1 libgnutls30_3.7.4-2 libgomp1_12.1.0-2+rpi1 libgpg-error0_1.45-2 libgssapi-krb5-2_1.19.2-2+b7 libhogweed6_3.7.3-1 libicu71_71.1-3 libidn2-0_2.3.2-2 libisl23_0.24-2 libjsoncpp25_1.9.5-4 libk5crypto3_1.19.2-2+b7 libkeyutils1_1.6.1-3+rpi1 libkrb5-3_1.19.2-2+b7 libkrb5support0_1.19.2-2+b7 libksba8_1.6.0-2 libldap-2.5-0_2.5.12+dfsg-2 liblz4-1_1.9.3-2 liblzma5_5.2.5-2.1 libmagic-mgc_1:5.41-4 libmagic1_1:5.41-4 libmount1_2.38-4 libmpc3_1.2.1-2 libmpfr6_4.1.0-3 libncurses6_6.3+20220423-2 libncursesw6_6.3+20220423-2 libnettle8_3.7.3-1 libnghttp2-14_1.47.0-1+b1 libnpth0_1.6-3 libnsl-dev_1.3.0-2 libnsl2_1.3.0-2 libp11-kit0_0.24.1-1 libpam-cap_1:2.44-1 libpam-modules_1.4.0-13 libpam-modules-bin_1.4.0-13 libpam-runtime_1.4.0-13 libpam0g_1.4.0-13 libpcre2-8-0_10.40-1+b2 libpcre3_2:8.39-14 libperl5.34_5.34.0-4 libpipeline1_1.5.6-1 libprocps8_2:3.3.17-7 libpsl5_0.21.0-1.2 libreadline8_8.1.2-1.2 librhash0_1.4.2-1 librtmp1_2.4+20151223.gitfa8646d.1-2+b2 libsasl2-2_2.1.28+dfsg-6+b1 libsasl2-modules-db_2.1.28+dfsg-6+b1 libseccomp2_2.5.4-1+rpi1 libselinux1_3.4-1 libsemanage-common_3.4-1 libsemanage2_3.4-1 libsepol1_3.1-1 libsepol2_3.4-2 libsigsegv2_2.14-1 libsmartcols1_2.38-4 libsqlite3-0_3.38.5-1 libss2_1.46.5-2 libssh2-1_1.10.0-3+b1 libssl1.1_1.1.1o-1 libssl3_3.0.3-8 libstdc++-11-dev_11.3.0-1+rpi1 libstdc++6_12.1.0-2+rpi1 libsub-override-perl_0.09-3 libsystemd0_250.4-1+rpi1 libtasn1-6_4.18.0-4 libtinfo6_6.3+20220423-2 libtirpc-common_1.3.2-2 libtirpc-dev_1.3.2-2 libtirpc3_1.3.2-2 libtool_2.4.7-4 libubsan1_12.1.0-2+rpi1 libuchardet0_0.0.7-1 libudev1_250.4-1+rpi1 libunistring2_1.0-1 libuuid1_2.38-4 libuv1_1.44.1-2+rpi1 libxml2_2.9.14+dfsg-1 
libxxhash0_0.8.1-1 libzstd1_1.5.2+dfsg-1 linux-libc-dev_5.18.2-1+rpi1 login_1:4.11.1+dfsg1-2 logsave_1.46.5-2 lsb-base_11.2+rpi1 m4_1.4.18-5 make_4.3-4.1 man-db_2.10.2-1 mawk_1.3.4.20200120-3.1 mount_2.38-4 nano_6.3-1 ncurses-base_6.3+20220423-2 ncurses-bin_6.3+20220423-2 netbase_6.3 passwd_1:4.11.1+dfsg1-2 patch_2.7.6-7 perl_5.34.0-4 perl-base_5.34.0-4 perl-modules-5.34_5.34.0-4 pinentry-curses_1.2.0-1 po-debconf_1.0.21+nmu1 procps_2:3.3.17-7 raspbian-archive-keyring_20120528.2 readline-common_8.1.2-1.2 rpcsvc-proto_1.4.2-4 sbuild-build-depends-core-dummy_0.invalid.0 sbuild-build-depends-gemmlowp-dummy_0.invalid.0 sed_4.8-1 sensible-utils_0.0.17 sysvinit-utils_3.03-1 tar_1.34+dfsg-1 tzdata_2022a-1 util-linux_2.38-4 util-linux-extra_2.38-4 xz-utils_5.2.5-2.1 zlib1g_1:1.2.11.dfsg-4+b2

+------------------------------------------------------------------------------+
| Build                                                                        |
+------------------------------------------------------------------------------+


Unpack source
-------------

gpgv: unknown type of key resource 'trustedkeys.kbx'
gpgv: keyblock resource '/tmp/dpkg-verify-sig.9aSSSCS6/trustedkeys.kbx': General error
gpgv: Signature made Fri Jun 24 05:56:40 2022 UTC
gpgv:                using RSA key 638BC75EC1E5C589067E35DE62645EB35F686A8A
gpgv:                issuer "lumin@debian.org"
gpgv: Can't check signature: No public key
dpkg-source: warning: cannot verify signature ./gemmlowp_0.0~git20211220.e844ffd-1.dsc
dpkg-source: info: extracting gemmlowp in /<<PKGBUILDDIR>>
dpkg-source: info: unpacking gemmlowp_0.0~git20211220.e844ffd.orig.tar.xz
dpkg-source: info: unpacking gemmlowp_0.0~git20211220.e844ffd-1.debian.tar.xz
dpkg-source: info: using patch list from debian/patches/series
dpkg-source: info: applying 0001-cmake-build-fix.patch

Check disk space
----------------

Sufficient free space for build

User Environment
----------------

APT_CONFIG=/var/lib/sbuild/apt.conf
DEB_BUILD_OPTIONS=parallel=4
HOME=/sbuild-nonexistent
LC_ALL=POSIX
LOGNAME=buildd
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
SCHROOT_ALIAS_NAME=bookworm-staging-armhf-sbuild
SCHROOT_CHROOT_NAME=bookworm-staging-armhf-sbuild
SCHROOT_COMMAND=env
SCHROOT_GID=112
SCHROOT_GROUP=buildd
SCHROOT_SESSION_ID=bookworm-staging-armhf-sbuild-29d97485-a6f0-4284-a196-890a5ae625fb
SCHROOT_UID=107
SCHROOT_USER=buildd
SHELL=/bin/sh
USER=buildd

dpkg-buildpackage
-----------------

dpkg-buildpackage: info: source package gemmlowp
dpkg-buildpackage: info: source version 0.0~git20211220.e844ffd-1
dpkg-buildpackage: info: source distribution unstable
 dpkg-source --before-build .
dpkg-buildpackage: info: host architecture armhf
 fakeroot debian/rules clean
dh clean -Scmake
   debian/rules override_dh_auto_clean
make[1]: Entering directory '/<<PKGBUILDDIR>>'
rm -f CMakeLists.txt
dh_auto_clean
make[1]: Leaving directory '/<<PKGBUILDDIR>>'
   dh_clean -O-Scmake
 debian/rules build-arch
dh build-arch -Scmake
   dh_update_autotools_config -a -O-Scmake
   dh_autoreconf -a -O-Scmake
   debian/rules override_dh_auto_configure
make[1]: Entering directory '/<<PKGBUILDDIR>>'
ln -s contrib/CMakeLists.txt .
dh_auto_configure -- \
	-DCMAKE_C_FLAGS="-g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2" \
	-DCMAKE_CXX_FLAGS="-g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2"
	cd obj-arm-linux-gnueabihf && cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_BUILD_TYPE=None -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_INSTALL_LOCALSTATEDIR=/var -DCMAKE_EXPORT_NO_PACKAGE_REGISTRY=ON -DCMAKE_FIND_USE_PACKAGE_REGISTRY=OFF -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY=ON -DCMAKE_INSTALL_RUNSTATEDIR=/run -DCMAKE_SKIP_INSTALL_ALL_DEPENDENCY=ON "-GUnix Makefiles" -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_INSTALL_LIBDIR=lib/arm-linux-gnueabihf "-DCMAKE_C_FLAGS=-g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2" "-DCMAKE_CXX_FLAGS=-g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2" ..
-- The C compiler identification is GNU 11.3.0
-- The CXX compiler identification is GNU 11.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
CMake Warning:
  Manually-specified variables were not used by the project:

    CMAKE_EXPORT_NO_PACKAGE_REGISTRY
    CMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY
    CMAKE_FIND_USE_PACKAGE_REGISTRY


-- Build files have been written to: /<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf
make[1]: Leaving directory '/<<PKGBUILDDIR>>'
   dh_auto_build -a -O-Scmake
	cd obj-arm-linux-gnueabihf && make -j4 "INSTALL=install --strip-program=true" VERBOSE=1
make[1]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
/usr/bin/cmake -S"/<<PKGBUILDDIR>>" -B"/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" --check-build-system CMakeFiles/Makefile.cmake 0
/usr/bin/cmake -E cmake_progress_start "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf//CMakeFiles/progress.marks"
make  -f CMakeFiles/Makefile2 all
make[2]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make  -f CMakeFiles/eight_bit_int_gemm.dir/build.make CMakeFiles/eight_bit_int_gemm.dir/depend
make  -f CMakeFiles/benchmark.dir/build.make CMakeFiles/benchmark.dir/depend
make  -f CMakeFiles/benchmark_all_sizes.dir/build.make CMakeFiles/benchmark_all_sizes.dir/depend
make  -f CMakeFiles/test_math_helpers.dir/build.make CMakeFiles/test_math_helpers.dir/depend
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
cd "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles/eight_bit_int_gemm.dir/DependInfo.cmake" --color=
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
cd "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles/benchmark.dir/DependInfo.cmake" --color=
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
cd "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles/benchmark_all_sizes.dir/DependInfo.cmake" --color=
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
cd "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles/test_math_helpers.dir/DependInfo.cmake" --color=
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make  -f CMakeFiles/benchmark.dir/build.make CMakeFiles/benchmark.dir/build
make  -f CMakeFiles/test_math_helpers.dir/build.make CMakeFiles/test_math_helpers.dir/build
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make  -f CMakeFiles/eight_bit_int_gemm.dir/build.make CMakeFiles/eight_bit_int_gemm.dir/build
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make  -f CMakeFiles/benchmark_all_sizes.dir/build.make CMakeFiles/benchmark_all_sizes.dir/build
[  5%] Building CXX object CMakeFiles/benchmark.dir/test/benchmark.cc.o
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
/usr/bin/c++   -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/benchmark.dir/test/benchmark.cc.o -MF CMakeFiles/benchmark.dir/test/benchmark.cc.o.d -o CMakeFiles/benchmark.dir/test/benchmark.cc.o -c "/<<PKGBUILDDIR>>/test/benchmark.cc"
[ 11%] Building CXX object CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o
/usr/bin/c++   -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -MF CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o.d -o CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -c "/<<PKGBUILDDIR>>/test/test_math_helpers.cc"
[ 17%] Building CXX object CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o
/usr/bin/c++   -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o -MF CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o.d -o CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o -c "/<<PKGBUILDDIR>>/eight_bit_int_gemm/eight_bit_int_gemm.cc"
[ 23%] Building CXX object CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o
/usr/bin/c++   -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -DBENCHMARK_8bit -DBENCHMARK_QUICK -std=gnu++11 -MD -MT CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -MF CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o.d -o CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -c "/<<PKGBUILDDIR>>/test/benchmark_all_sizes.cc"
/<<PKGBUILDDIR>>/test/benchmark.cc:36:2: warning: #warning "Building without NEON support on ARM, check your compiler setup!" [-Wcpp]
   36 | #warning "Building without NEON support on ARM, check your compiler setup!"
      |  ^~~~~~~
In file included from /usr/include/c++/11/bits/stl_algo.h:61,
                 from /usr/include/c++/11/algorithm:62,
                 from /<<PKGBUILDDIR>>/test/../public/../internal/../internal/common.h:24,
                 from /<<PKGBUILDDIR>>/test/../public/../internal/../internal/kernel_default.h:22,
                 from /<<PKGBUILDDIR>>/test/../public/../internal/dispatch_gemm_shape.h:20,
                 from /<<PKGBUILDDIR>>/test/../public/gemmlowp.h:19,
                 from /<<PKGBUILDDIR>>/test/test.h:30,
                 from /<<PKGBUILDDIR>>/test/benchmark.cc:29:
/usr/include/c++/11/bits/stl_heap.h: In function 'void std::__adjust_heap(_RandomAccessIterator, _Distance, _Distance, _Tp, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<double*, std::vector<double> >; _Distance = int; _Tp = double; _Compare = __gnu_cxx::__ops::_Iter_less_iter]':
/usr/include/c++/11/bits/stl_heap.h:223:5: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
  223 |     __adjust_heap(_RandomAccessIterator __first, _Distance __holeIndex,
      |     ^~~~~~~~~~~~~
In file included from /usr/include/c++/11/algorithm:62,
                 from /<<PKGBUILDDIR>>/test/../public/../internal/../internal/common.h:24,
                 from /<<PKGBUILDDIR>>/test/../public/../internal/../internal/kernel_default.h:22,
                 from /<<PKGBUILDDIR>>/test/../public/../internal/dispatch_gemm_shape.h:20,
                 from /<<PKGBUILDDIR>>/test/../public/gemmlowp.h:19,
                 from /<<PKGBUILDDIR>>/test/test.h:30,
                 from /<<PKGBUILDDIR>>/test/benchmark.cc:29:
/usr/include/c++/11/bits/stl_algo.h: In function 'void std::__insertion_sort(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<double*, std::vector<double> >; _Compare = __gnu_cxx::__ops::_Iter_less_iter]':
/usr/include/c++/11/bits/stl_algo.h:1819:5: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
 1819 |     __insertion_sort(_RandomAccessIterator __first,
      |     ^~~~~~~~~~~~~~~~
/usr/include/c++/11/bits/stl_algo.h:1819:5: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
/usr/include/c++/11/bits/stl_algo.h:1819:5: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
/usr/include/c++/11/bits/stl_algo.h: In function 'void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<double*, std::vector<double> >; _Size = int; _Compare = __gnu_cxx::__ops::_Iter_less_iter]':
/usr/include/c++/11/bits/stl_algo.h:1925:5: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
 1925 |     __introsort_loop(_RandomAccessIterator __first,
      |     ^~~~~~~~~~~~~~~~
/usr/include/c++/11/bits/stl_algo.h:1925:5: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
/usr/include/c++/11/bits/stl_algo.h:1939:32: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
 1939 |           std::__introsort_loop(__cut, __last, __depth_limit, __comp);
      |           ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/11/vector:72,
                 from /<<PKGBUILDDIR>>/test/benchmark.cc:24:
/usr/include/c++/11/bits/vector.tcc: In member function 'void std::vector<_Tp, _Alloc>::_M_realloc_insert(std::vector<_Tp, _Alloc>::iterator, _Args&& ...) [with _Args = {double&}; _Tp = double; _Alloc = std::allocator<double>]':
/usr/include/c++/11/bits/vector.tcc:426:7: note: parameter passing for argument of type 'std::vector<double>::iterator' changed in GCC 7.1
  426 |       vector<_Tp, _Alloc>::
      |       ^~~~~~~~~~~~~~~~~~~
[ 29%] Linking CXX executable test_math_helpers
/usr/bin/cmake -E cmake_link_script CMakeFiles/test_math_helpers.dir/link.txt --verbose=1
/usr/bin/c++ -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_math_helpers.dir/test/test_math_helpers.cc.o -o test_math_helpers 
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 29%] Built target test_math_helpers
make  -f CMakeFiles/test_blocking_counter.dir/build.make CMakeFiles/test_blocking_counter.dir/depend
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
cd "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles/test_blocking_counter.dir/DependInfo.cmake" --color=
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make  -f CMakeFiles/test_blocking_counter.dir/build.make CMakeFiles/test_blocking_counter.dir/build
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 35%] Building CXX object CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o
/usr/bin/c++   -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -MF CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o.d -o CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -c "/<<PKGBUILDDIR>>/test/test_blocking_counter.cc"
[ 41%] Linking CXX executable test_blocking_counter
/usr/bin/cmake -E cmake_link_script CMakeFiles/test_blocking_counter.dir/link.txt --verbose=1
/usr/bin/c++ -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_blocking_counter.dir/test/test_blocking_counter.cc.o -o test_blocking_counter  -lpthread 
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 41%] Built target test_blocking_counter
make  -f CMakeFiles/test_allocator.dir/build.make CMakeFiles/test_allocator.dir/depend
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
cd "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles/test_allocator.dir/DependInfo.cmake" --color=
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make  -f CMakeFiles/test_allocator.dir/build.make CMakeFiles/test_allocator.dir/build
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 47%] Building CXX object CMakeFiles/test_allocator.dir/test/test_allocator.cc.o
/usr/bin/c++   -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -MF CMakeFiles/test_allocator.dir/test/test_allocator.cc.o.d -o CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -c "/<<PKGBUILDDIR>>/test/test_allocator.cc"
/usr/include/c++/11/bits/vector.tcc: In function 'void gemmlowp::benchmark(gemmlowp::GemmContext*)':
/usr/include/c++/11/bits/vector.tcc:121:28: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
  121 |           _M_realloc_insert(end(), std::forward<_Args>(__args)...);
      |           ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/c++/11/algorithm:62,
                 from /<<PKGBUILDDIR>>/test/../public/../internal/../internal/common.h:24,
                 from /<<PKGBUILDDIR>>/test/../public/../internal/../internal/kernel_default.h:22,
                 from /<<PKGBUILDDIR>>/test/../public/../internal/dispatch_gemm_shape.h:20,
                 from /<<PKGBUILDDIR>>/test/../public/gemmlowp.h:19,
                 from /<<PKGBUILDDIR>>/test/test.h:30,
                 from /<<PKGBUILDDIR>>/test/benchmark.cc:29:
/usr/include/c++/11/bits/stl_algo.h:1954:32: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
 1954 |           std::__introsort_loop(__first, __last,
      |           ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
 1955 |                                 std::__lg(__last - __first) * 2,
      |                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 1956 |                                 __comp);
      |                                 ~~~~~~~
/usr/include/c++/11/bits/stl_algo.h:1866:32: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
 1866 |           std::__insertion_sort(__first, __first + int(_S_threshold), __comp);
      |           ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/11/bits/stl_algo.h:1871:30: note: parameter passing for argument of type '__gnu_cxx::__normal_iterator<double*, std::vector<double> >' changed in GCC 7.1
 1871 |         std::__insertion_sort(__first, __last, __comp);
      |         ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
[ 52%] Linking CXX executable test_allocator
/usr/bin/cmake -E cmake_link_script CMakeFiles/test_allocator.dir/link.txt --verbose=1
/usr/bin/c++ -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_allocator.dir/test/test_allocator.cc.o -o test_allocator 
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 52%] Built target test_allocator
make  -f CMakeFiles/test_fixedpoint.dir/build.make CMakeFiles/test_fixedpoint.dir/depend
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
cd "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles/test_fixedpoint.dir/DependInfo.cmake" --color=
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make  -f CMakeFiles/test_fixedpoint.dir/build.make CMakeFiles/test_fixedpoint.dir/build
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 58%] Building CXX object CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o
/usr/bin/c++   -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -MF CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o.d -o CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -c "/<<PKGBUILDDIR>>/test/test_fixedpoint.cc"
[ 64%] Linking CXX executable benchmark_all_sizes
/usr/bin/cmake -E cmake_link_script CMakeFiles/benchmark_all_sizes.dir/link.txt --verbose=1
/usr/bin/c++ -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/benchmark_all_sizes.dir/test/benchmark_all_sizes.cc.o -o benchmark_all_sizes  -lpthread 
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 64%] Built target benchmark_all_sizes
[ 70%] Linking CXX executable benchmark
/usr/bin/cmake -E cmake_link_script CMakeFiles/benchmark.dir/link.txt --verbose=1
/usr/bin/c++ -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/benchmark.dir/test/benchmark.cc.o -o benchmark  -lpthread 
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 70%] Built target benchmark
[ 76%] Linking CXX executable test_fixedpoint
/usr/bin/cmake -E cmake_link_script CMakeFiles/test_fixedpoint.dir/link.txt --verbose=1
/usr/bin/c++ -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_fixedpoint.dir/test/test_fixedpoint.cc.o -o test_fixedpoint 
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 76%] Built target test_fixedpoint
[ 82%] Linking CXX static library libeight_bit_int_gemm.a
/usr/bin/cmake -P CMakeFiles/eight_bit_int_gemm.dir/cmake_clean_target.cmake
/usr/bin/cmake -E cmake_link_script CMakeFiles/eight_bit_int_gemm.dir/link.txt --verbose=1
/usr/bin/ar qc libeight_bit_int_gemm.a CMakeFiles/eight_bit_int_gemm.dir/eight_bit_int_gemm/eight_bit_int_gemm.cc.o
/usr/bin/ranlib libeight_bit_int_gemm.a
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 82%] Built target eight_bit_int_gemm
make  -f CMakeFiles/test_gemmlowp.dir/build.make CMakeFiles/test_gemmlowp.dir/depend
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
cd "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" && /usr/bin/cmake -E cmake_depends "Unix Makefiles" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles/test_gemmlowp.dir/DependInfo.cmake" --color=
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make  -f CMakeFiles/test_gemmlowp.dir/build.make CMakeFiles/test_gemmlowp.dir/build
make[3]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[ 94%] Building CXX object CMakeFiles/test_gemmlowp.dir/test/test.cc.o
[ 94%] Building CXX object CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o
/usr/bin/c++   -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -MF CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o.d -o CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -c "/<<PKGBUILDDIR>>/test/test_data.cc"
/usr/bin/c++   -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -std=gnu++11 -MD -MT CMakeFiles/test_gemmlowp.dir/test/test.cc.o -MF CMakeFiles/test_gemmlowp.dir/test/test.cc.o.d -o CMakeFiles/test_gemmlowp.dir/test/test.cc.o -c "/<<PKGBUILDDIR>>/test/test.cc"
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::ReferenceKernel<gemmlowp::KernelFormat<gemmlowp::KernelSideFormat<gemmlowp::CellFormat<1, 1>, 1>, gemmlowp::KernelSideFormat<gemmlowp::CellFormat<1, 1>, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::ReferenceKernel<gemmlowp::KernelFormat<gemmlowp::KernelSideFormat<gemmlowp::CellFormat<4, 2>, 1>, gemmlowp::KernelSideFormat<gemmlowp::CellFormat<4, 2>, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::ReferenceKernel<gemmlowp::KernelFormat<gemmlowp::KernelSideFormat<gemmlowp::CellFormat<4, 2>, 4>, gemmlowp::KernelSideFormat<gemmlowp::CellFormat<4, 2>, 5> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::ReferenceKernel<gemmlowp::KernelFormat<gemmlowp::KernelSideFormat<gemmlowp::CellFormat<3, 4, gemmlowp::CellOrder::DepthMajor>, 2>, gemmlowp::KernelSideFormat<gemmlowp::CellFormat<5, 4, gemmlowp::CellOrder::DepthMajor>, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::ReferenceKernel<gemmlowp::KernelFormat<gemmlowp::KernelSideFormat<gemmlowp::CellFormat<3, 4, gemmlowp::CellOrder::WidthMajor>, 2>, gemmlowp::KernelSideFormat<gemmlowp::CellFormat<5, 4, gemmlowp::CellOrder::WidthMajor>, 3> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::ReferenceKernel<gemmlowp::KernelFormat<gemmlowp::KernelSideFormat<gemmlowp::CellFormat<5, 2, gemmlowp::CellOrder::WidthMajor>, 3>, gemmlowp::KernelSideFormat<gemmlowp::CellFormat<4, 2>, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::ReferenceKernel<gemmlowp::KernelFormat<gemmlowp::KernelSideFormat<gemmlowp::CellFormat<5, 2, gemmlowp::CellOrder::DepthMajor>, 3>, gemmlowp::KernelSideFormat<gemmlowp::CellFormat<4, 2, gemmlowp::CellOrder::WidthMajor>, 2> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::ReferenceKernel<gemmlowp::KernelFormat<gemmlowp::KernelSideFormat<gemmlowp::CellFormat<8, 8, gemmlowp::CellOrder::Diagonal>, 2>, gemmlowp::KernelSideFormat<gemmlowp::CellFormat<3, 8, gemmlowp::CellOrder::WidthMajor>, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::ReferenceKernel<gemmlowp::KernelFormat<gemmlowp::KernelSideFormat<gemmlowp::CellFormat<1, 4, gemmlowp::CellOrder::DepthMajor>, 1>, gemmlowp::KernelSideFormat<gemmlowp::CellFormat<4, 4, gemmlowp::CellOrder::Diagonal>, 1> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::SingleThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::DefaultKernel<gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:123:59: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 230 [-Wformat-truncation=]
  123 |     snprintf(buf, sizeof(buf), "SingleThreadGemm, Kernel: %s", Kernel().Name());
      |                                                           ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 27 and 282 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::DefaultKernel<gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<0, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::SingleThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::DefaultKernel<gemmlowp::BitDepthParams<gemmlowp::OperandRange<1, 255>, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<1, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:123:59: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 230 [-Wformat-truncation=]
  123 |     snprintf(buf, sizeof(buf), "SingleThreadGemm, Kernel: %s", Kernel().Name());
      |                                                           ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 27 and 282 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
/<<PKGBUILDDIR>>/test/test.cc: In static member function 'static const char* gemmlowp::MultiThreadGemmWrapper<Kernel, Scalar, tBitDepthParams>::Name() [with Kernel = gemmlowp::DefaultKernel<gemmlowp::BitDepthParams<gemmlowp::OperandRange<1, 255>, gemmlowp::OperandRange<0, 255> > >; Scalar = unsigned char; tBitDepthParams = gemmlowp::BitDepthParams<gemmlowp::OperandRange<1, 255>, gemmlowp::OperandRange<0, 255> >]':
/<<PKGBUILDDIR>>/test/test.cc:163:58: warning: '%s' directive output may be truncated writing up to 255 bytes into a region of size 231 [-Wformat-truncation=]
  163 |     snprintf(buf, sizeof(buf), "MultiThreadGemm, Kernel: %s", Kernel().Name());
      |                                                          ^~
In file included from /usr/include/stdio.h:866,
                 from /usr/include/c++/11/cstdio:42,
                 from /usr/include/c++/11/ext/string_conversions.h:43,
                 from /usr/include/c++/11/bits/basic_string.h:6608,
                 from /usr/include/c++/11/string:55,
                 from /usr/include/c++/11/bits/locale_classes.h:40,
                 from /usr/include/c++/11/bits/ios_base.h:41,
                 from /usr/include/c++/11/ios:42,
                 from /usr/include/c++/11/ostream:38,
                 from /usr/include/c++/11/iostream:39,
                 from /<<PKGBUILDDIR>>/test/test.h:26,
                 from /<<PKGBUILDDIR>>/test/test.cc:15:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:71:35: note: '__builtin___snprintf_chk' output between 26 and 281 bytes into a destination of size 256
   71 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   72 |                                    __glibc_objsize (__s), __fmt,
      |                                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   73 |                                    __va_arg_pack ());
      |                                    ~~~~~~~~~~~~~~~~~
[100%] Linking CXX executable test_gemmlowp
/usr/bin/cmake -E cmake_link_script CMakeFiles/test_gemmlowp.dir/link.txt --verbose=1
/usr/bin/c++ -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-z,relro -Wl,-z,now -Wl,--as-needed CMakeFiles/test_gemmlowp.dir/test/test.cc.o CMakeFiles/test_gemmlowp.dir/test/test_data.cc.o -o test_gemmlowp  libeight_bit_int_gemm.a -lpthread 
make[3]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
[100%] Built target test_gemmlowp
make[2]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
/usr/bin/cmake -E cmake_progress_start "/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/CMakeFiles" 0
make[1]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
   dh_auto_test -a -O-Scmake
	cd obj-arm-linux-gnueabihf && make -j4 test ARGS\+=--verbose ARGS\+=-j4
make[1]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
Running tests...
/usr/bin/ctest --force-new-ctest-process --verbose -j4
UpdateCTestConfiguration  from :/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/DartConfiguration.tcl
Parse Config file:/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/DartConfiguration.tcl
UpdateCTestConfiguration  from :/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/DartConfiguration.tcl
Parse Config file:/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/DartConfiguration.tcl
Test project /<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1
    Start 1: test_math_helpers

1: Test command: /<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/test_math_helpers
1: Test timeout computed to be: 1500
test 2
    Start 2: test_blocking_counter

2: Test command: /<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/test_blocking_counter
2: Test timeout computed to be: 1500
test 3
    Start 3: test_allocator

3: Test command: /<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/test_allocator
3: Test timeout computed to be: 1500
test 4
    Start 4: test_fixedpoint

4: Test command: /<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/test_fixedpoint
4: Test timeout computed to be: 1500
1/5 Test #3: test_allocator ...................   Passed    0.00 sec
test 5
    Start 5: test_gemmlowp

5: Test command: /<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf/test_gemmlowp
5: Test timeout computed to be: 1500
5: TestWithSmallData: PASS
5:     number of matrix entries: 8
5:     median value: 136
5:     median unsigned diff: 0 (tolerating 0)
5:     max unsigned diff: 0 (tolerating 0)
5:     median signed diff: 0 (tolerating 0)
5:     mean signed diff: 0 (tolerating 0)
5: No error: 100.00 % of entries
5: Error in 1..1 range: 0.00 % of entries
5: Error in 2..3 range: 0.00 % of entries
5: Error in 4..7 range: 0.00 % of entries
5: Error in 8..15 range: 0.00 % of entries
5: Error in 16..31 range: 0.00 % of entries
5: Error in 32..63 range: 0.00 % of entries
5: Error in 64..127 range: 0.00 % of entries
5: Error in 128..255 range: 0.00 % of entries
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
2/5 Test #2: test_blocking_counter ............   Passed    0.19 sec
3/5 Test #1: test_math_helpers ................   Passed    0.19 sec
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
4: PASS (Scalar int32)
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
4: PASS (Scalar int16)
4/5 Test #4: test_fixedpoint ..................   Passed    1.42 sec
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 6
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 10
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 10
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 6
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 6
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 6
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 6
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 10
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, SingleThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x16 WidthMajor, Rhs: 1 cells 16x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, public Gemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 10
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 10
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 5x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 321x123x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x1 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x1 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 6
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 6
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 6
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 6
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 8
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 10
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 10
5: PASS: 2x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x2x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 8
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 1x1x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 6x6x6 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 3x5x7 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 12
5: PASS: 7x3x5 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 5x7x3 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 10
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 14
5: PASS: 8x8x8 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x16x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 32x32x32 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 64x64x64 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 16
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 128x128x128 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 16x17x16 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 12
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 16
5: PASS: 37x55x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 57x87x117 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 93x83x73 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 109x89x99 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 78x101x82 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 512x512x512 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1024x1024x1024 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 567x2345x123 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 100x5000x100 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1x1000x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 1000x1x1000 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 16
5: PASS: 1000x1000x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 22
5: PASS: 777x3456x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 24
5: PASS: 4567x555x1 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/0/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/10/0, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/10, mult 1, shift 14
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 0/0/0, mult 10, shift 16
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 10/10/10, mult 10, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets 256/1/17, mult 4, shift 18
5: PASS: 70x90x110 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 18
5: PASS: 300x400x500 ColMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> ColMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x ColMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 ColMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 300x400x500 RowMajor x RowMajor -> RowMajor, EightBitIntGemm, offsets -75/-91/74980, mult 123, shift 20
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x1 DepthMajor, Rhs: 1 cells 1x1 DepthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 4x2 DepthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 4 cells 4x2 DepthMajor, Rhs: 5 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 DepthMajor, Rhs: 3 cells 4x5 DepthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 3x4 WidthMajor, Rhs: 3 cells 4x5 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 WidthMajor, Rhs: 2 cells 2x4 DepthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 3 cells 5x2 DepthMajor, Rhs: 2 cells 2x4 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 10/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/10/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/10, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 0/0/0, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 10/10/10, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets 256/1/17, mult 4, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 2 cells 8x8 Diagonal, Rhs: 1 cells 8x3 WidthMajor), offsets -75/-91/74980, mult 123, shift 24
5: PASS: 1x1x1 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 2x2x2 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 3x3x3 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 4x4x4 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 5x5x5 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 10/0/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/10/0, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/10, mult 1, shift 12
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 0/0/0, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 10/10/10, mult 10, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets 256/1/17, mult 4, shift 16
5: PASS: 50x50x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 18
5: PASS: 200x200x200 ColMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x ColMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 ColMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 200x200x200 RowMajor x RowMajor -> RowMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 20
5: PASS: 50x5000x50 RowMajor x ColMajor -> ColMajor, MultiThreadGemm, Kernel: reference(Lhs: 1 cells 1x4 DepthMajor, Rhs: 1 cells 4x4 Diagonal), offsets -75/-91/74980, mult 123, shift 24
5: TestWithRealData: PASS with Lhs: 8 bit, Rhs: 8 bit
5:     number of matrix entries: 49152
5:     median value: 104
5:     median unsigned diff: 0 (tolerating 0)
5:     max unsigned diff: 0 (tolerating 0)
5:     median signed diff: 0 (tolerating 0)
5:     mean signed diff: 0 (tolerating 0)
5: No error: 100.00 % of entries
5: Error in 1..1 range: 0.00 % of entries
5: Error in 2..3 range: 0.00 % of entries
5: Error in 4..7 range: 0.00 % of entries
5: Error in 8..15 range: 0.00 % of entries
5: Error in 16..31 range: 0.00 % of entries
5: Error in 32..63 range: 0.00 % of entries
5: Error in 64..127 range: 0.00 % of entries
5: Error in 128..255 range: 0.00 % of entries
5: TestWithRealData: PASS with (legacy, no longer requantizing) Lhs: 7 bit, Rhs: 5 bit
5:     number of matrix entries: 49152
5:     median value: 104
5:     median unsigned diff: 0 (tolerating 2)
5:     max unsigned diff: 0 (tolerating 10)
5:     median signed diff: 0 (tolerating 0)
5:     mean signed diff: 0 (tolerating 0.2)
5: No error: 100.00 % of entries
5: Error in 1..1 range: 0.00 % of entries
5: Error in 2..3 range: 0.00 % of entries
5: Error in 4..7 range: 0.00 % of entries
5: Error in 8..15 range: 0.00 % of entries
5: Error in 16..31 range: 0.00 % of entries
5: Error in 32..63 range: 0.00 % of entries
5: Error in 64..127 range: 0.00 % of entries
5: Error in 128..255 range: 0.00 % of entries
5: TestOutputStages: PASS with ResultOrder=RowMajor
5: TestOutputStages: PASS with ResultOrder=ColMajor
5: TestOutputStages: PASS with ResultOrder=RowMajor
5: TestOutputStages: PASS with ResultOrder=ColMajor
5: TestOutputStages: PASS with ResultOrder=RowMajor
5: TestOutputStages: PASS with ResultOrder=ColMajor
5: TestOutputStages: PASS with ResultOrder=RowMajor
5: TestOutputStages: PASS with ResultOrder=ColMajor
5: TestWithSmallDataPerChannelQuantization: PASS
5:     number of matrix entries: 18
5:     median value: 127
5:     median unsigned diff: 0 (tolerating 0)
5:     max unsigned diff: 0 (tolerating 0)
5:     median signed diff: 0 (tolerating 0)
5:     mean signed diff: 0 (tolerating 0)
5: No error: 100.00 % of entries
5: Error in 1..1 range: 0.00 % of entries
5: Error in 2..3 range: 0.00 % of entries
5: Error in 4..7 range: 0.00 % of entries
5: Error in 8..15 range: 0.00 % of entries
5: Error in 16..31 range: 0.00 % of entries
5: Error in 32..63 range: 0.00 % of entries
5: Error in 64..127 range: 0.00 % of entries
5: Error in 128..255 range: 0.00 % of entries
5: TestWithLargeDataPerChannelQuantization: PASS
5:     number of matrix entries: 550
5:     median value: 7
5:     median unsigned diff: 0 (tolerating 0)
5:     max unsigned diff: 0 (tolerating 0)
5:     median signed diff: 0 (tolerating 0)
5:     mean signed diff: 0 (tolerating 0)
5: No error: 100.00 % of entries
5: Error in 1..1 range: 0.00 % of entries
5: Error in 2..3 range: 0.00 % of entries
5: Error in 4..7 range: 0.00 % of entries
5: Error in 8..15 range: 0.00 % of entries
5: Error in 16..31 range: 0.00 % of entries
5: Error in 32..63 range: 0.00 % of entries
5: Error in 64..127 range: 0.00 % of entries
5: Error in 128..255 range: 0.00 % of entries
5: TestMultithreadedPerChannelQuantization: PASS
5:     number of matrix entries: 1280
5:     median value: 0
5:     median unsigned diff: 0 (tolerating 0)
5:     max unsigned diff: 0 (tolerating 0)
5:     median signed diff: 0 (tolerating 0)
5:     mean signed diff: 0 (tolerating 0)
5: No error: 100.00 % of entries
5: Error in 1..1 range: 0.00 % of entries
5: Error in 2..3 range: 0.00 % of entries
5: Error in 4..7 range: 0.00 % of entries
5: Error in 8..15 range: 0.00 % of entries
5: Error in 16..31 range: 0.00 % of entries
5: Error in 32..63 range: 0.00 % of entries
5: Error in 64..127 range: 0.00 % of entries
5: Error in 128..255 range: 0.00 % of entries
5: All tests passed.
5/5 Test #5: test_gemmlowp ....................   Passed  480.40 sec

100% tests passed, 0 tests failed out of 5

Total Test time (real) = 480.41 sec
make[1]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
   create-stamp debian/debhelper-build-stamp
 fakeroot debian/rules binary-arch
dh binary-arch -Scmake
   dh_testroot -a -O-Scmake
   dh_prep -a -O-Scmake
   dh_auto_install --destdir=debian/libgemmlowp-dev/ -a -O-Scmake
	cd obj-arm-linux-gnueabihf && make -j4 install DESTDIR=/<<BUILDDIR>>/gemmlowp-0.0\~git20211220.e844ffd/debian/libgemmlowp-dev AM_UPDATE_INFO_DIR=no "INSTALL=install --strip-program=true"
make[1]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
/usr/bin/cmake -S"/<<PKGBUILDDIR>>" -B"/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf" --check-build-system CMakeFiles/Makefile.cmake 0
make  -f CMakeFiles/Makefile2 preinstall
make[2]: Entering directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
make[2]: Nothing to be done for 'preinstall'.
make[2]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
Install the project...
/usr/bin/cmake -P cmake_install.cmake
-- Install configuration: "None"
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/eight_bit_int_gemm/eight_bit_int_gemm.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/base.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_common.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_gemm.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_multi_thread_gemv.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_operations_common.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/legacy_single_thread_gemm.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_common.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_gemm.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/multi_thread_transform.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels_arm_32.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/quantized_mul_kernels_arm_64.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/single_thread_gemm.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/single_thread_transform.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams_arm_32.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/streams_arm_64.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels_arm_32.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/meta/transform_kernels_arm_64.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/public/bit_depth.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/public/gemmlowp.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/public/map.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/public/output_stages.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/instrumentation.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/profiler.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/profiling/pthread_everywhere.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/allocator.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/block_params.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/common.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/compute.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/detect_platform.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/dispatch_gemm_shape.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_avx.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_default.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_msa.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_neon.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_reference.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/kernel_sse.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/multi_thread_gemm.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_avx.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_msa.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_neon.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/output_sse.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_avx.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_msa.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_neon.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/pack_sse.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/platform.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_common_neon_sse.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_msa.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_neon.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/simd_wrappers_sse.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/single_thread_gemm.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/internal/unpack.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_avx.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_msa.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_neon.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_sse.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/include/gemmlowp/fixedpoint/fixedpoint_wasmsimd.h
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/lib/arm-linux-gnueabihf/libeight_bit_int_gemm.a
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/lib/arm-linux-gnueabihf/cmake/gemmlowp/gemmlowp-config.cmake
-- Installing: /<<PKGBUILDDIR>>/debian/libgemmlowp-dev/usr/lib/arm-linux-gnueabihf/cmake/gemmlowp/gemmlowp-config-none.cmake
make[1]: Leaving directory '/<<PKGBUILDDIR>>/obj-arm-linux-gnueabihf'
   dh_install -a -O-Scmake
   debian/rules override_dh_installdocs
make[1]: Entering directory '/<<PKGBUILDDIR>>'
mkdir -p debian/libgemmlowp-dev/usr/share/doc/libgemmlowp-dev/meta/
install meta/README debian/libgemmlowp-dev/usr/share/doc/libgemmlowp-dev/meta/
dh_installdocs
make[1]: Leaving directory '/<<PKGBUILDDIR>>'
   dh_installchangelogs -a -O-Scmake
   dh_installexamples -a -O-Scmake
   dh_installinit -a -O-Scmake
   dh_perl -a -O-Scmake
   dh_link -a -O-Scmake
   dh_strip_nondeterminism -a -O-Scmake
   dh_compress -a -O-Scmake
   dh_fixperms -a -O-Scmake
   dh_missing -a -O-Scmake
   dh_dwz -a -O-Scmake
   dh_strip -a -O-Scmake
   dh_makeshlibs -a -O-Scmake
   dh_shlibdeps -a -O-Scmake
   dh_installdeb -a -O-Scmake
   dh_gencontrol -a -O-Scmake
   dh_md5sums -a -O-Scmake
   dh_builddeb -a -O-Scmake
dpkg-deb: building package 'libgemmlowp-dev' in '../libgemmlowp-dev_0.0~git20211220.e844ffd-1_armhf.deb'.
 dpkg-genbuildinfo --build=any -O../gemmlowp_0.0~git20211220.e844ffd-1_armhf.buildinfo
 dpkg-genchanges --build=any -mRaspbian mythic lxc autobuilder 1 <root@raspbian.org> -O../gemmlowp_0.0~git20211220.e844ffd-1_armhf.changes
dpkg-genchanges: info: binary-only arch-specific upload (source code and arch-indep packages not included)
 dpkg-source --after-build .
dpkg-buildpackage: info: binary-only upload (no source included)
--------------------------------------------------------------------------------
Build finished at 2022-06-29T13:09:11Z

Finished
--------

I: Built successfully

+------------------------------------------------------------------------------+
| Post Build Chroot                                                            |
+------------------------------------------------------------------------------+


+------------------------------------------------------------------------------+
| Changes                                                                      |
+------------------------------------------------------------------------------+


gemmlowp_0.0~git20211220.e844ffd-1_armhf.changes:
-------------------------------------------------

Format: 1.8
Date: Thu, 23 Jun 2022 22:56:13 -0700
Source: gemmlowp
Binary: libgemmlowp-dev
Architecture: armhf
Version: 0.0~git20211220.e844ffd-1
Distribution: bookworm-staging
Urgency: medium
Maintainer: Raspbian mythic lxc autobuilder 1 <root@raspbian.org>
Changed-By: Mo Zhou <lumin@debian.org>
Description:
 libgemmlowp-dev - small self-contained low-precision GEMM library
Changes:
 gemmlowp (0.0~git20211220.e844ffd-1) unstable; urgency=medium
 .
   * New upstream version 0.0~git20211220.e844ffd
Checksums-Sha1:
 d236c0b63d6628b99353b3398ad1de5428a1864c 5599 gemmlowp_0.0~git20211220.e844ffd-1_armhf.buildinfo
 a6e52f6961a517ccd8aa538f7a51be241a711eed 566544 libgemmlowp-dev_0.0~git20211220.e844ffd-1_armhf.deb
Checksums-Sha256:
 a939ab42eb9286b2a853f7a48f375341d6695c5c3282995c17f4a4caff6d311d 5599 gemmlowp_0.0~git20211220.e844ffd-1_armhf.buildinfo
 61ea56f2721e7542818fe53dc71239483e6c2dc6bdee7201fb983a90040695aa 566544 libgemmlowp-dev_0.0~git20211220.e844ffd-1_armhf.deb
Files:
 a6dbdcbe7fa2b3efebac4a2185d6997d 5599 science optional gemmlowp_0.0~git20211220.e844ffd-1_armhf.buildinfo
 067edcbab5debb6450e76842bf39a0ec 566544 libdevel optional libgemmlowp-dev_0.0~git20211220.e844ffd-1_armhf.deb

+------------------------------------------------------------------------------+
| Package contents                                                             |
+------------------------------------------------------------------------------+


libgemmlowp-dev_0.0~git20211220.e844ffd-1_armhf.deb
---------------------------------------------------

 new Debian package, version 2.0.
 size 566544 bytes: control archive=4180 bytes.
    1057 bytes,    24 lines      control              
   10126 bytes,   116 lines      md5sums              
 Package: libgemmlowp-dev
 Source: gemmlowp
 Version: 0.0~git20211220.e844ffd-1
 Architecture: armhf
 Maintainer: Debian Science Maintainers <debian-science-maintainers@lists.alioth.debian.org>
 Installed-Size: 5901
 Section: libdevel
 Priority: optional
 Multi-Arch: foreign
 Homepage: https://github.com/google/gemmlowp
 Description: small self-contained low-precision GEMM library
  This is not a full linear algebra library, only a GEMM library: it only does
  general matrix multiplication ("GEMM").
  .
  Its performance goals differ from typical GEMM performance goals in the
  following ways:
  1. It cares not only about speed, but also about minimizing power usage.
     It specifically cares about charge usage in mobile/embedded devices.
  2. Most GEMMs are optimized primarily for large dense matrix sizes (>= 1000).
     It does care about large sizes, but it also cares specifically about the
     typically smaller matrix sizes encountered in various mobile applications.
  .
  Keep in mind (previous section) that gemmlowp itself is a pure-headers-only
  library.

drwxr-xr-x root/root         0 2022-06-24 05:56 ./
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/include/
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/include/gemmlowp/
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/include/gemmlowp/eight_bit_int_gemm/
-rw-r--r-- root/root      3418 2021-12-20 17:33 ./usr/include/gemmlowp/eight_bit_int_gemm/eight_bit_int_gemm.h
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/include/gemmlowp/fixedpoint/
-rw-r--r-- root/root     35769 2021-12-20 17:33 ./usr/include/gemmlowp/fixedpoint/fixedpoint.h
-rw-r--r-- root/root     11184 2021-12-20 17:33 ./usr/include/gemmlowp/fixedpoint/fixedpoint_avx.h
-rw-r--r-- root/root     12541 2021-12-20 17:33 ./usr/include/gemmlowp/fixedpoint/fixedpoint_msa.h
-rw-r--r-- root/root      9073 2021-12-20 17:33 ./usr/include/gemmlowp/fixedpoint/fixedpoint_neon.h
-rw-r--r-- root/root     11142 2021-12-20 17:33 ./usr/include/gemmlowp/fixedpoint/fixedpoint_sse.h
-rw-r--r-- root/root     11257 2021-12-20 17:33 ./usr/include/gemmlowp/fixedpoint/fixedpoint_wasmsimd.h
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/include/gemmlowp/internal/
-rw-r--r-- root/root      6328 2021-12-20 17:33 ./usr/include/gemmlowp/internal/allocator.h
-rw-r--r-- root/root      6768 2021-12-20 17:33 ./usr/include/gemmlowp/internal/block_params.h
-rw-r--r-- root/root      6676 2021-12-20 17:33 ./usr/include/gemmlowp/internal/common.h
-rw-r--r-- root/root      4299 2021-12-20 17:33 ./usr/include/gemmlowp/internal/compute.h
-rw-r--r-- root/root      4996 2021-12-20 17:33 ./usr/include/gemmlowp/internal/detect_platform.h
-rw-r--r-- root/root      8036 2021-12-20 17:33 ./usr/include/gemmlowp/internal/dispatch_gemm_shape.h
-rw-r--r-- root/root      9218 2021-12-20 17:33 ./usr/include/gemmlowp/internal/kernel.h
-rw-r--r-- root/root     19165 2021-12-20 17:33 ./usr/include/gemmlowp/internal/kernel_avx.h
-rw-r--r-- root/root      4847 2021-12-20 17:33 ./usr/include/gemmlowp/internal/kernel_default.h
-rw-r--r-- root/root     23748 2021-12-20 17:33 ./usr/include/gemmlowp/internal/kernel_msa.h
-rw-r--r-- root/root     75739 2021-12-20 17:33 ./usr/include/gemmlowp/internal/kernel_neon.h
-rw-r--r-- root/root      4837 2021-12-20 17:33 ./usr/include/gemmlowp/internal/kernel_reference.h
-rw-r--r-- root/root     18968 2021-12-20 17:33 ./usr/include/gemmlowp/internal/kernel_sse.h
-rw-r--r-- root/root     28223 2021-12-20 17:33 ./usr/include/gemmlowp/internal/multi_thread_gemm.h
-rw-r--r-- root/root     22604 2021-12-20 17:33 ./usr/include/gemmlowp/internal/output.h
-rw-r--r-- root/root       763 2021-12-20 17:33 ./usr/include/gemmlowp/internal/output_avx.h
-rw-r--r-- root/root     44686 2021-12-20 17:33 ./usr/include/gemmlowp/internal/output_msa.h
-rw-r--r-- root/root     35994 2021-12-20 17:33 ./usr/include/gemmlowp/internal/output_neon.h
-rw-r--r-- root/root     20105 2021-12-20 17:33 ./usr/include/gemmlowp/internal/output_sse.h
-rw-r--r-- root/root     17975 2021-12-20 17:33 ./usr/include/gemmlowp/internal/pack.h
-rw-r--r-- root/root     11519 2021-12-20 17:33 ./usr/include/gemmlowp/internal/pack_avx.h
-rw-r--r-- root/root     18939 2021-12-20 17:33 ./usr/include/gemmlowp/internal/pack_msa.h
-rw-r--r-- root/root     15055 2021-12-20 17:33 ./usr/include/gemmlowp/internal/pack_neon.h
-rw-r--r-- root/root      4972 2021-12-20 17:33 ./usr/include/gemmlowp/internal/pack_sse.h
-rw-r--r-- root/root      2977 2021-12-20 17:33 ./usr/include/gemmlowp/internal/platform.h
-rw-r--r-- root/root     25588 2021-12-20 17:33 ./usr/include/gemmlowp/internal/simd_wrappers.h
-rw-r--r-- root/root     31387 2021-12-20 17:33 ./usr/include/gemmlowp/internal/simd_wrappers_common_neon_sse.h
-rw-r--r-- root/root      5642 2021-12-20 17:33 ./usr/include/gemmlowp/internal/simd_wrappers_msa.h
-rw-r--r-- root/root     19188 2021-12-20 17:33 ./usr/include/gemmlowp/internal/simd_wrappers_neon.h
-rw-r--r-- root/root      4257 2021-12-20 17:33 ./usr/include/gemmlowp/internal/simd_wrappers_sse.h
-rw-r--r-- root/root      5586 2021-12-20 17:33 ./usr/include/gemmlowp/internal/single_thread_gemm.h
-rw-r--r-- root/root     12594 2021-12-20 17:33 ./usr/include/gemmlowp/internal/unpack.h
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/include/gemmlowp/meta/
-rw-r--r-- root/root      3960 2021-12-20 17:33 ./usr/include/gemmlowp/meta/base.h
-rw-r--r-- root/root      5384 2021-12-20 17:33 ./usr/include/gemmlowp/meta/legacy_multi_thread_common.h
-rw-r--r-- root/root     11396 2021-12-20 17:33 ./usr/include/gemmlowp/meta/legacy_multi_thread_gemm.h
-rw-r--r-- root/root      6992 2021-12-20 17:33 ./usr/include/gemmlowp/meta/legacy_multi_thread_gemv.h
-rw-r--r-- root/root      1850 2021-12-20 17:33 ./usr/include/gemmlowp/meta/legacy_operations_common.h
-rw-r--r-- root/root      9600 2021-12-20 17:33 ./usr/include/gemmlowp/meta/legacy_single_thread_gemm.h
-rw-r--r-- root/root      1593 2021-12-20 17:33 ./usr/include/gemmlowp/meta/multi_thread_common.h
-rw-r--r-- root/root      5253 2021-12-20 17:33 ./usr/include/gemmlowp/meta/multi_thread_gemm.h
-rw-r--r-- root/root      3519 2021-12-20 17:33 ./usr/include/gemmlowp/meta/multi_thread_transform.h
-rw-r--r-- root/root      5759 2021-12-20 17:33 ./usr/include/gemmlowp/meta/quantized_mul_kernels.h
-rw-r--r-- root/root    131368 2021-12-20 17:33 ./usr/include/gemmlowp/meta/quantized_mul_kernels_arm_32.h
-rw-r--r-- root/root    130137 2021-12-20 17:33 ./usr/include/gemmlowp/meta/quantized_mul_kernels_arm_64.h
-rw-r--r-- root/root     25668 2021-12-20 17:33 ./usr/include/gemmlowp/meta/single_thread_gemm.h
-rw-r--r-- root/root      2957 2021-12-20 17:33 ./usr/include/gemmlowp/meta/single_thread_transform.h
-rw-r--r-- root/root     11049 2021-12-20 17:33 ./usr/include/gemmlowp/meta/streams.h
-rw-r--r-- root/root    390785 2021-12-20 17:33 ./usr/include/gemmlowp/meta/streams_arm_32.h
-rw-r--r-- root/root    410715 2021-12-20 17:33 ./usr/include/gemmlowp/meta/streams_arm_64.h
-rw-r--r-- root/root      7317 2021-12-20 17:33 ./usr/include/gemmlowp/meta/transform_kernels.h
-rw-r--r-- root/root    247365 2021-12-20 17:33 ./usr/include/gemmlowp/meta/transform_kernels_arm_32.h
-rw-r--r-- root/root    260838 2021-12-20 17:33 ./usr/include/gemmlowp/meta/transform_kernels_arm_64.h
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/include/gemmlowp/profiling/
-rw-r--r-- root/root      6543 2021-12-20 17:33 ./usr/include/gemmlowp/profiling/instrumentation.h
-rw-r--r-- root/root     11853 2021-12-20 17:33 ./usr/include/gemmlowp/profiling/profiler.h
-rw-r--r-- root/root      3313 2021-12-20 17:33 ./usr/include/gemmlowp/profiling/pthread_everywhere.h
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/include/gemmlowp/public/
-rw-r--r-- root/root      2616 2021-12-20 17:33 ./usr/include/gemmlowp/public/bit_depth.h
-rw-r--r-- root/root      4314 2021-12-20 17:33 ./usr/include/gemmlowp/public/gemmlowp.h
-rw-r--r-- root/root      4422 2021-12-20 17:33 ./usr/include/gemmlowp/public/map.h
-rw-r--r-- root/root     11283 2021-12-20 17:33 ./usr/include/gemmlowp/public/output_stages.h
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/lib/
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/lib/arm-linux-gnueabihf/
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/lib/arm-linux-gnueabihf/cmake/
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/lib/arm-linux-gnueabihf/cmake/gemmlowp/
-rw-r--r-- root/root       950 2022-06-24 05:56 ./usr/lib/arm-linux-gnueabihf/cmake/gemmlowp/gemmlowp-config-none.cmake
-rw-r--r-- root/root      4187 2022-06-24 05:56 ./usr/lib/arm-linux-gnueabihf/cmake/gemmlowp/gemmlowp-config.cmake
-rw-r--r-- root/root   1036426 2022-06-24 05:56 ./usr/lib/arm-linux-gnueabihf/libeight_bit_int_gemm.a
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/share/
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/share/doc/
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/share/doc/libgemmlowp-dev/
-rw-r--r-- root/root       390 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/AUTHORS
-rw-r--r-- root/root      1977 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/CONTRIBUTING
-rw-r--r-- root/root      1208 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/CONTRIBUTORS
-rw-r--r-- root/root      3774 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/README.md.gz
-rw-r--r-- root/root      1010 2022-06-24 05:56 ./usr/share/doc/libgemmlowp-dev/changelog.Debian.gz
-rw-r--r-- root/root      1864 2022-06-24 05:51 ./usr/share/doc/libgemmlowp-dev/copyright
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/share/doc/libgemmlowp-dev/doc/
-rw-r--r-- root/root      2479 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/doc/design.md.gz
-rw-r--r-- root/root      2399 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/doc/kernel.md.gz
-rw-r--r-- root/root      5820 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/doc/less-than-8-bit.md.gz
-rw-r--r-- root/root      3106 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/doc/low-precision.md.gz
-rw-r--r-- root/root      2022 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/doc/output.md
-rw-r--r-- root/root      3499 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/doc/packing.md.gz
-rw-r--r-- root/root      2650 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/doc/public.md.gz
-rw-r--r-- root/root      4673 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/doc/quantization.md.gz
-rw-r--r-- root/root      4572 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/doc/quantization_example.cc.gz
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/share/doc/libgemmlowp-dev/examples/
-rw-r--r-- root/root     15511 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/eight_bit_int_gemm.cc
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/
-rw-r--r-- root/root     12806 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/benchmark.cc
-rw-r--r-- root/root     11382 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/benchmark_all_sizes.cc
-rw-r--r-- root/root     10839 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/benchmark_meta_gemm.cc
-rw-r--r-- root/root     12064 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/correctness_meta_gemm.cc
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test.xcodeproj/
-rw-r--r-- root/root     29285 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test.xcodeproj/project.pbxproj
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/
-rw-r--r-- root/root       279 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/AppDelegate.h
-rw-r--r-- root/root      2149 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/AppDelegate.mm
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/Base.lproj/
-rw-r--r-- root/root      3708 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/Base.lproj/LaunchScreen.xib
-rw-r--r-- root/root      1575 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/Base.lproj/Main.storyboard
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/Images.xcassets/
drwxr-xr-x root/root         0 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/Images.xcassets/AppIcon.appiconset/
-rw-r--r-- root/root      1077 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/Images.xcassets/AppIcon.appiconset/Contents.json
-rw-r--r-- root/root      1511 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/Info.plist
-rw-r--r-- root/root       219 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/ViewController.h
-rw-r--r-- root/root       492 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/ViewController.m
-rw-r--r-- root/root       334 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/ios/gemmlowp_test/main.m
-rw-r--r-- root/root     75539 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/test.cc
-rw-r--r-- root/root      4448 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/test.h
-rw-r--r-- root/root      2115 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/test_allocator.cc
-rw-r--r-- root/root      4408 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/test_blocking_counter.cc
-rw-r--r-- root/root   2296359 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/test_data.cc
-rw-r--r-- root/root      1288 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/test_data.h
-rw-r--r-- root/root     22285 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/test_fixedpoint.cc
-rw-r--r-- root/root      4144 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/examples/test/test_math_helpers.cc
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/share/doc/libgemmlowp-dev/meta/
-rw-r--r-- root/root      3657 2022-06-24 05:56 ./usr/share/doc/libgemmlowp-dev/meta/README
drwxr-xr-x root/root         0 2022-06-24 05:56 ./usr/share/doc/libgemmlowp-dev/todo/
-rw-r--r-- root/root      1605 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/todo/armv8-64bit-kernel-for-less-than-8-bit.txt
-rw-r--r-- root/root      3277 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/todo/error-diffusion-experiments.txt
-rw-r--r-- root/root      6232 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/todo/fast-gemv.txt.gz
-rw-r--r-- root/root       962 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/todo/less-than-8-bit-without-requantization.txt
-rw-r--r-- root/root      2338 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/todo/multi-threading-experiments.txt.gz
-rw-r--r-- root/root      1114 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/todo/neon-depth-major-sources-packing.txt
-rw-r--r-- root/root       802 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/todo/remove-default-template-param-values.txt
-rw-r--r-- root/root      1666 2021-12-20 17:33 ./usr/share/doc/libgemmlowp-dev/todo/x86-kernels.txt


+------------------------------------------------------------------------------+
| Post Build                                                                   |
+------------------------------------------------------------------------------+


+------------------------------------------------------------------------------+
| Cleanup                                                                      |
+------------------------------------------------------------------------------+

Purging /<<BUILDDIR>>
Not cleaning session: cloned chroot in use

+------------------------------------------------------------------------------+
| Summary                                                                      |
+------------------------------------------------------------------------------+

Build Architecture: armhf
Build-Space: 81104
Build-Time: 817
Distribution: bookworm-staging
Host Architecture: armhf
Install-Time: 244
Job: gemmlowp_0.0~git20211220.e844ffd-1
Machine Architecture: armhf
Package: gemmlowp
Package-Time: 1079
Source-Version: 0.0~git20211220.e844ffd-1
Space: 81104
Status: successful
Version: 0.0~git20211220.e844ffd-1
--------------------------------------------------------------------------------
Finished at 2022-06-29T13:09:11Z
Build needed 00:17:59, 81104k disk space