330 commits

Author SHA1 Message Date
namkazy
d4038bf4ae remove disable optimize 2020-04-05 10:31:30 +07:00
namkazy
e4402955e4 [wip] reimplement SULD.D 2020-04-05 10:31:29 +07:00
Nguyen Dac Nam
731b0dbebc clang-fix 2020-04-05 10:31:28 +07:00
Nguyen Dac Nam
4c3ddd9c99 shader: image - import PredCondition 2020-04-05 10:31:27 +07:00
Nguyen Dac Nam
9f407fad5e shader: SULD.D bits32 implement more complexer method. 2020-04-05 10:31:27 +07:00
Nguyen Dac Nam
06fa4a3a41 shader: SULD.D import StoreType 2020-04-05 10:31:26 +07:00
Nguyen Dac Nam
94fecd1b68 shader: implement SULD.D bits32 2020-04-05 10:31:26 +07:00
ReinUsesLisp
ab3a1db282 shader/memory: Silence no return value warning
Silences a warning about control paths not all returning a value.
2020-04-02 03:34:27 -03:00
Fernando Sahmkow
207bfbf720 Merge pull request #3561 from ReinUsesLisp/f2f-conversion
shader/conversion: Fix F2F rounding operations with different sizes
2020-03-31 14:45:02 -04:00
Fernando Sahmkow
8a4af4f128 Merge pull request #3577 from ReinUsesLisp/lea
shader/lea: Fix LEA implementation
2020-03-31 14:36:07 -04:00
Nguyen Dac Nam
e0add44428 clang-format 2020-03-31 08:08:06 +07:00
Nguyen Dac Nam
455a771f6c shader_decode: fix by suggestion 2020-03-31 08:02:44 +07:00
namkazy
5961fe334c clang-format 2020-03-30 20:46:21 +07:00
namkazy
fd7fb7c1b7 shader_decode: ATOM/ATOMS: add function to avoid code repetition 2020-03-30 18:47:50 +07:00
Nguyen Dac Nam
3809c15721 shader_decode: implement ATOM operation for S32 and U32 2020-03-30 17:44:48 +07:00
namkazy
93a5b51a1f clang-format 2020-03-30 17:44:48 +07:00
Nguyen Dac Nam
e57c348d6e shader_decode: implement ATOMS instr partial. 2020-03-30 17:44:46 +07:00
ReinUsesLisp
74b1f71109 shader/lea: Simplify generated LEA code 2020-03-28 03:55:04 -03:00
ReinUsesLisp
fda10c4b0b shader/lea: Fix op_a and op_b usages
They were swapped.
2020-03-27 18:37:20 -03:00
ReinUsesLisp
ca9309bc07 shader/lea: Remove const and use move when possible 2020-03-27 18:36:38 -03:00
ReinUsesLisp
82d53d445a shader/conversion: Fix F2F rounding operations with different sizes
Rounding operations only matter when the conversion size of source and
destination is the same, i.e. .F16.F16, .F32.F32 and .F64.F64.

When there is a mismatch (.F16.F32), these bits are used for IEEE
rounding, we don't emulate this because GLSL and SPIR-V don't support
configuring it per operation.
2020-03-26 01:58:49 -03:00
makigumo
4a1a5ea61e xmad: fix clang build error 2020-03-23 00:09:31 +01:00
bunnei
4785b963c3 Merge pull request #3505 from namkazt/patch-8
shader_decode: implement XMAD mode CSfu
2020-03-19 17:41:01 -04:00
Rodrigo Locatti
944d38efc8 Merge pull request #3502 from namkazt/patch-3
shader_decode: Reimplement BFE instructions
2020-03-15 21:23:04 -03:00
Nguyen Dac Nam
2cd41ab020 clang-format 2020-03-14 10:07:40 +07:00
Nguyen Dac Nam
d13e860a08 nit 2020-03-14 09:57:24 +07:00
Nguyen Dac Nam
12b08c1725 nit & remove some optional param 2020-03-13 20:47:38 +07:00
Nguyen Dac Nam
0a64ee04e3 shader_decode: implement XMAD mode CSfu 2020-03-13 19:01:49 +07:00
Nguyen Dac Nam
a9e6b48dc0 clang-format 2020-03-13 15:38:57 +07:00
Nguyen Dac Nam
be63f9a0a2 Apply suggestions from code review
Co-Authored-By: Mat M. <mathew1800@gmail.com>
2020-03-13 15:35:15 +07:00
Nguyen Dac Nam
edabb9957a shader_decode: BFE add ref of reverse parallel method. 2020-03-13 14:20:18 +07:00
Nguyen Dac Nam
8b2bc366f8 shader_decode: implement BREV on BFE
Implement reverse parallel follow: https://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel
2020-03-13 14:13:31 +07:00
Nguyen Dac Nam
86eb7ea0c7 shader_decode: Reimplement BFE instructions 2020-03-13 12:48:01 +07:00
ReinUsesLisp
99be31c902 video_core: Rename "const buffer locker" to "registry" 2020-03-09 18:40:06 -03:00
Nguyen Dac Nam
bb39862dfe shader: FMUL switch to using LUT (#3441)
* shader: add FmulPostFactor LUT table

* shader: FMUL apply LUT

* Update src/video_core/engines/shader_bytecode.h

Co-Authored-By: Mat M. <mathew1800@gmail.com>

* nit: mistype

* clang-format & add missing import

* shader: remove post factor LUT.

* shader: move post factor LUT to function and fix incorrect order.

* clang-format

* shader: FMUL: add static to post factor LUT

* nit: typo

Co-authored-by: Mat M. <mathew1800@gmail.com>
2020-02-27 11:14:25 -05:00
bunnei
3cb09e3570 Merge pull request #3440 from namkazt/patch-6
shader: implement LOP3 fast replace for old function
2020-02-26 10:24:35 -05:00
ReinUsesLisp
8ab2e5f561 shader/texture: Fix illegal 3D texture assert
Fix typo in the illegal 3D texture assert logic. We care about catching
arrayed 3D textures or 3D shadow textures, not regular 3D textures.
2020-02-21 15:57:27 -03:00
Nguyen Dac Nam
96e43427e5 nit: add const to where it need. 2020-02-21 21:16:45 +07:00
Nguyen Dac Nam
0c3acedaf9 shader: implement LOP3 fast replace for old function
ref: https://devtalk.nvidia.com/default/topic/1070081/cuda-programming-and-performance/reverse-lut-for-lop3-lut/
2020-02-21 19:08:07 +07:00
bunnei
fbd58d36d1 Merge pull request #3415 from ReinUsesLisp/texture-code
shader/texture: Allow 2D shadow arrays and simplify code
2020-02-19 20:06:14 -05:00
Nguyen Dac Nam
a57853e085 shader_conversion: I2F : add Assert for case src_size is Short 2020-02-19 11:40:35 +07:00
Nguyen Dac Nam
92153118ab fix warning 2020-02-19 11:10:26 +07:00
Nguyen Dac Nam
84fc48b0eb clang-format fix 2020-02-19 11:02:59 +07:00
Nguyen Dac Nam
0d9361d21f shader_conversion: add conversion I2F for Short 2020-02-19 10:54:37 +07:00
ReinUsesLisp
f37f4e76d6 shader/texture: Allow 2D shadow arrays and simplify code
Shadow sampler 2D arrays are supported on OpenGL, so there's no reason
to forbid these. Enable textureLod usage on these.

Minor style changes.
2020-02-15 02:36:28 -03:00
bunnei
082ba6fc64 Merge pull request #3379 from ReinUsesLisp/cbuf-offset
shader/decode: Fix constant buffer offsets
2020-02-14 13:22:53 -05:00
bunnei
5160900bec Merge pull request #3369 from ReinUsesLisp/shf
shader/shift: Implement SHF
2020-02-07 22:06:57 -05:00
ReinUsesLisp
389cb51a33 shader/decode: Fix constant buffer offsets
Some instances were using cbuf34.offset instead of cbuf34.GetOffset().
This returned an invalid offset. Address those instances and rename
offset to "shifted_offset" to avoid future bugs.
2020-02-05 12:19:09 -03:00
bunnei
a683c3c57c Merge pull request #3357 from ReinUsesLisp/bfi-rc
shader/bfi: Implement register-constant buffer variant
2020-02-04 15:14:13 -05:00
bunnei
223e535f65 Merge pull request #3356 from ReinUsesLisp/fcmp
shader/arithmetic: Implement FCMP
2020-02-04 11:36:59 -05:00