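I seem to keep posting nothing but Raspberry Pi articles. But hey, it's fun.
waifu2x-ncnn-vulkan is an implementation of waifu2x built on NCNN, a lightweight neural network inference framework.
Because it uses the Vulkan API for GPU acceleration, it can use the GPU without CUDA, which makes it a good fit for ARM-based devices such as the Raspberry Pi.
However, the Raspberry Pi 5's hardware is not particularly powerful, so even with waifu2x-ncnn-vulkan, GPU acceleration may be hard to achieve.
Still, it does run, so this post describes how to set it up.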
Prerequisites
Install the prerequisites first.
Enter the commands below.
sudo apt update
sudo apt install git cmake g++ libvulkan-dev libopencv-dev -y # development tools
sudo apt install mesa-vulkan-drivers vulkan-tools -y # Vulkan drivers and tools
sudo apt install glslang-tools -y # glslang tools
vulkaninfo # utility that prints Vulkan driver and device information
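The full vulkaninfo output is long, so it can help to filter it down to device names. The sketch below assumes the installed vulkaninfo accepts the `--summary` option and prints lines of the form `deviceName = ...`; `device_names` is a helper name made up for this example.

```shell
#!/bin/sh
# device_names: read vulkaninfo text on stdin and print one device name per line.
# Assumes lines of the form "deviceName = <name>".
# (Hypothetical helper for this example, not part of vulkan-tools.)
device_names() {
    sed -n 's/^[[:space:]]*deviceName[[:space:]]*=[[:space:]]*//p'
}

if command -v vulkaninfo >/dev/null 2>&1; then
    # --summary is assumed to be supported by the installed vulkan-tools version.
    vulkaninfo --summary 2>/dev/null | device_names
else
    echo "vulkaninfo not found; install vulkan-tools first" >&2
fi
```

If the list is empty, or shows only a software implementation such as llvmpipe, inference will probably not run on the actual GPU.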

Once everything is installed, it is time to compile.
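Before moving on, a quick check like the following can confirm that the commands the build relies on are actually on PATH. This is only a sketch; `missing_tools` is a hypothetical helper, and the list of commands is my assumption of what the build steps call.

```shell
#!/bin/sh
# missing_tools: print each named command that is not found on PATH.
# (Hypothetical helper for this example.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands the build steps rely on (assumed list).
missing=$(missing_tools git cmake g++ make glslangValidator vulkaninfo)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "All build tools found."
fi
```

Anything reported as missing points to an apt package that did not install correctly.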
Compiling
With the build environment in place, compile the program.
Run the commands below one line at a time.
git clone https://github.com/nihui/waifu2x-ncnn-vulkan.git # clone the source code from GitHub
cd waifu2x-ncnn-vulkan # move into the cloned directory
git submodule update --init --recursive # initialize and update the submodules
mkdir build && cd build # create the build directory and move into it
cmake ../src # generate the build files with cmake
make -j$(nproc) # compile with make
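`make -j$(nproc)` starts one compile job per CPU core, which may use more memory than a small board has. A minimal sketch for capping the job count follows. The budget of 1 GiB per job and the helper name `pick_jobs` are assumptions for illustration, not values from the project.

```shell
#!/bin/sh
# pick_jobs CORES MEM_MB: print the smaller of CORES and MEM_MB/1024, at least 1.
# The 1024 MB-per-job budget is a rule-of-thumb assumption.
pick_jobs() {
    cores=$1
    by_mem=$(( $2 / 1024 ))
    if [ "$by_mem" -lt 1 ]; then by_mem=1; fi
    if [ "$by_mem" -lt "$cores" ]; then
        echo "$by_mem"
    else
        echo "$cores"
    fi
}

if [ -r /proc/meminfo ]; then
    # MemTotal is reported in kB; convert it to MB.
    mem_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
    echo "suggested: make -j$(pick_jobs "$(nproc)" "$mem_mb")"
fi
```

The script only prints a suggested command; whether the cap is needed depends on how much RAM the board has.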
The full cmake output is below.
sprout1345@sproutpi:~/waifu2x-ncnn-vulkan/build$ cmake ../src
CMake Warning (dev) at CMakeLists.txt:5 (project):
cmake_minimum_required() should be called prior to this top-level project()
call. Please see the cmake-commands(7) manual for usage documentation of
both commands.
This warning is for project developers. Use -Wno-dev to suppress it.
-- The C compiler identification is GNU 13.3.0
-- The CXX compiler identification is GNU 13.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Found Vulkan: /usr/lib/aarch64-linux-gnu/libvulkan.so (found version "1.3.275") found components: glslangValidator missing components: glslc
-- CMAKE_INSTALL_PREFIX = /usr/local
-- NCNN_VERSION_STRING = 1.0.20250110
CMake Deprecation Warning at ncnn/CMakeLists.txt:32 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
-- Performing Test NCNN_COMPILER_SUPPORT_ARM_VFPV4
-- Performing Test NCNN_COMPILER_SUPPORT_ARM_VFPV4 - Failed
-- Performing Test NCNN_COMPILER_SUPPORT_ARM_VFPV4_FP16
-- Performing Test NCNN_COMPILER_SUPPORT_ARM_VFPV4_FP16 - Failed
-- Performing Test NCNN_COMPILER_SUPPORT_ARM82_FP16
-- Performing Test NCNN_COMPILER_SUPPORT_ARM82_FP16 - Success
-- Performing Test NCNN_COMPILER_SUPPORT_ARM82_DOTPROD
-- Performing Test NCNN_COMPILER_SUPPORT_ARM82_DOTPROD - Success
-- Performing Test NCNN_COMPILER_SUPPORT_ARM82_FP16FML
-- Performing Test NCNN_COMPILER_SUPPORT_ARM82_FP16FML - Success
-- Performing Test NCNN_COMPILER_SUPPORT_ARM84_BF16
-- Performing Test NCNN_COMPILER_SUPPORT_ARM84_BF16 - Success
-- Performing Test NCNN_COMPILER_SUPPORT_ARM84_I8MM
-- Performing Test NCNN_COMPILER_SUPPORT_ARM84_I8MM - Success
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVE
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVE - Success
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVE2
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVE2 - Success
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVEBF16
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVEBF16 - Success
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVEI8MM
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVEI8MM - Success
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVEF32MM
-- Performing Test NCNN_COMPILER_SUPPORT_ARM86_SVEF32MM - Success
CMake Warning at ncnn/CMakeLists.txt:200 (message):
The compiler does not support arm vfpv4. NCNN_VFPV4 will be OFF.
-- Target arch: arm
CMake Deprecation Warning at ncnn/glslang/CMakeLists.txt:36 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR - Success
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Performing Test HAVE_BUILTIN_BSWAP16
-- Performing Test HAVE_BUILTIN_BSWAP16 - Success
-- Performing Test HAVE_BUILTIN_BSWAP32
-- Performing Test HAVE_BUILTIN_BSWAP32 - Success
-- Performing Test HAVE_BUILTIN_BSWAP64
-- Performing Test HAVE_BUILTIN_BSWAP64 - Success
-- Performing Test HAVE_PTHREAD_PRIO_INHERIT
-- Performing Test HAVE_PTHREAD_PRIO_INHERIT - Success
-- Performing Test PTHREAD_CREATE_UNDETACHED
-- Performing Test PTHREAD_CREATE_UNDETACHED - Success
-- Could NOT find OpenGL (missing: OPENGL_opengl_LIBRARY OPENGL_glx_LIBRARY OPENGL_INCLUDE_DIR)
-- Performing Test HAVE_MATH_LIBRARY
-- Performing Test HAVE_MATH_LIBRARY - Failed
-- Adding -lm flag.
-- Found ZLIB: /usr/lib/aarch64-linux-gnu/libz.so (found version "1.3")
-- Found PNG: /usr/lib/aarch64-linux-gnu/libpng.so (found version "1.6.43")
-- Found JPEG: /usr/lib/aarch64-linux-gnu/libjpeg.so (found version "80")
-- Found TIFF: /usr/lib/aarch64-linux-gnu/libtiff.so (found version "4.5.1")
-- Could NOT find GIF (missing: GIF_LIBRARY GIF_INCLUDE_DIR)
-- Looking for 4 include files stdlib.h, ..., float.h
-- Looking for 4 include files stdlib.h, ..., float.h - found
-- Looking for include file dlfcn.h
-- Looking for include file dlfcn.h - found
-- Looking for include file GLUT/glut.h
-- Looking for include file GLUT/glut.h - not found
-- Looking for include file GL/glut.h
-- Looking for include file GL/glut.h - not found
-- Looking for include file inttypes.h
-- Looking for include file inttypes.h - found
-- Looking for include file memory.h
-- Looking for include file memory.h - found
-- Looking for include file OpenGL/glut.h
-- Looking for include file OpenGL/glut.h - not found
-- Looking for include file shlwapi.h
-- Looking for include file shlwapi.h - not found
-- Looking for include file stdint.h
-- Looking for include file stdint.h - found
-- Looking for include file stdlib.h
-- Looking for include file stdlib.h - found
-- Looking for include file strings.h
-- Looking for include file strings.h - found
-- Looking for include file string.h
-- Looking for include file string.h - found
-- Looking for include file sys/stat.h
-- Looking for include file sys/stat.h - found
-- Looking for include file sys/types.h
-- Looking for include file sys/types.h - found
-- Looking for include file unistd.h
-- Looking for include file unistd.h - found
-- Looking for include file wincodec.h
-- Looking for include file wincodec.h - not found
-- Looking for include file windows.h
-- Looking for include file windows.h - not found
-- Performing Test WEBP_HAVE_FLAG_SSE41
-- Performing Test WEBP_HAVE_FLAG_SSE41 - Failed
-- Performing Test WEBP_HAVE_FLAG_SSE41
-- Performing Test WEBP_HAVE_FLAG_SSE41 - Failed
-- Performing Test HAS_COMPILE_FLAG
-- Performing Test HAS_COMPILE_FLAG - Failed
-- Performing Test WEBP_HAVE_FLAG_SSE2
-- Performing Test WEBP_HAVE_FLAG_SSE2 - Failed
-- Performing Test WEBP_HAVE_FLAG_SSE2
-- Performing Test WEBP_HAVE_FLAG_SSE2 - Failed
-- Performing Test HAS_COMPILE_FLAG
-- Performing Test HAS_COMPILE_FLAG - Failed
-- Performing Test WEBP_HAVE_FLAG_MIPS32
-- Performing Test WEBP_HAVE_FLAG_MIPS32 - Failed
-- Performing Test WEBP_HAVE_FLAG_MIPS32
-- Performing Test WEBP_HAVE_FLAG_MIPS32 - Failed
-- Performing Test WEBP_HAVE_FLAG_MIPS_DSP_R2
-- Performing Test WEBP_HAVE_FLAG_MIPS_DSP_R2 - Failed
-- Performing Test WEBP_HAVE_FLAG_MIPS_DSP_R2
-- Performing Test WEBP_HAVE_FLAG_MIPS_DSP_R2 - Failed
-- Performing Test HAS_COMPILE_FLAG
-- Performing Test HAS_COMPILE_FLAG - Failed
-- Performing Test WEBP_HAVE_FLAG_NEON
-- Performing Test WEBP_HAVE_FLAG_NEON - Success
-- Performing Test WEBP_HAVE_FLAG_MSA
-- Performing Test WEBP_HAVE_FLAG_MSA - Failed
-- Performing Test WEBP_HAVE_FLAG_MSA
-- Performing Test WEBP_HAVE_FLAG_MSA - Failed
-- Performing Test HAS_COMPILE_FLAG
-- Performing Test HAS_COMPILE_FLAG - Failed
-- Configuring done (10.9s)
-- Generating done (0.1s)
-- Build files have been written to: /home/sprout1345/waifu2x-ncnn-vulkan/build
If an error occurs partway through, resolve it before continuing.
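With this much output, real errors are easy to miss. One way to find them, sketched here under the assumption that the output was saved to a file named `cmake.log` (for example with `cmake ../src 2>&1 | tee cmake.log`), is to search for lines that start with `CMake Error`. The helper names are hypothetical.

```shell
#!/bin/sh
# count_cmake_errors FILE: print how many lines in FILE start with "CMake Error".
# (Hypothetical helper; the log file name used below is also an assumption.)
count_cmake_errors() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c '^CMake Error' "$1" || true
}

# optional_misses FILE: print the "Could NOT find" lines from FILE.
optional_misses() {
    grep '^-- Could NOT find' "$1" || true
}

# Assumes the output was saved with: cmake ../src 2>&1 | tee cmake.log
if [ -f cmake.log ]; then
    echo "errors: $(count_cmake_errors cmake.log)"
    optional_misses cmake.log
fi
```

In the run shown here, the `Could NOT find OpenGL` and `Could NOT find GIF` lines did not stop configuration, so lines like these are worth a look but are not necessarily fatal.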



The full make output is below.
sprout1345@sproutpi:~/waifu2x-ncnn-vulkan/build$ make -j$(nproc)
[ 1%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/alpha_processing.c.o
[ 1%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/alpha_dec.c.o
[ 1%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/bit_reader_utils.c.o
[ 1%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/alpha_enc.c.o
[ 1%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/cpu.c.o
[ 1%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/color_cache_utils.c.o
[ 1%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/buffer_dec.c.o
[ 2%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/analysis_enc.c.o
[ 2%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/dec.c.o
[ 2%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/filters_utils.c.o
[ 2%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/frame_dec.c.o
[ 3%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/dec_clip_tables.c.o
[ 3%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/backward_references_cost_enc.c.o
[ 3%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/huffman_utils.c.o
[ 3%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/filters.c.o
[ 3%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/idec_dec.c.o
[ 4%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/quant_levels_dec_utils.c.o
[ 4%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/lossless.c.o
[ 4%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/backward_references_enc.c.o
[ 4%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/rescaler_utils.c.o
[ 5%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/io_dec.c.o
[ 5%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/random_utils.c.o
[ 5%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/config_enc.c.o
[ 5%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/rescaler.c.o
[ 5%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/thread_utils.c.o
[ 5%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/cost_enc.c.o
[ 5%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/quant_dec.c.o
[ 6%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/upsampling.c.o
[ 7%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/utils.c.o
[ 7%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/tree_dec.c.o
[ 8%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/filter_enc.c.o
[ 8%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/yuv.c.o
[ 8%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/bit_writer_utils.c.o
[ 8%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/frame_enc.c.o
[ 8%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/vp8_dec.c.o
[ 8%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/alpha_processing_neon.c.o
[ 8%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/huffman_encode_utils.c.o
[ 8%] Building C object libwebp/CMakeFiles/webputils.dir/src/utils/quant_levels_utils.c.o
[ 9%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/vp8l_dec.c.o
[ 9%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/histogram_enc.c.o
[ 9%] Built target webputils
[ 9%] Preprocessing shader source waifu2x_preproc.comp
[ 10%] Preprocessing shader source waifu2x_postproc.comp
[ 10%] Preprocessing shader source waifu2x_preproc_tta.comp
[ 10%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/iterator_enc.c.o
[ 10%] Preprocessing shader source waifu2x_postproc_tta.comp
[ 10%] Building C object libwebp/CMakeFiles/webpdecode.dir/src/dec/webp_dec.c.o
[ 10%] Built target generate-spirv
[ 10%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/dec_neon.c.o
[ 10%] Building CXX object ncnn/glslang/glslang/CMakeFiles/GenericCodeGen.dir/GenericCodeGen/CodeGen.cpp.o
[ 11%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/near_lossless_enc.c.o
[ 11%] Built target webpdecode
[ 11%] Building CXX object ncnn/glslang/glslang/CMakeFiles/GenericCodeGen.dir/GenericCodeGen/Link.cpp.o
[ 11%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/picture_enc.c.o
[ 11%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/picture_csp_enc.c.o
[ 11%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/picture_psnr_enc.c.o
[ 12%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/filters_neon.c.o
[ 13%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/picture_rescale_enc.c.o
[ 13%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/picture_tools_enc.c.o
[ 13%] Building CXX object ncnn/glslang/OGLCompilersDLL/CMakeFiles/OGLCompiler.dir/InitializeDll.cpp.o
[ 13%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/predictor_enc.c.o
[ 13%] Linking CXX static library libGenericCodeGen.a
[ 13%] Built target GenericCodeGen
[ 13%] Building CXX object ncnn/glslang/glslang/OSDependent/Unix/CMakeFiles/OSDependent.dir/ossource.cpp.o
[ 13%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/quant_enc.c.o
[ 13%] Linking CXX static library libOSDependent.a
[ 13%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/lossless_neon.c.o
[ 13%] Built target OSDependent
[ 14%] Linking CXX static library libOGLCompiler.a
[ 14%] Preprocessing shader source convolution.comp
[ 14%] Preprocessing shader source convolution_1x1s1d1.comp
[ 14%] Built target OGLCompiler
[ 15%] Preprocessing shader source convolution_3x3s1d1_winograd23_transform_input.comp
[ 15%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/alpha_processing.c.o
[ 15%] Preprocessing shader source convolution_3x3s1d1_winograd23_transform_output.comp
[ 16%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/cpu.c.o
[ 16%] Preprocessing shader source convolution_3x3s1d1_winograd43_transform_input.comp
[ 16%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/dec.c.o
[ 16%] Preprocessing shader source convolution_3x3s1d1_winograd43_transform_output.comp
[ 17%] Preprocessing shader source convolution_3x3s1d1_winograd_gemm.comp
[ 17%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/dec_clip_tables.c.o
[ 17%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/rescaler_neon.c.o
[ 17%] Preprocessing shader source convolution_gemm.comp
[ 18%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/syntax_enc.c.o
[ 18%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/filters.c.o
[ 18%] Preprocessing shader source convolution_pack1to4.comp
[ 19%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/lossless.c.o
[ 19%] Preprocessing shader source convolution_pack1to4_1x1s1d1.comp
[ 19%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/token_enc.c.o
[ 20%] Preprocessing shader source convolution_pack1to4_3x3s1d1_winograd_gemm.comp
[ 20%] Preprocessing shader source convolution_pack1to4_gemm.comp
[ 20%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/tree_enc.c.o
[ 20%] Preprocessing shader source convolution_pack1to8.comp
[ 20%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/rescaler.c.o
[ 20%] Preprocessing shader source convolution_pack1to8_1x1s1d1.comp
[ 20%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/vp8l_enc.c.o
[ 20%] Preprocessing shader source convolution_pack1to8_3x3s1d1_winograd_gemm.comp
[ 20%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/upsampling.c.o
[ 21%] Preprocessing shader source convolution_pack1to8_gemm.comp
[ 21%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/upsampling_neon.c.o
[ 21%] Preprocessing shader source convolution_pack4.comp
[ 21%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/yuv.c.o
[ 21%] Preprocessing shader source convolution_pack4_1x1s1d1.comp
[ 22%] Building C object libwebp/CMakeFiles/webpencode.dir/src/enc/webp_enc.c.o
[ 22%] Preprocessing shader source convolution_pack4_1x1s1d1_cm_16_8_8.comp
[ 23%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/alpha_processing_neon.c.o
[ 24%] Preprocessing shader source convolution_pack4_3x3s1d1_winograd23_transform_input.comp
[ 24%] Built target webpencode
[ 24%] Preprocessing shader source convolution_pack4_3x3s1d1_winograd23_transform_output.comp
[ 24%] Preprocessing shader source convolution_pack4_3x3s1d1_winograd43_transform_input.comp
[ 24%] Preprocessing shader source convolution_pack4_3x3s1d1_winograd43_transform_output.comp
[ 25%] Preprocessing shader source convolution_pack4_3x3s1d1_winograd_gemm.comp
[ 26%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/yuv_neon.c.o
[ 26%] Preprocessing shader source convolution_pack4_3x3s1d1_winograd_gemm_cm_16_8_8.comp
[ 26%] Preprocessing shader source convolution_pack4_gemm.comp
[ 26%] Preprocessing shader source convolution_pack4_gemm_cm_16_8_8.comp
[ 27%] Preprocessing shader source convolution_pack4to1.comp
[ 27%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/dec_neon.c.o
[ 27%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/cost.c.o
[ 27%] Preprocessing shader source convolution_pack4to1_1x1s1d1.comp
[ 27%] Preprocessing shader source convolution_pack4to1_3x3s1d1_winograd_gemm.comp
[ 27%] Preprocessing shader source convolution_pack4to1_gemm.comp
[ 28%] Preprocessing shader source convolution_pack4to8.comp
[ 28%] Preprocessing shader source convolution_pack4to8_1x1s1d1.comp
[ 28%] Preprocessing shader source convolution_pack4to8_3x3s1d1_winograd_gemm.comp
[ 28%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/enc.c.o
[ 28%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/lossless_enc.c.o
[ 28%] Preprocessing shader source convolution_pack4to8_gemm.comp
[ 28%] Preprocessing shader source convolution_pack8.comp
[ 29%] Preprocessing shader source convolution_pack8_1x1s1d1.comp
[ 30%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/ssim.c.o
[ 30%] Preprocessing shader source convolution_pack8_3x3s1d1_winograd23_transform_input.comp
[ 31%] Building C object libwebp/CMakeFiles/webputilsdecode.dir/src/utils/bit_reader_utils.c.o
[ 31%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/cost_neon.c.o
[ 31%] Preprocessing shader source convolution_pack8_3x3s1d1_winograd23_transform_output.comp
[ 31%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/filters_neon.c.o
[ 31%] Building C object libwebp/CMakeFiles/webputilsdecode.dir/src/utils/color_cache_utils.c.o
[ 31%] Preprocessing shader source convolution_pack8_3x3s1d1_winograd43_transform_input.comp
[ 31%] Building C object libwebp/CMakeFiles/webputilsdecode.dir/src/utils/filters_utils.c.o
[ 32%] Preprocessing shader source convolution_pack8_3x3s1d1_winograd43_transform_output.comp
[ 32%] Preprocessing shader source convolution_pack8_3x3s1d1_winograd_gemm.comp
[ 32%] Building C object libwebp/CMakeFiles/webputilsdecode.dir/src/utils/huffman_utils.c.o
[ 32%] Preprocessing shader source convolution_pack8_gemm.comp
[ 32%] Preprocessing shader source convolution_pack8to1.comp
[ 33%] Building C object libwebp/CMakeFiles/webputilsdecode.dir/src/utils/quant_levels_dec_utils.c.o
[ 34%] Preprocessing shader source convolution_pack8to1_1x1s1d1.comp
[ 34%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/lossless_neon.c.o
[ 34%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/enc_neon.c.o
[ 34%] Preprocessing shader source convolution_pack8to1_3x3s1d1_winograd_gemm.comp
[ 34%] Building C object libwebp/CMakeFiles/webputilsdecode.dir/src/utils/rescaler_utils.c.o
[ 34%] Preprocessing shader source convolution_pack8to1_gemm.comp
[ 34%] Preprocessing shader source convolution_pack8to4.comp
[ 34%] Building C object libwebp/CMakeFiles/webputilsdecode.dir/src/utils/random_utils.c.o
[ 35%] Building C object libwebp/CMakeFiles/webputilsdecode.dir/src/utils/thread_utils.c.o
[ 35%] Preprocessing shader source convolution_pack8to4_1x1s1d1.comp
[ 35%] Preprocessing shader source convolution_pack8to4_3x3s1d1_winograd_gemm.comp
[ 36%] Building C object libwebp/CMakeFiles/webputilsdecode.dir/src/utils/utils.c.o
[ 36%] Preprocessing shader source convolution_pack8to4_gemm.comp
[ 36%] Preprocessing shader source crop.comp
[ 36%] Built target webputilsdecode
[ 37%] Preprocessing shader source crop_pack1to4.comp
[ 37%] Building C object libwebp/CMakeFiles/webpdsp.dir/src/dsp/lossless_enc_neon.c.o
[ 38%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/rescaler_neon.c.o
[ 38%] Preprocessing shader source crop_pack1to8.comp
[ 38%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/upsampling_neon.c.o
[ 38%] Preprocessing shader source crop_pack4.comp
[ 38%] Preprocessing shader source crop_pack4to1.comp
[ 38%] Preprocessing shader source crop_pack4to8.comp
[ 39%] Preprocessing shader source crop_pack8.comp
[ 39%] Preprocessing shader source crop_pack8to1.comp
[ 39%] Built target webpdsp
[ 39%] Preprocessing shader source crop_pack8to4.comp
[ 39%] Preprocessing shader source deconvolution.comp
[ 40%] Preprocessing shader source deconvolution_col2im.comp
[ 40%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/glslang_tab.cpp.o
[ 41%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/attribute.cpp.o
[ 41%] Preprocessing shader source deconvolution_gemm.comp
[ 41%] Preprocessing shader source deconvolution_pack1to4.comp
[ 41%] Preprocessing shader source deconvolution_pack1to4_gemm.comp
[ 41%] Building C object libwebp/CMakeFiles/webpdspdecode.dir/src/dsp/yuv_neon.c.o
[ 42%] Preprocessing shader source deconvolution_pack1to8.comp
[ 42%] Preprocessing shader source deconvolution_pack1to8_gemm.comp
[ 42%] Preprocessing shader source deconvolution_pack4.comp
[ 42%] Preprocessing shader source deconvolution_pack4_col2im.comp
[ 43%] Preprocessing shader source deconvolution_pack4_gemm.comp
[ 43%] Preprocessing shader source deconvolution_pack4_gemm_cm_16_8_8.comp
[ 43%] Preprocessing shader source deconvolution_pack4to1.comp
[ 43%] Preprocessing shader source deconvolution_pack4to1_gemm.comp
[ 43%] Built target webpdspdecode
[ 44%] Preprocessing shader source deconvolution_pack4to8.comp
[ 44%] Linking C static library libwebp.a
[ 44%] Preprocessing shader source deconvolution_pack4to8_gemm.comp
[ 44%] Preprocessing shader source deconvolution_pack8.comp
[ 44%] Preprocessing shader source deconvolution_pack8_col2im.comp
[ 44%] Preprocessing shader source deconvolution_pack8_gemm.comp
[ 44%] Built target webp
[ 45%] Preprocessing shader source deconvolution_pack8to1.comp
[ 45%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/Constant.cpp.o
[ 45%] Preprocessing shader source deconvolution_pack8to1_gemm.comp
[ 45%] Preprocessing shader source deconvolution_pack8to4.comp
[ 45%] Preprocessing shader source deconvolution_pack8to4_gemm.comp
[ 46%] Preprocessing shader source eltwise.comp
[ 46%] Preprocessing shader source eltwise_pack4.comp
[ 46%] Preprocessing shader source eltwise_pack8.comp
[ 46%] Preprocessing shader source flatten.comp
[ 47%] Preprocessing shader source flatten_pack1to4.comp
[ 47%] Preprocessing shader source flatten_pack1to8.comp
[ 47%] Preprocessing shader source flatten_pack4.comp
[ 47%] Preprocessing shader source flatten_pack4to8.comp
[ 48%] Preprocessing shader source flatten_pack8.comp
[ 48%] Preprocessing shader source innerproduct.comp
[ 48%] Preprocessing shader source innerproduct_gemm.comp
[ 48%] Preprocessing shader source innerproduct_gemm_wp1to4.comp
[ 49%] Preprocessing shader source innerproduct_gemm_wp1to8.comp
[ 49%] Preprocessing shader source innerproduct_gemm_wp4.comp
[ 49%] Preprocessing shader source innerproduct_gemm_wp4to1.comp
[ 49%] Preprocessing shader source innerproduct_gemm_wp4to8.comp
[ 49%] Preprocessing shader source innerproduct_gemm_wp8.comp
[ 50%] Preprocessing shader source innerproduct_gemm_wp8to1.comp
[ 50%] Preprocessing shader source innerproduct_gemm_wp8to4.comp
[ 50%] Preprocessing shader source innerproduct_pack1to4.comp
[ 50%] Preprocessing shader source innerproduct_pack1to8.comp
[ 51%] Preprocessing shader source innerproduct_pack4.comp
[ 51%] Preprocessing shader source innerproduct_pack4to1.comp
[ 51%] Linking C static library libwebpdecoder.a
[ 51%] Preprocessing shader source innerproduct_pack4to8.comp
[ 51%] Preprocessing shader source innerproduct_pack8.comp
[ 52%] Preprocessing shader source innerproduct_pack8to1.comp
[ 52%] Built target webpdecoder
[ 52%] Preprocessing shader source innerproduct_pack8to4.comp
[ 52%] Building C object libwebp/CMakeFiles/webpdemux.dir/src/demux/anim_decode.c.o
[ 52%] Preprocessing shader source innerproduct_reduce_sum8.comp
[ 52%] Preprocessing shader source innerproduct_reduce_sum8_pack4.comp
[ 53%] Preprocessing shader source innerproduct_reduce_sum8_pack8.comp
[ 54%] Building C object libwebp/CMakeFiles/webpdemux.dir/src/demux/demux.c.o
[ 54%] Preprocessing shader source innerproduct_sum8.comp
[ 54%] Preprocessing shader source innerproduct_sum8_pack1to4.comp
[ 54%] Preprocessing shader source innerproduct_sum8_pack1to8.comp
[ 55%] Preprocessing shader source innerproduct_sum8_pack4.comp
[ 55%] Linking C static library libwebpdemux.a
[ 55%] Preprocessing shader source innerproduct_sum8_pack4to1.comp
[ 55%] Built target webpdemux
[ 55%] Preprocessing shader source innerproduct_sum8_pack4to8.comp
[ 55%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/iomapper.cpp.o
[ 55%] Preprocessing shader source innerproduct_sum8_pack8.comp
[ 55%] Preprocessing shader source innerproduct_sum8_pack8to1.comp
[ 56%] Preprocessing shader source innerproduct_sum8_pack8to4.comp
[ 56%] Preprocessing shader source pooling.comp
[ 56%] Preprocessing shader source pooling_adaptive.comp
[ 56%] Preprocessing shader source pooling_adaptive_pack4.comp
[ 57%] Preprocessing shader source pooling_adaptive_pack8.comp
[ 57%] Preprocessing shader source pooling_global.comp
[ 57%] Preprocessing shader source pooling_global_pack4.comp
[ 57%] Preprocessing shader source pooling_global_pack8.comp
[ 58%] Preprocessing shader source pooling_pack4.comp
[ 58%] Preprocessing shader source pooling_pack8.comp
[ 58%] Preprocessing shader source relu.comp
[ 58%] Preprocessing shader source relu_pack4.comp
[ 58%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/InfoSink.cpp.o
[ 59%] Preprocessing shader source relu_pack8.comp
[ 59%] Preprocessing shader source scale.comp
[ 59%] Preprocessing shader source scale_pack4.comp
[ 59%] Preprocessing shader source scale_pack8.comp
[ 59%] Preprocessing shader source padding.comp
[ 59%] Preprocessing shader source padding_3d.comp
[ 60%] Preprocessing shader source padding_3d_pack4.comp
[ 60%] Preprocessing shader source padding_3d_pack8.comp
[ 60%] Preprocessing shader source padding_pack1to4.comp
[ 60%] Preprocessing shader source padding_pack1to8.comp
[ 60%] Preprocessing shader source padding_pack4.comp
[ 61%] Preprocessing shader source padding_pack4to1.comp
[ 61%] Preprocessing shader source padding_pack4to8.comp
[ 62%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/Initialize.cpp.o
[ 62%] Preprocessing shader source padding_pack8.comp
[ 62%] Preprocessing shader source padding_pack8to1.comp
[ 63%] Preprocessing shader source padding_pack8to4.comp
[ 63%] Preprocessing shader source interp.comp
[ 63%] Preprocessing shader source interp_bicubic.comp
[ 63%] Preprocessing shader source interp_bicubic_coeffs.comp
[ 64%] Preprocessing shader source interp_bicubic_pack4.comp
[ 64%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/IntermTraverse.cpp.o
[ 64%] Preprocessing shader source interp_bicubic_pack8.comp
[ 64%] Preprocessing shader source interp_pack4.comp
[ 64%] Preprocessing shader source interp_pack8.comp
[ 65%] Preprocessing shader source packing.comp
[ 65%] Preprocessing shader source packing_fp16_to_fp32.comp
[ 65%] Preprocessing shader source packing_fp32_to_fp16.comp
[ 65%] Preprocessing shader source packing_pack1to4.comp
[ 66%] Preprocessing shader source packing_pack1to4_fp16_to_fp32.comp
[ 66%] Preprocessing shader source packing_pack1to4_fp32_to_fp16.comp
[ 66%] Preprocessing shader source packing_pack1to8.comp
[ 66%] Preprocessing shader source packing_pack1to8_fp16_to_fp32.comp
[ 67%] Preprocessing shader source packing_pack1to8_fp32_to_fp16.comp
[ 67%] Preprocessing shader source packing_pack4.comp
[ 67%] Preprocessing shader source packing_pack4_fp16_to_fp32.comp
[ 67%] Preprocessing shader source packing_pack4_fp32_to_fp16.comp
[ 67%] Preprocessing shader source packing_pack4to1.comp
[ 68%] Preprocessing shader source packing_pack4to1_fp16_to_fp32.comp
[ 68%] Preprocessing shader source packing_pack4to1_fp32_to_fp16.comp
[ 68%] Preprocessing shader source packing_pack4to8.comp
[ 68%] Preprocessing shader source packing_pack4to8_fp16_to_fp32.comp
[ 69%] Preprocessing shader source packing_pack4to8_fp32_to_fp16.comp
[ 69%] Preprocessing shader source packing_pack8.comp
[ 69%] Preprocessing shader source packing_pack8_fp16_to_fp32.comp
[ 69%] Preprocessing shader source packing_pack8_fp32_to_fp16.comp
[ 70%] Preprocessing shader source packing_pack8to1.comp
[ 70%] Preprocessing shader source packing_pack8to1_fp16_to_fp32.comp
[ 70%] Preprocessing shader source packing_pack8to1_fp32_to_fp16.comp
[ 70%] Preprocessing shader source packing_pack8to4.comp
[ 71%] Preprocessing shader source packing_pack8to4_fp16_to_fp32.comp
[ 71%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/Intermediate.cpp.o
[ 71%] Preprocessing shader source packing_pack8to4_fp32_to_fp16.comp
[ 71%] Preprocessing shader source cast_fp16_to_fp32.comp
[ 72%] Preprocessing shader source cast_fp16_to_fp32_pack4.comp
[ 72%] Preprocessing shader source cast_fp16_to_fp32_pack8.comp
[ 72%] Preprocessing shader source cast_fp32_to_fp16.comp
[ 72%] Preprocessing shader source cast_fp32_to_fp16_pack4.comp
[ 73%] Preprocessing shader source cast_fp32_to_fp16_pack8.comp
[ 73%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/ParseContextBase.cpp.o
[ 73%] Preprocessing shader source convert_ycbcr.comp
[ 73%] Preprocessing shader source vulkan_activation.comp
[ 73%] Built target ncnn-generate-spirv
[ 74%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/ParseHelper.cpp.o
[ 74%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/PoolAlloc.cpp.o
[ 74%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/RemoveTree.cpp.o
[ 74%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/Scan.cpp.o
[ 75%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/ShaderLang.cpp.o
[ 75%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/SpirvIntrinsics.cpp.o
[ 75%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/SymbolTable.cpp.o
[ 75%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/Versions.cpp.o
[ 76%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/intermOut.cpp.o
[ 76%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/limits.cpp.o
[ 76%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/linkValidate.cpp.o
[ 76%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/parseConst.cpp.o
[ 76%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/reflection.cpp.o
[ 77%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/preprocessor/Pp.cpp.o
[ 77%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/preprocessor/PpAtom.cpp.o
[ 77%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/preprocessor/PpContext.cpp.o
[ 77%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/preprocessor/PpScanner.cpp.o
[ 78%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/preprocessor/PpTokens.cpp.o
[ 78%] Building CXX object ncnn/glslang/glslang/CMakeFiles/MachineIndependent.dir/MachineIndependent/propagateNoContraction.cpp.o
[ 78%] Linking CXX static library libMachineIndependent.a
[ 78%] Built target MachineIndependent
[ 80%] Building CXX object ncnn/glslang/SPIRV/CMakeFiles/SPIRV.dir/InReadableOrder.cpp.o
[ 80%] Building CXX object ncnn/glslang/SPIRV/CMakeFiles/SPIRV.dir/GlslangToSpv.cpp.o
[ 80%] Building CXX object ncnn/glslang/SPIRV/CMakeFiles/SPIRV.dir/Logger.cpp.o
[ 80%] Building CXX object ncnn/glslang/glslang/CMakeFiles/glslang.dir/CInterface/glslang_c_interface.cpp.o
[ 80%] Building CXX object ncnn/glslang/SPIRV/CMakeFiles/SPIRV.dir/SpvBuilder.cpp.o
[ 80%] Building CXX object ncnn/glslang/SPIRV/CMakeFiles/SPIRV.dir/SpvPostProcess.cpp.o
[ 80%] Linking CXX static library libglslang.a
[ 80%] Built target glslang
[ 81%] Building CXX object ncnn/glslang/SPIRV/CMakeFiles/SPIRV.dir/doc.cpp.o
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/glslang/SPIRV/GlslangToSpv.cpp: In member function ‘void {anonymous}::TGlslangToSpvTraverser::TranslateLiterals(const glslang::TVector<const glslang::TIntermConstantUnion*>&, std::vector<unsigned int>&) const’:
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/glslang/SPIRV/GlslangToSpv.cpp:1331:33: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
1331 | unsigned literal = *reinterpret_cast<unsigned*>(&floatValue);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 81%] Building CXX object ncnn/glslang/SPIRV/CMakeFiles/SPIRV.dir/SpvTools.cpp.o
[ 81%] Building CXX object ncnn/glslang/SPIRV/CMakeFiles/SPIRV.dir/disassemble.cpp.o
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/glslang/SPIRV/GlslangToSpv.cpp: In member function ‘spv::Id {anonymous}::TGlslangToSpvTraverser::convertGlslangToSpvType(const glslang::TType&, glslang::TLayoutPacking, const glslang::TQualifier&, bool, bool)’:
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/glslang/SPIRV/GlslangToSpv.cpp:4203:41: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
4203 | unsigned literal = *reinterpret_cast<unsigned*>(&floatValue);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ 81%] Building CXX object ncnn/glslang/SPIRV/CMakeFiles/SPIRV.dir/CInterface/spirv_c_interface.cpp.o
[ 81%] Linking CXX static library libSPIRV.a
[ 81%] Built target SPIRV
[ 82%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/benchmark.cpp.o
[ 82%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/allocator.cpp.o
[ 82%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/blob.cpp.o
[ 82%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/c_api.cpp.o
[ 82%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/command.cpp.o
[ 82%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/cpu.cpp.o
[ 83%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/datareader.cpp.o
[ 83%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/gpu.cpp.o
[ 83%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer.cpp.o
[ 83%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/mat.cpp.o
[ 84%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/mat_pixel.cpp.o
[ 84%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/mat_pixel_affine.cpp.o
[ 84%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/mat_pixel_drawing.cpp.o
[ 84%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/mat_pixel_resize.cpp.o
[ 84%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/mat_pixel_rotate.cpp.o
[ 85%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/modelbin.cpp.o
[ 85%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/net.cpp.o
[ 85%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/option.cpp.o
[ 85%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/paramdict.cpp.o
[ 86%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/pipeline.cpp.o
[ 86%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/pipelinecache.cpp.o
[ 86%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/simpleocv.cpp.o
[ 86%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/simpleomp.cpp.o
[ 87%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/simplestl.cpp.o
[ 87%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/convolution.cpp.o
[ 87%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/convolution_arm.cpp.o
[ 87%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/convolution_vulkan.cpp.o
[ 88%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/convolution_arm_asimdhp.cpp.o
[ 88%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/convolution_arm_asimddp.cpp.o
[ 88%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/convolution_arm_i8mm.cpp.o
[ 88%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/crop.cpp.o
[ 89%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/crop_arm.cpp.o
[ 89%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/crop_vulkan.cpp.o
[ 89%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/deconvolution.cpp.o
[ 89%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/deconvolution_arm.cpp.o
[ 89%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/deconvolution_vulkan.cpp.o
[ 90%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/deconvolution_arm_asimdhp.cpp.o
[ 90%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/eltwise.cpp.o
[ 90%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/eltwise_arm.cpp.o
[ 90%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/eltwise_vulkan.cpp.o
[ 91%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/eltwise_arm_asimdhp.cpp.o
[ 91%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/flatten.cpp.o
[ 91%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/flatten_arm.cpp.o
[ 91%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/flatten_vulkan.cpp.o
[ 92%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/innerproduct.cpp.o
[ 92%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/innerproduct_arm.cpp.o
[ 92%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/innerproduct_vulkan.cpp.o
[ 92%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/innerproduct_arm_asimdhp.cpp.o
[ 93%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/innerproduct_arm_asimdfhm.cpp.o
[ 93%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/input.cpp.o
[ 93%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/pooling.cpp.o
[ 93%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/pooling_arm.cpp.o
[ 94%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/pooling_vulkan.cpp.o
[ 94%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/pooling_arm_asimdhp.cpp.o
[ 94%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/relu.cpp.o
[ 94%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/relu_arm.cpp.o
[ 94%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/relu_vulkan.cpp.o
[ 95%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/relu_arm_asimdhp.cpp.o
[ 95%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/scale.cpp.o
[ 95%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/scale_arm.cpp.o
[ 95%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/scale_vulkan.cpp.o
[ 96%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/split.cpp.o
[ 96%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/padding.cpp.o
[ 96%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/padding_arm.cpp.o
[ 96%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/padding_vulkan.cpp.o
[ 97%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/interp.cpp.o
[ 97%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/interp_arm.cpp.o
[ 97%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/interp_vulkan.cpp.o
[ 97%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/interp_arm_asimdhp.cpp.o
[ 98%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/packing.cpp.o
[ 98%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/packing_arm.cpp.o
[ 98%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/packing_vulkan.cpp.o
[ 98%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/cast.cpp.o
[ 99%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/cast_arm.cpp.o
[ 99%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/vulkan/cast_vulkan.cpp.o
[ 99%] Building CXX object ncnn/src/CMakeFiles/ncnn.dir/layer/arm/cast_arm_bf16.cpp.o
[ 99%] Linking CXX static library libncnn.a
[ 99%] Built target ncnn
[100%] Building CXX object CMakeFiles/waifu2x-ncnn-vulkan.dir/waifu2x.cpp.o
[100%] Building CXX object CMakeFiles/waifu2x-ncnn-vulkan.dir/main.cpp.o
In file included from /home/sprout1345/waifu2x-ncnn-vulkan/src/main.cpp:99:
/home/sprout1345/waifu2x-ncnn-vulkan/src/filesystem_utils.h: In function ‘path_t get_executable_directory()’:
/home/sprout1345/waifu2x-ncnn-vulkan/src/filesystem_utils.h:144:13: warning: ignoring return value of ‘ssize_t readlink(const char*, char*, size_t)’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
144 | readlink("/proc/self/exe", filepath, 256);
| ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[100%] Linking CXX executable waifu2x-ncnn-vulkan
In member function ‘record_clone’,
inlined from ‘record_clone.constprop.isra’ at /home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/src/command.cpp:1206:6:
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/src/command.cpp:1323:57: warning: argument 1 value ‘18446744073709551615’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
1323 | regions = new VkBufferImageCopy[region_count];
| ^
/usr/include/c++/13/new: In member function ‘record_clone.constprop.isra’:
/usr/include/c++/13/new:128:26: note: in a call to allocation function ‘operator new []’ declared here
128 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc)
| ^
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/glslang/glslang/MachineIndependent/ShaderLang.cpp: In function ‘ProcessDeferred.constprop.isra’:
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/glslang/glslang/MachineIndependent/ShaderLang.cpp:844:39: warning: argument 1 value ‘18446744073709551615’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
844 | std::unique_ptr<size_t[]> lengths(new size_t[numTotal]);
| ^
/usr/include/c++/13/new:128:26: note: in a call to allocation function ‘operator new []’ declared here
128 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc)
| ^
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/glslang/glslang/MachineIndependent/ShaderLang.cpp:845:44: warning: argument 1 value ‘18446744073709551615’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
845 | std::unique_ptr<const char*[]> strings(new const char*[numTotal]);
| ^
/usr/include/c++/13/new:128:26: note: in a call to allocation function ‘operator new []’ declared here
128 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc)
| ^
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/glslang/glslang/MachineIndependent/ShaderLang.cpp:846:42: warning: argument 1 value ‘18446744073709551615’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
846 | std::unique_ptr<const char*[]> names(new const char*[numTotal]);
| ^
/usr/include/c++/13/new:128:26: note: in a call to allocation function ‘operator new []’ declared here
128 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc)
| ^
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/src/command.cpp: In member function ‘record_upload’:
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/src/command.cpp:3260:68: warning: argument 1 value ‘18446744073709551615’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
3260 | VkBufferImageCopy* regions = new VkBufferImageCopy[channels];
| ^
/usr/include/c++/13/new:128:26: note: in a call to allocation function ‘operator new []’ declared here
128 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc)
| ^
In member function ‘record_clone’,
inlined from ‘record_clone.isra’ at /home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/src/command.cpp:1366:6:
/home/sprout1345/waifu2x-ncnn-vulkan/src/ncnn/src/command.cpp:1453:57: warning: argument 1 value ‘18446744073709551615’ exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=]
1453 | regions = new VkBufferImageCopy[region_count];
| ^
/usr/include/c++/13/new: In member function ‘record_clone.isra’:
/usr/include/c++/13/new:128:26: note: in a call to allocation function ‘operator new []’ declared here
128 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc)
| ^
[100%] Built target waifu2x-ncnn-vulkan
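Once compilation finishes, the waifu2x-ncnn-vulkan executable is created in the build directory.
For convenient use, it is better to move the executable and the models somewhere else.
The commands below create a "waifu2x" directory under the user's home directory and copy waifu2x-ncnn-vulkan and the model directories into it.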
mkdir ~/waifu2x
cp waifu2x-ncnn-vulkan ~/waifu2x/
cp -r ../models/* ~/waifu2x/

After that, the directory cloned with Git can be deleted.
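The copy step can also be wrapped in a small function that stops early when the build did not produce the executable. This is only a sketch: the name install_waifu2x and its two arguments are made up for illustration, and it assumes the same layout as the commands above (the models directory sits next to the build directory).

```shell
# install_waifu2x BUILD_DIR DEST_DIR
# Hypothetical helper, not part of waifu2x-ncnn-vulkan.
install_waifu2x() {
    build_dir=$1
    dest_dir=$2

    # Refuse to continue if the build did not produce the executable.
    if [ ! -x "$build_dir/waifu2x-ncnn-vulkan" ]; then
        echo "waifu2x-ncnn-vulkan not found in $build_dir" >&2
        return 1
    fi

    # Same three steps as the manual commands; -p keeps reruns from failing.
    mkdir -p "$dest_dir" || return 1
    cp "$build_dir/waifu2x-ncnn-vulkan" "$dest_dir/" || return 1
    cp -r "$build_dir"/../models/* "$dest_dir/"
}
```

From inside the build directory it would be called as `install_waifu2x . ~/waifu2x`.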
Trying it out
The usage is shown below.
Usage: waifu2x-ncnn-vulkan -i infile -o outfile [options]...
-h show this help
-v verbose output
-i input-path input image path (jpg/png/webp) or directory
-o output-path output image path (jpg/png/webp) or directory
-n noise-level denoise level (-1/0/1/2/3, default=0)
-s scale upscale ratio (1/2/4/8/16/32, default=2)
-t tile-size tile size (>=32/0=auto, default=0) can be 0,0,0 for multi-gpu
-m model-path waifu2x model path (default=models-cunet)
-g gpu-id gpu device to use (-1=cpu, default=auto) can be 0,1,2 for multi-gpu
-j load:proc:save thread count for load/proc/save (default=1:2:2) can be 1:2,2,2:2 for multi-gpu
-x enable tta mode
-f format output image format (jpg/png/webp, default=ext/png)
For example, the command below upscales an image 2x on the CPU.
./waifu2x-ncnn-vulkan -i input.jpg -o output.png -n 2 -s 2 -g -1
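Getting an error
When no GPU is specified explicitly, the program picks the most suitable GPU automatically. It usually chooses a physically present GPU, and the fastest one.
In some cases, running without an explicit GPU fails, as shown below.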
sprout1345@sproutpi:~/waifu2x$ time ./waifu2x-ncnn-vulkan -i 20241126003440.jpg -o output.png -n 2 -s 2 -v
[0 V3D 7.1.10] queueC=0[1] queueG=0[1] queueT=0[1]
[0 V3D 7.1.10] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[0 V3D 7.1.10] fp16-p/s/a=1/1/0 int8-p/s/a=1/1/0
[0 V3D 7.1.10] subgroup=16 basic=1 vote=0 ballot=0 shuffle=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] queueC=0[1] queueG=0[1] queueT=0[1]
[1 llvmpipe (LLVM 17.0.6, 128 bits)] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
[1 llvmpipe (LLVM 17.0.6, 128 bits)] subgroup=4 basic=1 vote=1 ballot=1 shuffle=1
unknown NIR ALU inst: div 16 %166 = u2f16 %148
Aborted (core dumped)
real 0m1.615s
user 0m0.321s
sys 0m0.105s
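The problem is that, because of limitations in the Raspberry Pi's GPU and the V3D driver, waifu2x may not work properly.
V3D is the Raspberry Pi's GPU driver. It uses hardware acceleration, but it does not support some Vulkan features (FP16 arithmetic, subgroup vote, ballot, and so on; see fp16-p/s/a=1/1/0 and vote=0 ballot=0 in the log above), which can cause problems in applications such as waifu2x.
llvmpipe is a CPU-based software renderer that implements the Vulkan features on the CPU, without hardware acceleration. It therefore supports more features, but it is slow.
This is why running on the hardware GPU fails.
Running with llvmpipe (-g 1), which reports those features as supported, or on the CPU (-g -1), which does not go through the V3D driver, succeeds.
Of course, llvmpipe runs on the CPU, so the performance penalty is severe.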
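Because automatic selection ends up on the V3D device here, it helps to look up the device indices before choosing a value for -g. As a rough sketch, the functions below (list_devices and llvmpipe_index are made-up names) read the -v output on standard input and print the devices found. They assume the log format shown above, where each device has one line containing queueC=.

```shell
# list_devices: hypothetical helper that reads `waifu2x-ncnn-vulkan -v` output
# on stdin and prints "INDEX NAME" for each Vulkan device.
# Assumes the "[INDEX NAME] queueC=..." line format seen in the log above.
list_devices() {
    sed -n 's/^\[\([0-9][0-9]*\) \(.*\)] queueC=.*$/\1 \2/p'
}

# llvmpipe_index: prints the index of the first llvmpipe device, if any.
llvmpipe_index() {
    list_devices | awk '$2 == "llvmpipe" { print $1; exit }'
}
```

With a saved log, `list_devices < log.txt` would print lines such as `0 V3D 7.1.10`, and the number printed by `llvmpipe_index < log.txt` is the value to pass to -g. When piping directly from the program, the device lines may be written to standard error, in which case `2>&1` is needed before the pipe.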
Below is a comparison of the results of running on the CPU and with llvmpipe.
# Running on the CPU
sprout1345@sproutpi:~/waifu2x$ time ./waifu2x-ncnn-vulkan -i 20241126003440.jpg -o output.png -n 2 -s 2 -v -g -1
[0 V3D 7.1.10] queueC=0[1] queueG=0[1] queueT=0[1]
[0 V3D 7.1.10] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[0 V3D 7.1.10] fp16-p/s/a=1/1/0 int8-p/s/a=1/1/0
[0 V3D 7.1.10] subgroup=16 basic=1 vote=0 ballot=0 shuffle=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] queueC=0[1] queueG=0[1] queueT=0[1]
[1 llvmpipe (LLVM 17.0.6, 128 bits)] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
[1 llvmpipe (LLVM 17.0.6, 128 bits)] subgroup=4 basic=1 vote=1 ballot=1 shuffle=1
20241126003440.jpg -> output.png done
real 3m39.174s
user 7m10.248s
sys 0m0.722s
# Running with llvmpipe
sprout1345@sproutpi:~/waifu2x$ time ./waifu2x-ncnn-vulkan -i 20241126003440.jpg -o output.png -n 2 -s 2 -v -g 1
[0 V3D 7.1.10] queueC=0[1] queueG=0[1] queueT=0[1]
[0 V3D 7.1.10] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[0 V3D 7.1.10] fp16-p/s/a=1/1/0 int8-p/s/a=1/1/0
[0 V3D 7.1.10] subgroup=16 basic=1 vote=0 ballot=0 shuffle=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] queueC=0[1] queueG=0[1] queueT=0[1]
[1 llvmpipe (LLVM 17.0.6, 128 bits)] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
[1 llvmpipe (LLVM 17.0.6, 128 bits)] subgroup=4 basic=1 vote=1 ballot=1 shuffle=1
20241126003440.jpg -> output.png done
real 7m4.523s
user 27m53.693s
sys 0m1.172s
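As the timings show, the llvmpipe run consumes about four times as much CPU time (user 27m53s vs. 7m10s) and takes about twice as long (real 7m4s vs. 3m39s) as the CPU run. It also uses more RAM.
The outputs differ as well. The strange part is that the CPU result is the one that resembles what I got on my own computer with an NVIDIA GPU.
I don't know why. Perhaps llvmpipe is doing something extra.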
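The "four times" and "twice" figures can be checked directly from the time output. The function below is a small sketch (time_ratio is a made-up name) that converts two values in the XmY.YYYs format to seconds and prints their ratio.

```shell
# time_ratio A B: hypothetical helper that prints B / A for two values in the
# "XmY.YYYs" format printed by the shell's `time` keyword.
time_ratio() {
    LC_ALL=C awk -v a="$1" -v b="$2" '
        # Convert a value such as "3m39.174s" into seconds.
        function secs(t,    p) {
            sub(/s$/, "", t)
            split(t, p, "m")
            return p[1] * 60 + p[2]
        }
        BEGIN { printf "%.2f\n", secs(b) / secs(a) }
    '
}
```

With the numbers above, `time_ratio 3m39.174s 7m4.523s` (real) prints 1.94 and `time_ratio 7m10.248s 27m53.693s` (user) prints 3.89.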
Options
Usage: waifu2x-ncnn-vulkan -i infile -o outfile [options]...
-h show this help
-v verbose output
-i input-path input image path (jpg/png/webp) or directory
-o output-path output image path (jpg/png/webp) or directory
-n noise-level denoise level (-1/0/1/2/3, default=0)
-s scale upscale ratio (1/2/4/8/16/32, default=2)
-t tile-size tile size (>=32/0=auto, default=0) can be 0,0,0 for multi-gpu
-m model-path waifu2x model path (default=models-cunet)
-g gpu-id gpu device to use (-1=cpu, default=auto) can be 0,1,2 for multi-gpu
-j load:proc:save thread count for load/proc/save (default=1:2:2) can be 1:2,2,2:2 for multi-gpu
-x enable tta mode
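-f format output image format (jpg/png/webp, default=ext/png)
"-v": prints verbose output.
"-n": sets the denoise level. -1 applies no denoising, and 0/1/2/3 raise the denoise strength step by step, with 3 the strongest.
"-s": sets the upscale ratio.
"-i": the input file or folder to upscale.
"-o": the output file or folder (an output folder has to be created beforehand).
The input and output files do not need to have the same extension.
"-f": selects the output format [jpg/png/webp]. I recommend putting it near the front of the command. It is useful when writing many files into a folder.
"-g -1" selects the CPU; otherwise GPUs are numbered starting from 0.
"-x" enables TTA mode. TTA stands for test-time augmentation; think of it as the program rotating and flipping the image in various ways to improve upscaling accuracy. That is not a precise explanation, but it is how everyone describes it. Enabling it makes a run take about 8 times longer, so it is worth trying if you have time to spare.
"-j" sets the thread counts in "load:proc:save" order, that is, loading (decoding) the input, processing, and saving (encoding) the output. The default is 1:2:2.
It also applies when running on the CPU, and using it well can make a big difference. The Raspberry Pi 5 has four cores, so set the processing thread count to 4, for example 2:4:2 or 4:4:4. The other two values matter little; even assigning a single thread each to loading and saving causes no real problem.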
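Putting these options together, a whole folder can be processed on the CPU roughly as follows. This is a sketch under assumptions: jobs_spec and upscale_dir are made-up helper names, the WAIFU2X variable is only a convenience for pointing at the executable, and the 1:N:1 thread split simply follows the observation above that only the processing count matters.

```shell
# jobs_spec [CORES]: prints a -j value with one load thread, CORES processing
# threads, and one save thread. CORES defaults to the output of nproc.
jobs_spec() {
    cores=${1:-$(nproc)}
    echo "1:${cores}:1"
}

# upscale_dir IN_DIR OUT_DIR [CORES]: upscales the images in IN_DIR on the CPU.
# The output folder is created first because it has to exist beforehand.
upscale_dir() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    "${WAIFU2X:-./waifu2x-ncnn-vulkan}" -i "$in_dir" -o "$out_dir" \
        -n 2 -s 2 -g -1 -f png -j "$(jobs_spec "${3:-}")"
}
```

Run from ~/waifu2x, a call such as `upscale_dir ~/pictures ~/pictures_2x` would write PNG files into the second folder; the folder names here are placeholders.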
Below are the results of several runs on my Raspberry Pi 5.

# No -j option (default 1:2:2)
sprout1345@sproutpi:~/waifu2x$ time ./waifu2x-ncnn-vulkan -i 20241126003440.jpg -o output.png -n 2 -s 2 -v -g -1
[0 V3D 7.1.10] queueC=0[1] queueG=0[1] queueT=0[1]
[0 V3D 7.1.10] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[0 V3D 7.1.10] fp16-p/s/a=1/1/0 int8-p/s/a=1/1/0
[0 V3D 7.1.10] subgroup=16 basic=1 vote=0 ballot=0 shuffle=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] queueC=0[1] queueG=0[1] queueT=0[1]
[1 llvmpipe (LLVM 17.0.6, 128 bits)] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
[1 llvmpipe (LLVM 17.0.6, 128 bits)] subgroup=4 basic=1 vote=1 ballot=1 shuffle=1
20241126003440.jpg -> output.png done
real 3m40.180s
user 7m10.479s
sys 0m0.737s
# 1:4:1
sprout1345@sproutpi:~/waifu2x$ time ./waifu2x-ncnn-vulkan -i 20241126003440.jpg -o output.png -n 2 -s 2 -v -g -1 -j 1:4:1
[0 V3D 7.1.10] queueC=0[1] queueG=0[1] queueT=0[1]
[0 V3D 7.1.10] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[0 V3D 7.1.10] fp16-p/s/a=1/1/0 int8-p/s/a=1/1/0
[0 V3D 7.1.10] subgroup=16 basic=1 vote=0 ballot=0 shuffle=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] queueC=0[1] queueG=0[1] queueT=0[1]
[1 llvmpipe (LLVM 17.0.6, 128 bits)] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
[1 llvmpipe (LLVM 17.0.6, 128 bits)] subgroup=4 basic=1 vote=1 ballot=1 shuffle=1
20241126003440.jpg -> output.png done
real 1m50.529s
user 7m9.456s
sys 0m0.954s
# 2:2:2
sprout1345@sproutpi:~/waifu2x$ time ./waifu2x-ncnn-vulkan -i 20241126003440.jpg -o output.png -n 2 -s 2 -v -g -1 -j 2:2:2
[0 V3D 7.1.10] queueC=0[1] queueG=0[1] queueT=0[1]
[0 V3D 7.1.10] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[0 V3D 7.1.10] fp16-p/s/a=1/1/0 int8-p/s/a=1/1/0
[0 V3D 7.1.10] subgroup=16 basic=1 vote=0 ballot=0 shuffle=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] queueC=0[1] queueG=0[1] queueT=0[1]
[1 llvmpipe (LLVM 17.0.6, 128 bits)] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
[1 llvmpipe (LLVM 17.0.6, 128 bits)] subgroup=4 basic=1 vote=1 ballot=1 shuffle=1
20241126003440.jpg -> output.png done
real 3m39.623s
user 7m11.167s
sys 0m0.712s
# 2:4:2
sprout1345@sproutpi:~/waifu2x$ time ./waifu2x-ncnn-vulkan -i 20241126003440.jpg -o output.png -n 2 -s 2 -v -g -1 -j 2:4:2
[0 V3D 7.1.10] queueC=0[1] queueG=0[1] queueT=0[1]
[0 V3D 7.1.10] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[0 V3D 7.1.10] fp16-p/s/a=1/1/0 int8-p/s/a=1/1/0
[0 V3D 7.1.10] subgroup=16 basic=1 vote=0 ballot=0 shuffle=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] queueC=0[1] queueG=0[1] queueT=0[1]
[1 llvmpipe (LLVM 17.0.6, 128 bits)] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
[1 llvmpipe (LLVM 17.0.6, 128 bits)] subgroup=4 basic=1 vote=1 ballot=1 shuffle=1
20241126003440.jpg -> output.png done
real 1m51.233s
user 7m10.978s
sys 0m1.002s
# 4:4:4
sprout1345@sproutpi:~/waifu2x$ time ./waifu2x-ncnn-vulkan -i 20241126003440.jpg -o output.png -n 2 -s 2 -v -g -1 -j 4:4:4
[0 V3D 7.1.10] queueC=0[1] queueG=0[1] queueT=0[1]
[0 V3D 7.1.10] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[0 V3D 7.1.10] fp16-p/s/a=1/1/0 int8-p/s/a=1/1/0
[0 V3D 7.1.10] subgroup=16 basic=1 vote=0 ballot=0 shuffle=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] queueC=0[1] queueG=0[1] queueT=0[1]
[1 llvmpipe (LLVM 17.0.6, 128 bits)] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
[1 llvmpipe (LLVM 17.0.6, 128 bits)] subgroup=4 basic=1 vote=1 ballot=1 shuffle=1
20241126003440.jpg -> output.png done
real 1m51.060s
user 7m10.431s
sys 0m0.996s
# 4:8:4
sprout1345@sproutpi:~/waifu2x$ time ./waifu2x-ncnn-vulkan -i 20241126003440.jpg -o output.png -n 2 -s 2 -v -g -1 -j 4:8:4
[0 V3D 7.1.10] queueC=0[1] queueG=0[1] queueT=0[1]
[0 V3D 7.1.10] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[0 V3D 7.1.10] fp16-p/s/a=1/1/0 int8-p/s/a=1/1/0
[0 V3D 7.1.10] subgroup=16 basic=1 vote=0 ballot=0 shuffle=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] queueC=0[1] queueG=0[1] queueT=0[1]
[1 llvmpipe (LLVM 17.0.6, 128 bits)] bugsbn1=0 bugbilz=0 bugcopc=0 bugihfa=0
[1 llvmpipe (LLVM 17.0.6, 128 bits)] fp16-p/s/a=1/1/1 int8-p/s/a=1/1/1
[1 llvmpipe (LLVM 17.0.6, 128 bits)] subgroup=4 basic=1 vote=1 ballot=1 shuffle=1
20241126003440.jpg -> output.png done
real 1m51.265s
user 7m12.254s
sys 0m0.941s
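real: the wall-clock time that actually elapsed,
user: the CPU time spent running the program in user mode,
sys: the CPU time spent in kernel mode on behalf of the program.
The CPU time (user) is larger than the elapsed time (real) because the program used several CPU cores in parallel.
For example, if a program runs for 2 minutes while using 4 cores at the same time, each core spends 2 minutes, for a total of 8 minutes of CPU time. In other words, CPU time is the sum of the time the program spent running on each core.
In any case, the results show that the number of CPU threads used for processing determines the total run time, up to the four cores available: 4:8:4 is no faster than 4:4:4.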
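The same relationship can be checked against the runs above: dividing the CPU time (user + sys) by the elapsed time gives roughly how many cores were busy on average. A small sketch, with avg_cores as a made-up name:

```shell
# avg_cores REAL USER SYS: hypothetical helper that prints (user + sys) / real
# for values in the "XmY.YYYs" format printed by `time`.
avg_cores() {
    printf '%s %s %s\n' "$1" "$2" "$3" | LC_ALL=C awk '
        {
            for (i = 1; i <= 3; i++) {
                split($i, p, /[ms]/)    # "7m9.456s" -> p[1]="7", p[2]="9.456"
                t[i] = p[1] * 60 + p[2]
            }
            printf "%.2f\n", (t[2] + t[3]) / t[1]
        }
    '
}
```

For the default run, `avg_cores 3m40.180s 7m10.479s 0m0.737s` prints 1.96, and for the 1:4:1 run, `avg_cores 1m50.529s 7m9.456s 0m0.954s` prints 3.89, which lines up with 2 and 4 processing threads respectively.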
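Aside
It is not yet clear whether GPU acceleration is unavailable because of a driver problem or because the hardware cannot support it. If it turns out to be usable, I will write up the method separately.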