RK3588: CPU vs. GPU Compute Throughput and OpenCV resize Performance

Contents

I. Background
II. Summary
III. Related links
IV. Procedure
  1. Environment setup
    A. Install dependencies
    B. Set the GPU to performance mode
    C. Get GPU information
    D. Get CPU information
  2. Query GPU information through the OpenCL API
  3. Matrix multiplication with the OpenCL API
  4. Measure GPU performance with clpeak
  5. Measure CPU throughput with OpenBLAS
  6. Time OpenCV resize on the CPU and through OpenCL
    A. Build OpenCV with OpenCL support
    B. Run the OpenCV test program

I. Background

  • The goal is to compare CPU and Mali GPU performance on the RK3588.
  • Mali GPU compute throughput is measured with clpeak.
  • CPU FP32 throughput is measured with OpenBLAS (NEON optimizations enabled).
  • OpenCV resize is timed on the CPU and through OpenCL for each interpolation method: a 32x32 image is upscaled to 8192x8192 and then scaled back down to 32x32, repeated 100 times.

II. Summary

  • GPU model: Mali-LODX r0p0 (Mali-G610, 4 cores, r0p0, 0xA867)
  • GPU FP32 (clpeak): 441.95 GFLOPS
  • CPU FP32 (OpenBLAS + NEON): 53.68 GFLOPS
  • OpenCV resize, 100 round trips per interpolation method:

    Interpolation     CPU time (s)    GPU time (s)
    INTER_NEAREST     3.01526         0.0672681
    INTER_LINEAR      5.3227          0.0189366
    INTER_CUBIC       8.22734         11.6337
    INTER_AREA        20.4999         27.3197
    INTER_LANCZOS4    29.3602         43.9484

In these measurements the OpenCL path is faster than the CPU only for INTER_NEAREST and INTER_LINEAR; for INTER_CUBIC, INTER_AREA, and INTER_LANCZOS4 it is slower.

III. Related links

  • OpenCV build

IV. Procedure

1. Environment setup

A. Install dependencies

Replace the stock OpenCL loader with a symlink to the Mali driver, then install the OpenCL headers and tools. The mv and ln commands here, and the governor writes in step B, need root privileges.

    mv /lib/aarch64-linux-gnu/libOpenCL.so.1 /lib/aarch64-linux-gnu/libOpenCL.so.1.bk
    ln -s /usr/lib/aarch64-linux-gnu/libmali.so /lib/aarch64-linux-gnu/libOpenCL.so.1

    sudo apt install opencl-headers
    sudo apt install ocl-icd-libopencl1
    sudo apt install ocl-icd-opencl-dev
    sudo apt install clinfo

B. Set the GPU to performance mode

    echo performance > /sys/class/devfreq/fb000000.gpu/governor
    echo performance > /sys/class/devfreq/fdab0000.npu/governor

C. Get GPU information

    cat /sys/class/misc/mali0/device/gpuinfo
    clinfo

Output:

    Mali-G610 4 cores r0p0 0xA867
    Number of platforms                               1
      Platform Name                                   ARM Platform
      Platform Vendor                                 ARM
      Platform Version                                OpenCL 2.1 v1.g6p0-01eac0.ba52c908d926792b8f5fe28f383a2b03
      Platform Profile                                FULL_PROFILE
      Platform Extensions                             cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
                                                      cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
                                                      cl_khr_byte_addressable_store cl_khr_3d_image_writes
                                                      cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16
                                                      cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer
                                                      cl_khr_depth_images cl_khr_subgroups cl_khr_subgroup_extended_types
                                                      cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot
                                                      cl_khr_il_program cl_khr_priority_hints cl_khr_create_command_queue
                                                      cl_khr_spirv_no_integer_wrap_decoration cl_khr_extended_versioning
                                                      cl_khr_device_uuid cl_arm_core_id cl_arm_printf
                                                      cl_arm_non_uniform_work_group_size cl_arm_import_memory
                                                      cl_arm_import_memory_dma_buf cl_arm_import_memory_host
                                                      cl_arm_integer_dot_product_int8
                                                      cl_arm_integer_dot_product_accumulate_int8
                                                      cl_arm_integer_dot_product_accumulate_saturate_int8
                                                      cl_arm_scheduling_controls cl_arm_controlled_kernel_termination
                                                      cl_ext_cxx_for_opencl
      Platform Host timer resolution                  1ns
      Platform Extensions function suffix             ARM

      Platform Name                                   ARM Platform
    Number of devices                                 1
    arm_release_ver of this libmali is g6p0-01eac0, rk_so_ver is 6.
      Device Name                                     Mali-LODX r0p0
      Device Vendor                                   ARM
      Device Vendor ID                                0xa8670000
      Device Version                                  OpenCL 2.1 v1.g6p0-01eac0.ba52c908d926792b8f5fe28f383a2b03
      Driver Version                                  2.1
      Device OpenCL C Version                         OpenCL C 2.0 v1.g6p0-01eac0.ba52c908d926792b8f5fe28f383a2b03
      Device Type                                     GPU
      Device Profile                                  FULL_PROFILE
      Device Available                                Yes
      Compiler Available                              Yes
      Linker Available                                Yes
      Max compute units                               4
      Max clock frequency                             1000MHz
      Device Partition                                (core)
        Max number of sub-devices                     0
        Supported partition types                     None
        Supported affinity domains                    (n/a)
      Max work item dimensions                        3
      Max work item sizes                             1024x1024x1024
      Max work group size                             1024
      Preferred work group size multiple              16
      Max sub-groups per work group                   64
      Preferred / native vector sizes
        char                                          16 / 4
        short                                         8 / 2
        int                                           4 / 1
        long                                          2 / 1
        half                                          8 / 2 (cl_khr_fp16)
        float                                         4 / 1
        double                                        0 / 0 (n/a)
      Half-precision Floating-point support           (cl_khr_fp16)
        Denormals                                     Yes
        Infinity and NANs                             Yes
        Round to nearest                              Yes
        Round to zero                                 Yes
        Round to infinity                             Yes
        IEEE754-2008 fused multiply-add               Yes
        Support is emulated in software               No
      Single-precision Floating-point support         (core)
        Denormals                                     Yes
        Infinity and NANs                             Yes
        Round to nearest                              Yes
        Round to zero                                 Yes
        Round to infinity                             Yes
        IEEE754-2008 fused multiply-add               Yes
        Support is emulated in software               No
        Correctly-rounded divide and sqrt operations  No
      Double-precision Floating-point support         (n/a)
      Address bits                                    64, Little-Endian
      Global memory size                              16643870720 (15.5GiB)
      Error Correction support                        No
      Max memory allocation                           16643870720 (15.5GiB)
      Unified memory for Host and Device              Yes
      Shared Virtual Memory (SVM) capabilities        (core)
        Coarse-grained buffer sharing                 Yes
        Fine-grained buffer sharing                   No
        Fine-grained system sharing                   No
        Atomics                                       No
      Minimum alignment for any data type             128 bytes
      Alignment of base address                       1024 bits (128 bytes)
      Preferred alignment for atomics
        SVM                                           0 bytes
        Global                                        0 bytes
        Local                                         0 bytes
      Max size for global variable                    65536 (64KiB)
      Preferred total size of global vars             0
      Global Memory cache type                        Read/Write
      Global Memory cache size                        1048576 (1024KiB)
      Global Memory cache line size                   64 bytes
      Image support                                   Yes
        Max number of samplers per kernel             16
        Max size for 1D images from buffer            65536 pixels
        Max 1D or 2D image array size                 2048 images
        Base address alignment for 2D image buffers   32 bytes
        Pitch alignment for 2D image buffers          64 pixels
        Max 2D image size                             65536x65536 pixels
        Max 3D image size                             65536x65536x65536 pixels
        Max number of read image args                 128
        Max number of write image args                64
        Max number of read/write image args           64
      Max number of pipe args                         16
      Max active pipe reservations                    1
      Max pipe packet size                            1024
      Local memory type                               Global
      Local memory size                               32768 (32KiB)
      Max number of constant args                     128
      Max constant buffer size                        16643870720 (15.5GiB)
      Max size of kernel argument                     1024
      Queue properties (on host)
        Out-of-order execution                        Yes
        Profiling                                     Yes
      Queue properties (on device)
        Out-of-order execution                        Yes
        Profiling                                     Yes
        Preferred size                                2097152 (2MiB)
        Max size                                      16777216 (16MiB)
      Max queues on device                            1
      Max events on device                            1024
      Prefer user sync for interop                    No
      Profiling timer resolution                      1000ns
      Execution capabilities
        Run OpenCL kernels                            Yes
        Run native kernels                            No
        Sub-group independent forward progress        Yes
        IL version                                    SPIR-V_1.0
        SPIR versions                                 printDeviceInfo:161: get CL_DEVICE_SPIR_VERSIONS size : error -30
      printf() buffer size                            1048576 (1024KiB)
      Built-in kernels                                (n/a)
      Device Extensions                               cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics
                                                      cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
                                                      cl_khr_byte_addressable_store cl_khr_3d_image_writes
                                                      cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp16
                                                      cl_khr_icd cl_khr_egl_image cl_khr_image2d_from_buffer
                                                      cl_khr_depth_images cl_khr_subgroups cl_khr_subgroup_extended_types
                                                      cl_khr_subgroup_non_uniform_vote cl_khr_subgroup_ballot
                                                      cl_khr_il_program cl_khr_priority_hints cl_khr_create_command_queue
                                                      cl_khr_spirv_no_integer_wrap_decoration cl_khr_extended_versioning
                                                      cl_khr_device_uuid cl_arm_core_id cl_arm_printf
                                                      cl_arm_non_uniform_work_group_size cl_arm_import_memory
                                                      cl_arm_import_memory_dma_buf cl_arm_import_memory_host
                                                      cl_arm_integer_dot_product_int8
                                                      cl_arm_integer_dot_product_accumulate_int8
                                                      cl_arm_integer_dot_product_accumulate_saturate_int8
                                                      cl_arm_scheduling_controls cl_arm_controlled_kernel_termination
                                                      cl_ext_cxx_for_opencl

    NULL platform behavior
      clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  ARM Platform
      clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [ARM]
      clCreateContext(NULL, ...) [default]            Success [ARM]
      clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT)  Success (1)
        Platform Name                                 ARM Platform
        Device Name                                   Mali-LODX r0p0
      clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
      clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
        Platform Name                                 ARM Platform
        Device Name                                   Mali-LODX r0p0
      clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
      clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
      clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
        Platform Name                                 ARM Platform
        Device Name                                   Mali-LODX r0p0

D. Get CPU information

    lscpu

Output:

    Architecture:                    aarch64
    CPU op-mode(s):                  32-bit, 64-bit
    Byte Order:                      Little Endian
    CPU(s):                          8
    On-line CPU(s) list:             0-7
    Thread(s) per core:              1
    Core(s) per socket:              2
    Socket(s):                       3
    Vendor ID:                       ARM
    Model:                           0
    Model name:                      Cortex-A55
    Stepping:                        r2p0
    CPU max MHz:                     2208.0000
    CPU min MHz:                     408.0000
    BogoMIPS:                        48.00
    L1d cache:                       256 KiB
    L1i cache:                       256 KiB
    L2 cache:                        1 MiB
    L3 cache:                        3 MiB
    Vulnerability Itlb multihit:     Not affected
    Vulnerability L1tf:              Not affected
    Vulnerability Mds:               Not affected
    Vulnerability Meltdown:          Not affected
    Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
    Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
    Vulnerability Spectre v2:        Not affected
    Vulnerability Srbds:             Not affected
    Vulnerability Tsx async abort:   Not affected
    Flags:                           fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp

2. Query GPU information through the OpenCL API

    cat > cl_query.c <<-'EOF'
    #include <stdio.h>
    #include <stdlib.h>
    #include <CL/cl.h>

    int main() {
        cl_platform_id *platforms = NULL;
        cl_uint num_platforms = 0;

        // Get the number of available platforms
        cl_int clStatus = clGetPlatformIDs(0, NULL, &num_platforms);
        platforms = (cl_platform_id*) malloc(sizeof(cl_platform_id) * num_platforms);

        // Get all platform IDs
        clStatus = clGetPlatformIDs(num_platforms, platforms, NULL);
        printf("Number of OpenCL platforms: %d\n", num_platforms);

        // Iterate over the platforms
        for (cl_uint i = 0; i < num_platforms; ++i) {
            char buffer[10240];
            printf("\nPlatform %d:\n", i + 1);

            // Platform name
            clGetPlatformInfo(platforms[i], CL_PLATFORM_NAME, sizeof(buffer), buffer, NULL);
            printf("  Name: %s\n", buffer);

            // Platform vendor
            clGetPlatformInfo(platforms[i], CL_PLATFORM_VENDOR, sizeof(buffer), buffer, NULL);
            printf("  Vendor: %s\n", buffer);

            // Platform version
            clGetPlatformInfo(platforms[i], CL_PLATFORM_VERSION, sizeof(buffer), buffer, NULL);
            printf("  Version: %s\n", buffer);

            // Number of devices
            cl_uint num_devices = 0;
            clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, 0, NULL, &num_devices);
            cl_device_id *devices = (cl_device_id*) malloc(sizeof(cl_device_id) * num_devices);
            clGetDeviceIDs(platforms[i], CL_DEVICE_TYPE_ALL, num_devices, devices, NULL);

            // Iterate over the devices
            for (cl_uint j = 0; j < num_devices; ++j) {
                printf("  Device %d:\n", j + 1);

                // Device name
                clGetDeviceInfo(devices[j], CL_DEVICE_NAME, sizeof(buffer), buffer, NULL);
                printf("    Name: %s\n", buffer);

                // Device type (a bit field, so test each flag)
                cl_device_type device_type;
                clGetDeviceInfo(devices[j], CL_DEVICE_TYPE, sizeof(device_type), &device_type, NULL);
                if (device_type & CL_DEVICE_TYPE_CPU)
                    printf("    Type: CPU\n");
                if (device_type & CL_DEVICE_TYPE_GPU)
                    printf("    Type: GPU\n");
                if (device_type & CL_DEVICE_TYPE_ACCELERATOR)
                    printf("    Type: Accelerator\n");

                // Number of compute units
                cl_uint compute_units;
                clGetDeviceInfo(devices[j], CL_DEVICE_MAX_COMPUTE_UNITS, sizeof(compute_units), &compute_units, NULL);
                printf("    Compute units: %d\n", compute_units);

                // Global memory size
                cl_ulong global_mem;
                clGetDeviceInfo(devices[j], CL_DEVICE_GLOBAL_MEM_SIZE, sizeof(global_mem), &global_mem, NULL);
                printf("    Global memory size: %llu MB\n", (unsigned long long)(global_mem / (1024 * 1024)));
            }
            free(devices);
        }
        free(platforms);
        return 0;
    }
    EOF
    gcc -o cl_query cl_query.c -lOpenCL
    ./cl_query

Output:

    Number of OpenCL platforms: 1

    Platform 1:
      Name: ARM Platform
      Vendor: ARM
      Version: OpenCL 2.1 v1.g6p0-01eac0.ba52c908d926792b8f5fe28f383a2b03
      Device 1:
    arm_release_ver of this libmali is g6p0-01eac0, rk_so_ver is 6.
        Name: Mali-LODX r0p0
        Type: GPU
        Compute units: 4
        Global memory size: 15872 MB

3. Matrix multiplication with the OpenCL API

The kernel below multiplies two 8192x8192 FP32 matrices using 32x32 tiles staged in local memory. The host program times the kernel build and the first kernel launch separately.

    cat > matmul.c <<-'EOF'
    #include <stdio.h>
    #include <stdlib.h>
    #include <CL/cl.h>
    #include <time.h>
    #include <sys/time.h>

    #define MATRIX_SIZE 8192
    #define TILE_SIZE 32

    // Current wall-clock time in seconds, used to measure elapsed time
    double get_current_time() {
        struct timeval tp;
        gettimeofday(&tp, NULL);
        return (double)(tp.tv_sec) + (double)(tp.tv_usec) / 1e6;
    }

    #define xstr(s) str(s)
    #define str(s) #s

    const char *kernelSource =
        "__kernel void mat_mul_optimized(const int N,\n"
        "                                __global float* A,\n"
        "                                __global float* B,\n"
        "                                __global float* C) {\n"
        "    const int TILE_SIZE = " xstr(TILE_SIZE) ";\n"
        "    __local float Asub[TILE_SIZE][TILE_SIZE];\n"
        "    __local float Bsub[TILE_SIZE][TILE_SIZE];\n"
        "    int global_row = get_global_id(1);\n"
        "    int global_col = get_global_id(0);\n"
        "    int local_row = get_local_id(1);\n"
        "    int local_col = get_local_id(0);\n"
        "    float sum = 0.0f;\n"
        "    int numTiles = (N + TILE_SIZE - 1) / TILE_SIZE;\n"
        "    for (int t = 0; t < numTiles; ++t) {\n"
        "        int tiled_row = global_row;\n"
        "        int tiled_col = t * TILE_SIZE + local_col;\n"
        "        if (tiled_row < N && tiled_col < N)\n"
        "            Asub[local_row][local_col] = A[tiled_row * N + tiled_col];\n"
        "        else\n"
        "            Asub[local_row][local_col] = 0.0f;\n"
        "        tiled_row = t * TILE_SIZE + local_row;\n"
        "        tiled_col = global_col;\n"
        "        if (tiled_row < N && tiled_col < N)\n"
        "            Bsub[local_row][local_col] = B[tiled_row * N + tiled_col];\n"
        "        else\n"
        "            Bsub[local_row][local_col] = 0.0f;\n"
        "        barrier(CLK_LOCAL_MEM_FENCE);\n"
        "        for (int k = 0; k < TILE_SIZE; ++k) {\n"
        "            sum += Asub[local_row][k] * Bsub[k][local_col];\n"
        "        }\n"
        "        barrier(CLK_LOCAL_MEM_FENCE);\n"
        "    }\n"
        "    if (global_row < N && global_col < N)\n"
        "        C[global_row * N + global_col] = sum;\n"
        "}\n";

    int main() {
        int N = MATRIX_SIZE;
        size_t bytes = N * N * sizeof(float);

        // Allocate host memory
        float *h_A = (float*)malloc(bytes);
        float *h_B = (float*)malloc(bytes);
        float *h_C = (float*)malloc(bytes);

        // Initialize the matrices
        for (int i = 0; i < N * N; i++) {
            h_A[i] = 1.0f;
            h_B[i] = 1.0f;
        }

        // Get the platform and device
        cl_platform_id platformId = NULL;
        cl_device_id deviceID = NULL;
        cl_uint retNumDevices;
        cl_uint retNumPlatforms;
        cl_int ret = clGetPlatformIDs(1, &platformId, &retNumPlatforms);
        ret = clGetDeviceIDs(platformId, CL_DEVICE_TYPE_DEFAULT, 1, &deviceID, &retNumDevices);

        // Create the OpenCL context
        cl_context context = clCreateContext(NULL, 1, &deviceID, NULL, NULL, &ret);

        // Create the command queue
        cl_command_queue commandQueue = clCreateCommandQueue(context, deviceID, 0, &ret);

        // Create the device buffers
        cl_mem d_A = clCreateBuffer(context, CL_MEM_READ_ONLY, bytes, NULL, &ret);
        cl_mem d_B = clCreateBuffer(context, CL_MEM_READ_ONLY, bytes, NULL, &ret);
        cl_mem d_C = clCreateBuffer(context, CL_MEM_WRITE_ONLY, bytes, NULL, &ret);

        // Copy the input data to the device
        ret = clEnqueueWriteBuffer(commandQueue, d_A, CL_TRUE, 0, bytes, h_A, 0, NULL, NULL);
        ret = clEnqueueWriteBuffer(commandQueue, d_B, CL_TRUE, 0, bytes, h_B, 0, NULL, NULL);

        // Start of kernel compilation
        double compile_start = get_current_time();

        // Create the program object
        cl_program program = clCreateProgramWithSource(context, 1, (const char**)&kernelSource, NULL, &ret);

        // Build the kernel program
        ret = clBuildProgram(program, 1, &deviceID, NULL, NULL, NULL);

        // Report build errors
        if (ret != CL_SUCCESS) {
            size_t log_size;
            clGetProgramBuildInfo(program, deviceID, CL_PROGRAM_BUILD_LOG, 0, NULL, &log_size);
            char *log = (char *)malloc(log_size);
            clGetProgramBuildInfo(program, deviceID, CL_PROGRAM_BUILD_LOG, log_size, log, NULL);
            printf("CL Compilation failed:\n%s\n", log);
            free(log);
            return 1;
        }

        // End of kernel compilation
        double compile_end = get_current_time();
        double compile_time = compile_end - compile_start;

        // Create the OpenCL kernel
        cl_kernel kernel = clCreateKernel(program, "mat_mul_optimized", &ret);

        // Set the kernel arguments
        ret = clSetKernelArg(kernel, 0, sizeof(int), (void*)&N);
        ret = clSetKernelArg(kernel, 1, sizeof(cl_mem), (void*)&d_A);
        ret = clSetKernelArg(kernel, 2, sizeof(cl_mem), (void*)&d_B);
        ret = clSetKernelArg(kernel, 3, sizeof(cl_mem), (void*)&d_C);

        // Global and local work sizes (global is rounded up to a multiple of the tile size)
        size_t local[2] = {TILE_SIZE, TILE_SIZE};
        size_t global[2] = {(size_t)((N + TILE_SIZE - 1) / TILE_SIZE) * TILE_SIZE,
                            (size_t)((N + TILE_SIZE - 1) / TILE_SIZE) * TILE_SIZE};

        // Start of the first kernel launch
        double launch_start = get_current_time();

        // Launch the kernel
        ret = clEnqueueNDRangeKernel(commandQueue, kernel, 2, NULL, global, local, 0, NULL, NULL);
        printf("clEnqueueNDRangeKernel:%d\n", ret);

        // Wait for the command queue to drain
        clFinish(commandQueue);

        // End of the first kernel launch
        double launch_end = get_current_time();
        double launch_time = launch_end - launch_start;

        // Read back the result
        ret = clEnqueueReadBuffer(commandQueue, d_C, CL_TRUE, 0, bytes, h_C, 0, NULL, NULL);

        // Compute GFLOPS (2*N^3 floating-point operations)
        double total_ops = 2.0 * N * N * N;
        double gflops = (total_ops / 1e9) / launch_time;

        // Print the results
        printf("Compile time: %f s\n", compile_time);
        printf("First kernel execution time: %f s\n", launch_time);
        printf("Performance: %f GFLOPS\n", gflops);

        // Release resources
        ret = clFlush(commandQueue);
        ret = clFinish(commandQueue);
        ret = clReleaseKernel(kernel);
        ret = clReleaseProgram(program);
        ret = clReleaseMemObject(d_A);
        ret = clReleaseMemObject(d_B);
        ret = clReleaseMemObject(d_C);
        ret = clReleaseCommandQueue(commandQueue);
        ret = clReleaseContext(context);
        free(h_A);
        free(h_B);
        free(h_C);
        return 0;
    }
    EOF
    gcc -o matmul matmul.c -lOpenCL
    ./matmul

Output:

    Compile time: 0.031085 s
    First kernel execution time: 62.258528 s
    Performance: 17.660418 GFLOPS

4. Measure GPU performance with clpeak

    git clone https://gitcode.com/gh_mirrors/cl/clpeak.git
    cd clpeak
    git submodule update --init --recursive --remote
    mkdir build
    cd build
    cmake -DCMAKE_BUILD_TYPE=Release ..
    cmake --build .
    ./clpeak

Output:

    Platform: ARM Platform
    arm_release_ver of this libmali is g6p0-01eac0, rk_so_ver is 6.
      Device: Mali-LODX r0p0
        Driver version  : 2.1 (Linux ARM64)
        Compute units   : 4
        Clock frequency : 1000 MHz

        Global memory bandwidth (GBPS)
          float   : 25.71
          float2  : 24.45
          float4  : 23.70
          float8  : 12.05
          float16 : 12.01

        Single-precision compute (GFLOPS)
          float   : 441.77
          float2  : 470.27
          float4  : 466.52
          float8  : 435.65
          float16 : 411.38

        Half-precision compute (GFLOPS)
          half   : 441.96
          half2  : 878.25
          half4  : 911.51
          half8  : 886.19
          half16 : 846.44

        No double precision support! Skipped

        Integer compute (GIOPS)
          int   : 124.96
          int2  : 125.71
          int4  : 125.16
          int8  : 123.82
          int16 : 124.24

        Integer compute Fast 24bit (GIOPS)
          int   : 125.16
          int2  : 125.63
          int4  : 125.20
          int8  : 123.73
          int16 : 124.33

        Integer char (8bit) compute (GIOPS)
          char   : 126.47
          char2  : 251.55
          char4  : 498.03
          char8  : 497.37
          char16 : 491.94

        Integer short (16bit) compute (GIOPS)
          short   : 126.31
          short2  : 250.90
          short4  : 249.47
          short8  : 248.51
          short16 : 245.30

        Transfer bandwidth (GBPS)
          enqueueWriteBuffer              : 8.54
          enqueueReadBuffer               : 9.97
          enqueueWriteBuffer non-blocking : 8.55
          enqueueReadBuffer non-blocking  : 9.99
          enqueueMapBuffer(for read)      : 61.66
            memcpy from mapped ptr        : 11.95
          enqueueUnmap(after write)       : 62.02
            memcpy to mapped ptr          : 11.89

        Kernel launch latency : 26.81 us

5. Measure CPU throughput with OpenBLAS

    git clone https://github.com/xianyi/OpenBLAS.git
    cd OpenBLAS
    make TARGET=ARMV8
    make install
    cd benchmark
    make TARGET=ARMV8 sgemm
    cc sgemm.o -o sgemm /opt/OpenBLAS/lib/libopenblas.so -Wl,-rpath=/opt/OpenBLAS/lib/
    export OPENBLAS_NUM_THREADS=8
    export OPENBLAS_LOOPS=10
    export OPENBLAS_PARAM_M=8192
    export OPENBLAS_PARAM_N=8192
    export OPENBLAS_PARAM_K=8192
    ./sgemm

Output:

    From : 1 To : 200 Step=1 : Transa=N : Transb=N
              SIZE                   Flops             Time
     M=8192, N=8192, K=8192 :    53485.68 MFlops 205.571220 sec

6. Time OpenCV resize on the CPU and through OpenCL

A. Build OpenCV with OpenCL support

OpenCV change (link against libmali.so):

    diff --git a/cmake/OpenCVDetectOpenCL.cmake b/cmake/OpenCVDetectOpenCL.cmake
    index 6ab2cae070..c3cf235e45 100644
    --- a/cmake/OpenCVDetectOpenCL.cmake
    +++ b/cmake/OpenCVDetectOpenCL.cmake
    @@ -3,9 +3,8 @@
     if(APPLE)
       set(OPENCL_LIBRARY "-framework OpenCL" CACHE STRING "OpenCL library")
       set(OPENCL_INCLUDE_DIR "" CACHE PATH "OpenCL include directory")
     else()
    -  set(OPENCL_LIBRARY "" CACHE STRING "OpenCL library")
    -  set(OPENCL_INCLUDE_DIR "${OpenCV_SOURCE_DIR}/3rdparty/include/opencl/1.2" CACHE PATH "OpenCL include directory")
    -  ocv_install_3rdparty_licenses(opencl-headers "${OpenCV_SOURCE_DIR}/3rdparty/include/opencl/LICENSE.txt")
    +  set(OPENCL_LIBRARY /usr/lib/aarch64-linux-gnu/libmali.so)
    +  set(OPENCL_INCLUDE_DIR /usr/include)
     endif()
     mark_as_advanced(OPENCL_INCLUDE_DIR OPENCL_LIBRARY)

Build OpenCV:

    git clone https://github.com/opencv/opencv.git
    cd opencv
    git checkout bdb6a968ce69a2bf7c34724f9052c20e941ab47b
    mkdir build
    cd build
    cmake -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_INSTALL_PREFIX=`pwd`/_install \
        -DWITH_OPENCL=ON -DWITH_NEON=ON \
        -DBUILD_SHARED_LIBS=ON \
        -D BUILD_opencv_world=ON -DBUILD_TESTS=OFF -DBUILD_EXAMPLES=OFF -DBUILD_opencv_apps=OFF \
        -DBUILD_opencv_dnn=OFF -DBUILD_opencv_calib3d=OFF \
        -DBUILD_opencv_imgproc=ON -DBUILD_opencv_imgcodecs=ON ..
    make -j4
    make install

B. Run the OpenCV test program

    cat > opencv_resize.cpp <<-'EOF'
    #include <opencv2/opencv.hpp>
    #include <opencv2/core/ocl.hpp>
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    void run(int resize_mode) {
        // Create a random 32x32 image
        cv::Mat src = cv::Mat::zeros(32, 32, CV_8UC3);
        cv::randu(src, cv::Scalar::all(0), cv::Scalar::all(255));

        // ------------------------------------
        // Run on the CPU
        // ------------------------------------
        cv::ocl::setUseOpenCL(false);
        cv::Mat enlarged_cpu, resized_back_cpu;

        // Start of the timed CPU loop
        int64 start_time_cpu = cv::getTickCount();
        for (int i = 0; i < 100; i++) {
            // Upscale to 8192x8192
            cv::resize(src, enlarged_cpu, cv::Size(8192, 8192), 0, 0, resize_mode);
            // Downscale back to 32x32
            cv::resize(enlarged_cpu, resized_back_cpu, cv::Size(32, 32), 0, 0, resize_mode);
        }
        // End of the timed CPU loop
        int64 end_time_cpu = cv::getTickCount();

        // Total time for the 100 upscale/downscale round trips
        double time_resize_cpu = (end_time_cpu - start_time_cpu) / cv::getTickFrequency();

        // ------------------------------------
        // Run on the GPU (OpenCL)
        // ------------------------------------
        cv::ocl::setUseOpenCL(true);
        cv::UMat src_umat;
        src.copyTo(src_umat);
        cv::UMat enlarged_gpu, resized_back_gpu;

        // Start of the timed GPU loop
        int64 start_time_gpu = cv::getTickCount();
        for (int i = 0; i < 100; i++) {
            // Upscale to 8192x8192
            cv::resize(src_umat, enlarged_gpu, cv::Size(8192, 8192), 0, 0, resize_mode);
            // Downscale back to 32x32
            cv::resize(enlarged_gpu, resized_back_gpu, cv::Size(32, 32), 0, 0, resize_mode);
        }
        // End of the timed GPU loop
        int64 end_time_gpu = cv::getTickCount();

        // Total time for the 100 upscale/downscale round trips
        double time_resize_gpu = (end_time_gpu - start_time_gpu) / cv::getTickFrequency();

        std::cout << "CPU time (s): " << time_resize_cpu
                  << " GPU time (s): " << time_resize_gpu << std::endl;
    }

    int main() {
        // Check whether OpenCL is available
        if (!cv::ocl::haveOpenCL()) {
            std::cout << "OpenCL is not supported on this system." << std::endl;
            return -1;
        }

        // Print the OpenCL device in use
        cv::ocl::Context context;
        if (!context.create(cv::ocl::Device::TYPE_GPU)) {
            std::cout << "No usable GPU device found; running on the CPU." << std::endl;
        } else {
            cv::ocl::Device device = cv::ocl::Device::getDefault();
            std::cout << "OpenCL device in use: " << device.name() << std::endl;
        }

        // Interpolation methods to test
        std::vector<int> interpolation_methods = {
            cv::INTER_NEAREST,
            cv::INTER_LINEAR,
            cv::INTER_CUBIC,
            cv::INTER_AREA,
            cv::INTER_LANCZOS4
        };

        // Method names, used when printing results
        std::vector<std::string> interpolation_names = {
            "INTER_NEAREST",
            "INTER_LINEAR",
            "INTER_CUBIC",
            "INTER_AREA",
            "INTER_LANCZOS4"
        };

        for (size_t i = 0; i < interpolation_methods.size(); ++i) {
            int interpolation = interpolation_methods[i];
            std::string method_name = interpolation_names[i];
            std::cout << "Interpolation: " << method_name << " ";
            run(interpolation);
        }
        return 0;
    }
    EOF
    g++ -o opencv_resize opencv_resize.cpp -I _install/include/opencv4 \
        _install/lib/libopencv_world.so -Wl,-rpath=_install/lib
    export OPENBLAS_NUM_THREADS=8
    ./opencv_resize

Output:

    arm_release_ver of this libmali is g6p0-01eac0, rk_so_ver is 6.
    OpenCL device in use: Mali-LODX r0p0
    Interpolation: INTER_NEAREST CPU time (s): 3.01526 GPU time (s): 0.0672681
    Interpolation: INTER_LINEAR CPU time (s): 5.3227 GPU time (s): 0.0189366
    Interpolation: INTER_CUBIC CPU time (s): 8.22734 GPU time (s): 11.6337
    Interpolation: INTER_AREA CPU time (s): 20.4999 GPU time (s): 27.3197
    Interpolation: INTER_LANCZOS4 CPU time (s): 29.3602 GPU time (s): 43.9484